How to implement consumption based pricing

Implementing consumption based pricing means changing how you meter, bill, and charge for your product, whether you're starting fresh or switching from flat-fee pricing. Done right, the work starts with observability, then builds the billing infrastructure, and sets prices or migrates customers only once the data justifies the model. This guide covers both paths.

Consumption based pricing (sometimes called metered billing or a consumption pricing model) charges customers in proportion to their actual usage rather than charging a flat monthly fee. It sits alongside subscription pricing (a fixed fee regardless of usage) as one of the dominant SaaS billing models. These aren't mutually exclusive: many products combine a fixed monthly subscription with usage charges on top, or a base subscription that includes a usage allowance with metered overages above a threshold. Hybrid pricing is the name for that combined model.

AWS charges by compute second, Twilio by SMS or call, Snowflake by compute credit, OpenAI by token. In OpenView's 2023 survey of more than 700 SaaS companies, 41% had a usage-based component in their pricing (23% with usage-based tiers, 18% largely usage-based) and another 17% had tested it (OpenView 2023 pricing benchmarks, accessed June 2026). For a deeper look at the mechanics of usage-based pricing, see our explainer.

What are the types of consumption based pricing?

Consumption based pricing comes in five common structures. Most products combine two or three rather than picking one.

Model	How it charges	Best fit
Per-unit (pay-as-you-go)	A flat rate for every unit consumed (per token, per API call, per GB).	A single dominant cost driver with linear infrastructure cost.
Tiered	The per-unit rate changes as usage crosses thresholds; each tier keeps its own rate.	Rewarding heavier use without discounting the entry tier.
Volume	One rate applies to the whole volume, set by the highest tier the customer reaches.	High-volume customers who expect a single blended rate.
Prepaid credits (package)	Customers buy a balance up front and draw it down as they consume.	Prepaid relationships, real-time authorization, multi-asset balances.
Hybrid	A fixed subscription plus usage charges, or a base allowance with metered overage.	A predictable revenue floor with upside from heavy users.

Per-unit is the simplest to launch and the easiest for customers to model. Tiered and volume pricing reward scale but need clean usage data to set the breakpoints. Prepaid credits shift the financial risk to the customer and pair naturally with real-time authorization, because the balance is known before each action. Hybrid is the most common end state: a subscription floor for forecastable revenue with usage on top for alignment. Start with one structure, then add dimensions once usage data justifies the complexity.
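The difference between tiered and volume pricing is easiest to see in code. The following is a minimal sketch, not a reference implementation: the tier boundaries, rates, base fee, and function names are all hypothetical values chosen for illustration.

```python
from decimal import Decimal

# Hypothetical tier table: (upper bound of tier in units, rate per unit).
# None marks the open-ended top tier. All numbers are illustrative.
TIERS = [
    (10_000, Decimal("0.0020")),
    (100_000, Decimal("0.0015")),
    (None, Decimal("0.0010")),
]


def tiered_charge(units: int, tiers=TIERS) -> Decimal:
    """Tiered pricing: the units inside each tier are billed at that tier's rate."""
    total = Decimal("0")
    lower = 0
    for upper, rate in tiers:
        if units <= lower:
            break
        in_tier = (units if upper is None else min(units, upper)) - lower
        total += in_tier * rate
        if upper is None:
            break
        lower = upper
    return total


def volume_charge(units: int, tiers=TIERS) -> Decimal:
    """Volume pricing: the rate of the highest tier reached applies to all units."""
    for upper, rate in tiers:
        if upper is None or units <= upper:
            return units * rate
    raise ValueError("tier table needs an open-ended top tier")


def hybrid_charge(units: int, base_fee: Decimal, allowance: int,
                  overage_rate: Decimal) -> Decimal:
    """Hybrid pricing: a fixed fee covers an allowance, overage is metered above it."""
    return base_fee + max(0, units - allowance) * overage_rate
```

With these illustrative numbers, a customer at 50,000 units pays 80 under tiered pricing (10,000 units at the first rate, 40,000 at the second) and 75 under volume pricing (all 50,000 units at the second rate). The gap between the two is the discount you hand to customers who cross a breakpoint, which is why the breakpoints need real usage data behind them.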

Why implementing consumption based pricing is harder than it looks

Implementing a consumption pricing model is a pricing strategy decision with downstream effects on customers, contracts, forecasting, and GTM motion. It's not a billing configuration. Adoption is rising, but adoption isn't the same as clean implementation. Teams ship the pricing model and discover these problems afterward:

Pricing metric must map to customer value. The metric you charge on should correlate with what your product costs to deliver and what customers experience as valuable.

Per-token pricing for an AI API is a clean example: each token has a real inference cost and customers measure usage in tokens. Per-seat pricing for a collaboration tool where heavy users share logins is a weaker example: the metric doesn't align with actual usage or cost. Getting the metric right requires data, which is why observability comes first.

Revenue forecasting changes character. Flat-fee revenue is predictable and easy for finance to plan around. Consumption revenue is variable and depends on usage patterns that may not be well understood yet. This isn't a reason to avoid consumption based pricing. It's a reason to invest in visibility before going live, so that variability is managed rather than discovered after the fact.

If you have existing customers, re-pricing is a real event. Switching mid-relationship means some customers will pay more and some will pay less. Some will churn. The migration section of this guide addresses this directly. If you're building a new product with no existing customer base, that section doesn't apply to you.

For more on why pricing strategy at early stages creates advantages that later-stage teams lose, see our article on rapid pricing iteration.

Start with observability: know what you're billing before you change pricing

Before designing a pricing metric, track per-customer costs and usage patterns. This applies whether you have zero customers or thousands. You can't price what you can't measure.

For teams with existing customers

Instrument per-customer cost tracking before redesigning pricing. Track what each customer costs in compute, API calls, storage, or whatever your infrastructure unit is. Per-customer cost visibility turns a vague "we're losing money on some customers" problem into a specific, segmented one you can act on.

The trigger for reconsidering pricing is usually financial: rising infrastructure costs or deteriorating per-customer margins force the conversation. But the instinct to immediately redesign pricing skips the step that makes pricing design tractable. Credyt supports event-level cost attribution natively. Orb and Metronome provide aggregate-level profitability reporting, which is useful for revenue analytics but different from event-level cost ingestion (Orb Docs, Metronome Docs, accessed April 2026).

Map usage patterns across your customer base. Which customers are power users? Which are light? Are there distinct usage cohorts that suggest natural pricing tiers? This data informs both the billing metric choice and the tier structure.
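Per-customer cost tracking can start as a simple roll-up of cost events against what each customer pays. The sketch below assumes a hypothetical event shape (customer, cost driver, cost in USD) and made-up flat-fee revenue figures; it is not tied to any billing platform's API.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical cost events: (customer_id, cost driver, cost in USD).
COST_EVENTS = [
    ("cust_a", "compute", Decimal("41.20")),
    ("cust_a", "storage", Decimal("3.80")),
    ("cust_b", "compute", Decimal("118.00")),
    ("cust_b", "api_calls", Decimal("12.00")),
]
# Hypothetical flat-fee revenue per customer for the same period.
REVENUE = {"cust_a": Decimal("99"), "cust_b": Decimal("99")}


def margin_by_customer(cost_events, revenue):
    """Return {customer_id: (total_cost, gross_margin_fraction)}."""
    costs = defaultdict(Decimal)
    for customer_id, _driver, cost in cost_events:
        costs[customer_id] += cost
    report = {}
    for customer_id, paid in revenue.items():
        cost = costs[customer_id]
        report[customer_id] = (cost, (paid - cost) / paid)
    return report
```

In this made-up data both customers pay the same flat fee, but one is profitable and the other costs more to serve than they pay. That is the segmented view the flat-fee revenue line hides.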

For greenfield builders

Set up cost and usage tracking before your first customer signs up. It's far cheaper than retrofitting observability after the fact. Even ten or twenty early users generate meaningful usage patterns. Analyze what drives your infrastructure costs and what drives value for those users. That intersection is your first data-informed pricing hypothesis.

Don't copy a competitor's pricing structure without understanding your own cost model. Competitor pricing reflects their cost structure and customer mix, not yours. Use it as a reference range, not a template.

Universal

Once you have cost and usage data, metric selection becomes a data exercise: find the usage dimension that correlates best with infrastructure cost and customer value. Without this data, it's a guess.

A worked example makes this concrete. Suppose event logs show your median customer makes 40,000 API calls a month at a blended infrastructure cost of $0.0008 per call, so that customer costs roughly $32 to serve. A per-call rate of $0.002 yields about a 60% gross margin at median usage and scales with the customers who cost you the most. Without the usage data, $0.002 is a guess; with it, it is a defensible opening rate you can refine each quarter.
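The arithmetic behind that example is short enough to keep in a script and rerun whenever the inputs change. The variable names here are illustrative; the numbers are the ones from the example.

```python
from decimal import Decimal

calls_per_month = 40_000            # median customer, from the worked example
cost_per_call = Decimal("0.0008")   # blended infrastructure cost per call
price_per_call = Decimal("0.002")   # candidate opening rate

cost_to_serve = calls_per_month * cost_per_call      # monthly cost for this customer
revenue = calls_per_month * price_per_call           # monthly bill at the candidate rate
gross_margin = (revenue - cost_to_serve) / revenue   # fraction of revenue kept
```

Here the cost to serve is $32, the bill is $80, and the gross margin is 0.6. Because both cost and price are modeled as linear in calls, the margin stays at 60% at any volume under these assumptions; it only moves if your real cost per call changes with scale.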

Some platforms support pricing experimentation before you commit. Orb has explicit experimentation features (Orb Docs, accessed April 2026). Credyt supports billing model versioning, so you can change pricing rules from the dashboard and apply them to new events without touching code.

Invoice-based vs real-time billing: which architecture fits?

Invoice-based billing fits B2B SaaS billing models with predictable monthly usage and invoiced payment terms. Real-time billing is required when per-request infrastructure costs are significant or the customer relationship is prepaid rather than invoiced.

Invoice-based billing meters usage events throughout the billing period, aggregates at period end, and generates an invoice. It's the simpler architecture and adequate for B2B products with monthly billing cycles, known usage bands, and customers who pay reliably on 30-day terms. Stripe Billing, Metronome, Orb, and Lago operate on this model.
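In outline, the invoice-based flow is an aggregation over the billing period. This is a simplified sketch with a hypothetical event shape and a single illustrative per-call rate; the platforms named above have their own APIs, which are not shown here.

```python
from collections import defaultdict
from datetime import datetime, timezone
from decimal import Decimal

RATE_PER_CALL = Decimal("0.002")  # illustrative rate

# Hypothetical metered events: (customer_id, timestamp, units).
EVENTS = [
    ("cust_a", datetime(2026, 3, 2, tzinfo=timezone.utc), 1_200),
    ("cust_a", datetime(2026, 3, 19, tzinfo=timezone.utc), 800),
    ("cust_b", datetime(2026, 3, 7, tzinfo=timezone.utc), 5_000),
    ("cust_a", datetime(2026, 4, 1, tzinfo=timezone.utc), 300),  # next period
]


def invoice_totals(events, period_start, period_end, rate=RATE_PER_CALL):
    """Sum units per customer for [period_start, period_end) and price them."""
    units = defaultdict(int)
    for customer_id, ts, n in events:
        if period_start <= ts < period_end:
            units[customer_id] += n
    return {customer_id: n * rate for customer_id, n in units.items()}
```

Nothing is charged until `invoice_totals` runs at period end, which is exactly where the financial exposure of this architecture comes from.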

Real-time billing prices and charges usage events at the moment they occur. It's required when the risk profile of invoice-based billing is unacceptable: per-request infrastructure costs are significant, a single session can consume substantial value, or the customer relationship is prepaid rather than invoiced. An AI API product where a single session makes 500 AI model calls in 30 seconds is a clear case. Aggregating that usage and invoicing at month end means the platform has fronted significant infrastructure costs. If the customer churns or disputes the invoice, those costs are unrecoverable.

One more concept worth naming: real-time authorization is the Control layer that sits alongside billing. It's the platform's ability to check whether a billable action should proceed before it occurs, rather than afterward. In practice, the platform queries the customer's wallet balance through Credyt's Wallet APIs before allowing each action, then permits or blocks the action based on available balance and its own rules.

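Authorization is a separate capability from the billing architecture: billing determines when usage is charged, authorization determines whether an action may proceed. Invoice-based platforms don't provide real-time authorization at all.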
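The check-then-act pattern behind real-time authorization looks roughly like this. The `Wallet` class is an in-memory stand-in invented for this sketch; it is not Credyt's Wallet APIs, whose actual endpoints and signatures are not shown in this guide.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Wallet:
    """In-memory stand-in for a prepaid balance; not a real billing API."""
    balance: Decimal


def authorize(wallet: Wallet, estimated_cost: Decimal) -> bool:
    """Check before the action: is there enough balance to cover it?"""
    return wallet.balance >= estimated_cost


def run_billable_action(wallet: Wallet, estimated_cost: Decimal, action):
    """Gate the action on the balance check, then debit once it has run."""
    if not authorize(wallet, estimated_cost):
        return None  # blocked: the cost is never incurred
    result = action()
    wallet.balance -= estimated_cost
    return result
```

This sketch is single-threaded. With concurrent requests against the same wallet, the check and the debit would need to be atomic (for example, by reserving the amount up front), or two requests could both pass the check on a balance that only covers one.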

For a detailed walkthrough of how Stripe's usage-based billing works, including the Meter Events API and subscription item model, see our implementation breakdown.

Dimension	Invoice-based	Real-time billing
When charging occurs	End of billing period	At point of usage
Financial risk	Platform fronts costs until invoice settles	Costs covered at point of usage
Real-time authorization	Not available	Available as a Control layer
Implementation complexity	Lower	Higher
Best fit	B2B SaaS, monthly cycles, predictable usage	AI products, per-request costs, prepaid credits
Example platforms	Stripe Billing, Lago, Orb, Metronome	Credyt, homegrown wallet systems

How to implement consumption based pricing: the Observe, Control, Monetize flow

A reliable consumption pricing implementation follows three phases in order: Observe (understand costs and usage), Control (set up billing infrastructure and spend limits), then Monetize (price, launch, and iterate). If you're switching from an existing model, add the migration step inside the Monetize phase.

Observe

Instrument cost and usage tracking. Before choosing a pricing metric, log every billable event with a timestamp, customer identifier, and usage amount. For greenfield products, set this up before your first customer signs up. This data layer makes every subsequent decision (billing metric, tier structure, pricing calibration) more defensible.
Identify your billing metric from the data. Find the usage dimension that correlates best with what you cost to serve and with what customers value. Common examples: tokens (AI language models), API calls (data and integration APIs), GB processed (storage and ETL), messages or minutes (communications). Start with one metric. Add dimensions after you have evidence. For a guide on billing AI products in custom units like tokens, credits, and GPU hours, see our dedicated walkthrough.
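One way to make "correlates best" concrete is to compute the correlation between each candidate metric and per-customer cost. The sketch below uses made-up monthly totals for five customers and a hand-written Pearson coefficient; the metric names and figures are hypothetical.

```python
from math import sqrt

# Hypothetical per-customer monthly totals pulled from event logs.
# Each candidate metric is listed in the same customer order as COST.
COST = [32.0, 75.0, 12.0, 140.0, 51.0]  # infrastructure cost in USD
CANDIDATES = {
    "api_calls": [40_000, 92_000, 15_000, 180_000, 60_000],
    "seats": [5, 4, 6, 5, 3],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def rank_metrics(cost, candidates):
    """Return (metric, correlation) pairs sorted by correlation with cost, best first."""
    scores = {name: pearson(values, cost) for name, values in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

In this sample data, API calls track cost closely and seats do not, so API calls would be the first metric to test. Correlation with cost is only half the test: the metric also has to match what customers experience as valuable, which the numbers alone won't tell you.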

Control

Choose your billing infrastructure. Decide first: invoice-based or real-time billing? Then pick a platform that fits the architecture and your speed-to-market requirements. Metronome and Orb are built primarily for enterprise procurement cycles, with sales-led onboarding and significant implementation overhead; if you need to ship in weeks, verify the integration story before you commit.
Instrument metering and spend controls. Add event tracking with idempotent writes and at-least-once delivery, so duplicate events are handled gracefully by the billing layer rather than by you. If using real-time billing, implement real-time authorization: query the customer's wallet balance through Credyt's Wallet APIs before allowing each action, and permit or block the action based on available balance. This gives the platform the ability to gate costs before they are incurred rather than reconciling them after the fact.
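Idempotent writes usually come down to an idempotency key on every event, so that a retried delivery is recognized and ignored. A minimal in-memory sketch, with hypothetical class and field names (a production meter would persist the keys in a durable store):

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class UsageEvent:
    event_id: str        # idempotency key, generated once by the producer
    customer_id: str
    units: int
    timestamp: datetime


class Meter:
    """In-memory meter that ignores replays of an event it has already stored."""

    def __init__(self):
        self._seen = {}  # event_id -> UsageEvent

    def ingest(self, event: UsageEvent) -> bool:
        """Store the event; return False if this event_id was already ingested."""
        if event.event_id in self._seen:
            return False
        self._seen[event.event_id] = event
        return True

    def total_units(self, customer_id: str) -> int:
        """Sum the units of all stored events for one customer."""
        return sum(e.units for e in self._seen.values()
                   if e.customer_id == customer_id)
```

With at-least-once delivery the producer can safely resend an event after a timeout: the second `ingest` call returns `False` and the customer's total is unchanged.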

Monetize

Configure pricing models. Set up per-unit rates, tiers, or hybrid plans. Use your observability data to calibrate opening rates. Configure billing model versioning so you can iterate pricing without re-engineering the billing infrastructure. For greenfield teams, this is your launch configuration. For migration teams, treat it as a candidate model pending validation against existing customer data.
[Migration path only] Model the revenue impact. Using actual usage data per customer, project what each customer would pay under the new pricing. Build three scenarios: low usage (bear), average (base), high usage (bull). Flag customers whose projected bill moves by more than 20% as high-priority cases for your migration plan. Only execute migration once the data supports the pricing metric. See the migration section below.
Monitor, iterate, and A/B test. Track revenue per customer, usage patterns, and churn signals for the first 60 to 90 days. The opening pricing configuration is a hypothesis. Treat it as one. Consumption based pricing done well is a living pricing strategy, reviewed quarterly.
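The bear, base, and bull scenarios from the revenue-impact step can be derived from each customer's usage history. This sketch assumes the scenarios are the lowest, mean, and highest month in a trailing window; that definition, the rate, and the function name are assumptions made for illustration, and percentiles would work equally well.

```python
from decimal import Decimal
from statistics import mean

RATE = Decimal("0.002")  # illustrative per-unit rate


def scenarios(monthly_units, rate=RATE):
    """Project a monthly bill for the lowest, mean, and highest usage month."""
    return {
        "bear": min(monthly_units) * rate,
        "base": Decimal(round(mean(monthly_units))) * rate,
        "bull": max(monthly_units) * rate,
    }
```

For a customer whose last three months were 30,000, 40,000, and 50,000 units, this gives projected bills of 60, 80, and 100. Summing each scenario across customers gives a revenue range rather than a single forecast number.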

If you have existing customers: how to migrate them

Migrating existing customers to consumption pricing follows three steps: project per-customer revenue impact under the new pricing, segment customers by impact (winners, neutrals, at-risk), then choose a migration path (grandfathering, sunset with notice, or opt-in transition). Run this sequence only after usage data confirms the pricing metric is well-calibrated. Migrating before the data is in is how teams lose customers to a model that turns out to be wrong.

If you're building a new product with no existing customers, skip ahead to the platform selection section.

Segment by projected impact. Using actual customer usage data, build a projection of what each customer would pay under the new pricing. Divide customers into three groups: winners (pay the same or less, low adoption risk), neutrals (minimal change), and at-risk customers whose bill increases by more than 20% (churn risk). Each group gets a different migration approach.
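The segmentation rule can be written down directly. In this sketch the 20% threshold comes from the text above; treating any increase of up to 20% as "neutral" is an assumption made here to give the middle group a precise boundary, and the function name is hypothetical.

```python
from decimal import Decimal

THRESHOLD = Decimal("0.20")  # at-risk cutoff: bill increases by more than 20%


def segment(current_bill: Decimal, projected_bill: Decimal) -> str:
    """Classify a customer by the relative change in their bill."""
    change = (projected_bill - current_bill) / current_bill
    if change > THRESHOLD:
        return "at-risk"
    if change <= 0:
        return "winner"   # pays the same or less
    return "neutral"      # assumed band: increase of up to 20%
```

Run it over every customer's current and projected bill and count the groups before picking a migration path: a base that is mostly winners and neutrals supports a faster sunset than one with a large at-risk group.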

Three migration options:

Grandfathering. Existing customers stay on legacy pricing indefinitely. New customers onboard to the new model. Simple to execute. Creates a two-tier pricing architecture that eventually needs to be collapsed.
Sunset with notice. Migrate all customers by a defined date, typically 90 to 180 days out. Requires proactive communication and a clearly documented rollback trigger.
Opt-in transition. Offer the new pricing as an opt-in. Track adoption. Sunset the legacy plan once uptake reaches a defined threshold.

Communication principle. Frame the change as value alignment: customers now pay for what they use, rather than a flat fee regardless of usage. This framing holds for both customers paying more and customers paying less. It's more durable than framing the change as a price increase.

Define a rollback trigger before go-live. If early churn signals exceed a threshold, what happens? Document this before you need to decide under pressure.

For more on the revenue recognition implications of switching from flat-fee to consumption models, see our guide on revenue recognition for usage-based billing.

What to look for in a consumption billing platform

When evaluating a consumption billing platform, six criteria determine fit: architecture support, speed to market, pricing model flexibility, built-in observability, a self-serve customer portal, and metering reliability. The right choice depends on your architecture requirements and your actual timeline, not on which platform has the most enterprise logos.

Architecture support. Does the platform support your chosen model: invoice-based or real-time billing? Not all platforms support both.
Speed to market. Does it have a documented fast integration path measured in hours to days, or does onboarding require a sales process and a professional services engagement? Metronome and Orb are built for enterprise procurement cycles, which fits enterprise buyers but is a real constraint if you're not one.
Pricing model flexibility. Can you start with per-unit pricing and add tiers, subscriptions, or hybrid plans without re-architecting? Pricing evolves. Your billing infrastructure shouldn't require a rebuild each time it does.
Observability built in. Does the platform give you per-customer cost visibility and usage analytics natively, or do you need to build a separate data pipeline to understand your own unit economics?
Customer portal. Does it ship a self-serve billing portal (balance visibility, usage history, top-up) out of the box, or is that a separate build?
Metering reliability. Does it handle idempotent event ingestion and deduplication natively, or do you need to build that reliability layer yourself?

Frequently asked questions

What is the difference between consumption based pricing and usage based pricing?

In practice they describe the same mechanic: meter the event, rate it, and charge in proportion to actual use. Usage based pricing is the broader umbrella term; consumption based pricing emphasizes drawing down a measured resource such as credits, compute, or tokens. Most teams use the terms interchangeably, and the implementation steps are identical.

What is the difference between consumption based pricing and subscription pricing?

Subscription pricing charges a fixed fee per period regardless of how much the customer uses. Consumption based pricing charges in proportion to use. The two are not mutually exclusive: hybrid pricing combines a fixed subscription with usage charges on top, or a base allowance with metered overage above a threshold.

Is consumption based pricing a good fit for every SaaS product?

No. It fits when the cost to serve scales with usage and the billing metric maps to what customers experience as valuable, which is why it dominates AI, API, and infrastructure products. It is a weaker fit when usage is flat and predictable across customers, where a flat subscription is simpler for both sides and the variable revenue adds forecasting overhead without a matching benefit.

How do you forecast revenue with consumption based pricing?

Use cohort usage data rather than a single average. Project each customer's bill from their actual usage, then build three scenarios: low usage (bear), average (base), and high usage (bull). After launch, track revenue per customer for the first 60 to 90 days and review the pricing configuration quarterly, because the opening rates are a hypothesis, not a fixed contract.

How long does it take to implement consumption based pricing?

The bottleneck is rarely the billing configuration. It is observability (instrumenting per-customer cost and usage tracking) and, for existing products, customer migration. With a platform that ships metering, real-time authorization, and a customer portal, the billing layer is a matter of weeks; building those pieces in-house is a matter of months.

What is hybrid pricing?

Hybrid pricing combines a fixed subscription with usage charges, or a base subscription that includes a usage allowance with metered overages above a threshold. It is the most common end state for consumption based products because the subscription provides a predictable revenue floor while the usage component keeps charges aligned with what each customer actually consumes.

Next steps

Consumption based pricing done well runs on a simple sequence: Observe first, Control second, Monetize when the data is ready. The infrastructure you choose has to support all three phases, or you end up with a billing tool that silently outsources the other two back to engineering.

That's what the Credyt platform is built for. Per-customer cost attribution streams in real time, so choosing a pricing metric is a data exercise, not a guess. Hard budget ceilings let you set a maximum spend per customer. Separately, your platform performs real-time authorization by querying wallet balance via Credyt's Wallet APIs before each action, so it can gate costs before they are incurred rather than reconciling them after the fact.

Pricing rules (per-unit, tiered, volume, hybrid) are configured in the dashboard and versioned, so a pricing change applies to new events without a deploy. Integration is API-first with Node.js and Python SDKs. If you already live in Cursor or Claude Code, Credyt also ships a Model Context Protocol (MCP) server, so you can wire billing in with a single prompt.

What is usage-based billing?: Definition-first explainer covering pricing model mechanics and when each model fits.
Why AI companies need real-time economic control: The case for real-time billing over invoice-based architecture for products with per-request costs.
How to bill AI products in custom units instead of dollars: Implementing token and credit-based pricing metrics in practice.

See Credyt's live per-wallet rates.