How do AI credit systems work?

An AI credit system is the infrastructure behind credit-based pricing. It charges customers in product-defined units that map to underlying model and compute cost, debits each action from a pre-funded balance, and gives the product control of margin through the credit-to-cost ratio. This article covers how credit systems work, how to price them, and how to keep margins intact in real time.

What is credit-based pricing?

Credit-based pricing is a monetization model where customers buy a balance of product-defined units, and each action in the product debits credits from that balance. The credit is an abstraction layer between what the customer pays and what the underlying AI services actually cost. It is one of the building blocks of real-time monetization for AI products.

It is one of four monetization models in current use across AI products. Kyle Poyar's OpenView analysis (January 2025) found 25% of AI companies using pure usage-based pricing, 22% on hybrid models, and 7% on outcome-based pricing, with the rest on subscription or per-seat plans that often layer in metered overage. Credits sit inside the usage-based and hybrid buckets. For a tighter definition of the unit itself, see the glossary entry for AI credits.

The pattern shows up across consumer and prosumer AI products. Adobe Firefly sells generative credits where a standard image costs 1 credit and a high-quality Firefly Image 4 Ultra render costs 20. ElevenLabs charges 1 credit per character on its Multilingual model and 0.5 credits per character on its faster Flash and Turbo models. Lovable launched a complexity-weighted credit system in July 2025, where a simple change costs 0.5 credits and an authentication flow costs 1.2 credits. Cursor's Pro plan switched in mid-2025 from a flat 500 requests per month to a $20 credit pool priced at API rates, where the same balance buys roughly 225 Claude Sonnet 4 requests, 550 Gemini requests, or 650 GPT-4.1 requests.

In each case the customer sees credits, not tokens or GPU-seconds. The product owns the abstraction.

How AI credit systems work end-to-end

An AI credit system works by checking a customer's pre-funded credit balance before each model call, debiting the cost of the action if the balance covers it, and declining the call if it does not. It runs on four components: the unit definition, the per-action cost, the running balance, and the pre-call authorization gate.

The unit. The product defines what a credit is and what it represents. It can be derived from a token (1 credit equals $0.01, as in GitHub Copilot) or from an action (1 credit equals one standard image, as in Adobe Firefly). The unit is the product's choice.

The price per action. Each action in the product carries a credit cost. A short prompt might cost 1 credit; a long-form completion might cost 12. ElevenLabs's two-tier burn rate (1 credit per character on Multilingual, 0.5 on Flash) is one way to encode the cost differential. Lovable's complexity weighting (0.5 credits for a simple change, 1.2 for an authentication flow) is another. Both move the underlying compute cost into the customer-facing unit.
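A minimal sketch of how the two pricing styles above could be encoded as lookup tables. The rates are the ones quoted in this article; the table names, keys, and function names are hypothetical and are not taken from either vendor's API.

```python
from decimal import Decimal

# Credits per character, using the ElevenLabs rates quoted above.
# The keys are illustrative labels, not real model identifiers.
CREDITS_PER_CHAR = {
    "multilingual": Decimal("1"),
    "flash": Decimal("0.5"),
}

# Credits per action, using the Lovable weights quoted above.
CREDITS_PER_ACTION = {
    "simple_change": Decimal("0.5"),
    "auth_flow": Decimal("1.2"),
}


def speech_cost(model: str, text: str) -> Decimal:
    """Per-character pricing: the cost scales with input length."""
    return CREDITS_PER_CHAR[model] * len(text)


def action_cost(action: str) -> Decimal:
    """Complexity-weighted pricing: the cost is fixed per action type."""
    return CREDITS_PER_ACTION[action]
```

Decimal is used instead of float so that fractional credits such as 0.5 and 1.2 add up exactly in a running balance.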

The debit and the balance. When the customer takes an action, the system debits the cost from their credit balance. The balance falls until the customer tops up, their plan refreshes, or the balance hits zero. The balance is the customer's view of remaining capacity. It is also the product's primary defense against runaway cost.

Per-usage authorization. This is the component that makes the other three enforceable. A credit balance is only meaningful if the system checks it before the model call, not after. If the system reconciles usage at the end of the billing period and bills in arrears, the credit is decorative: the customer has already incurred the cost, and the credit is bookkeeping. A working credit system bills in real time against a pre-funded wallet balance, and the model call proceeds only if the balance allows it.

This is the architectural choice that defines whether a credit system survives heavy users.
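The four components can be sketched together as a small in-memory wallet with a pre-call gate. This is an illustration under stated assumptions, not a production design: `Wallet`, `InsufficientCredits`, and `run_action` are hypothetical names, the balance lives in memory, and there is no handling of concurrent requests.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Callable


class InsufficientCredits(Exception):
    """Raised when the balance does not cover the action."""


@dataclass
class Wallet:
    balance: Decimal  # pre-funded credits remaining

    def authorize(self, cost: Decimal) -> bool:
        # The gate: does the balance cover this action?
        return self.balance >= cost

    def debit(self, cost: Decimal) -> None:
        if not self.authorize(cost):
            raise InsufficientCredits(f"need {cost}, have {self.balance}")
        self.balance -= cost


def run_action(wallet: Wallet, cost: Decimal, model_call: Callable[[], str]) -> str:
    # Check BEFORE the model call so no compute cost is incurred on a decline.
    if not wallet.authorize(cost):
        raise InsufficientCredits(f"need {cost}, have {wallet.balance}")
    result = model_call()
    # Debit only after the call returns, so a failed call is not charged.
    wallet.debit(cost)
    return result
```

A real system would likely place a hold on the credits at authorization time and settle it afterward, so that two concurrent requests cannot both pass the gate against the same balance.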

Why AI products use credits instead of subscriptions or raw tokens

AI products use credits instead of flat subscriptions because flat plans collapse when per-request cost varies 10x or more across users, and instead of raw token pricing because tokens are illegible to non-technical buyers. Credits avoid both failure modes by abstracting compute cost into a unit the customer can budget and the vendor can price with margin headroom.

Model	How it works	Failure mode	Best fit
Flat subscription	One price covers any usage in the period.	Heavy users cost 10x or 100x light users; margin collapses on the tail.	Products with low per-request cost variance.
Raw token pass-through	Customer pays provider rates per token, often with a markup.	Illegible to non-developer users; vendor has no margin lever between provider price changes.	Developer and API products.
Credit-based pricing	Product-defined units; each action debits the balance at a credit-to-cost ratio.	Customer must trust the credit-to-token mapping; needs real-time authorization to work.	Consumer, prosumer, and multi-feature AI products.

The same dynamic shows up in production. Cursor learned the flat-rate problem directly under its $20 Pro plan, where the company was subsidizing long-horizon agent tasks at provider rates that bore no relation to the headline price. It moved to a credit pool and triggered a backlash, but did not revert. Replit's gross margin swung from 36% to negative 14% on a flat per-checkpoint plan as its AI agent began consuming more LLM resources than the price covered. The fix in both cases was credits, not lower prices.

Survey data follows the same arc. A 240-company survey by Kyle Poyar / Growth Unhinged (January 2026) found flat-fee share dropping from 29% to 22% year over year, with hybrid pricing rising from 27% to 41%. The PricingSaaS Q1 2026 Trends Report counted 1,800 pricing changes across 498 tracked SaaS companies in 2025, with adoption of credit-based pricing models growing 126% year over year (35 companies to 79).

The raw-token alternative carries its own problem. Developer and API products price in tokens because their customers can reason in tokens; consumer and prosumer buyers cannot. Credits abstract the token math into a unit the buyer can budget. The downside is that the credit-to-token mapping has to feel fair to the customer, which is a separate product problem. For products mixing multiple models or services under one balance, credits are usually the cleaner abstraction. Products that prefer the transparent path can run billing in custom units instead of dollars, where the unit is closer to the underlying provider's cost structure.

At maturity, hybrid models win. ICONIQ's 2026 State of AI snapshot shows 58% of AI companies running a subscription component, 35% running consumption pricing, and 18% running outcome-based. The dominant pattern is subscription floor plus credit overage. The subscription gives the buyer budget predictability; the credit layer gives the vendor a metered surface for actual compute cost.

Pricing an AI credit system: the credit-to-cost ratio

Pricing an AI credit system comes down to the credit-to-cost ratio: the retail price of the credits an action consumes, divided by what that action costs to serve. It is the main margin lever.

If a single generation costs the product $0.04 in model and compute spend and the credit that pays for it retails at $0.10, the ratio is 2.5x and the gross margin on that action is 60%. Protecting margin means setting the ratio against the worst-case action the customer can take, not the average. If the cheapest model the product runs costs $0.02 per action and the most expensive costs $0.40, a price set against an average dominated by cheap actions leaves the heavy-model action in the red.
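The arithmetic in this paragraph can be written out as a short sketch. The dollar figures come from the text; the function names are hypothetical, and the worst-case line assumes the heavy-model action is charged the same single $0.10 credit as the cheap one.

```python
from decimal import Decimal


def credit_to_cost_ratio(credit_price: Decimal, action_cost: Decimal) -> Decimal:
    """Retail price of the credits spent, divided by the cost to serve."""
    return credit_price / action_cost


def gross_margin(credit_price: Decimal, action_cost: Decimal) -> Decimal:
    """Gross margin as a fraction of the retail price."""
    return (credit_price - action_cost) / credit_price


def price_for_ratio(action_cost: Decimal, ratio: Decimal) -> Decimal:
    """Retail price needed to hold a given ratio on a given action."""
    return action_cost * ratio


credit_price = Decimal("0.10")

# The single-generation example: $0.04 cost against a $0.10 credit.
typical_ratio = credit_to_cost_ratio(credit_price, Decimal("0.04"))
typical_margin = gross_margin(credit_price, Decimal("0.04"))

# The same $0.10 charged for the $0.40 heavy-model action loses money.
worst_case_margin = gross_margin(credit_price, Decimal("0.40"))

# Price the heavy action would need in order to keep the same ratio.
worst_case_price = price_for_ratio(Decimal("0.40"), typical_ratio)
```

Under these assumptions the heavy action needs to cost ten credits, not one, to hold the same ratio, which is the argument for pricing each action against its own cost.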

This is where the tail risk enters. Todd Gagne's analysis at Wildfire Labs (March 2026) shows that a customer who is profitable at the median can flip to a monthly loss at the 90th percentile as query volume compounds. Intercom's AI agent Fin generates single-customer bills ranging from $50 to $30,000 per month depending on resolution rate. The pricing has to survive the 90th percentile.

Model price drift makes the ratio a living number. The PricingSaaS Q1 2026 Trends Report shows Anthropic cut Opus output prices by 67% (from $75 to $25 per million tokens) and OpenAI cut its primary GPT output price by 20%. For a product that priced credits against the old model costs, those cuts leave a choice: pass the savings through to the customer, or keep the difference as margin. ICONIQ reports 37% of AI companies plan another pricing change within twelve months.

Two practical decisions sit on top of the ratio.

Packs vs included credits. Included monthly credits drive adoption. The customer gets a free trial of the heavy actions inside their plan. Prepaid AI credits sold as top-up packs drive expansion revenue. Lovable runs both: 30 free credits monthly, 100 credits at $25 on Pro, top-ups above. Adobe Firefly runs 2,000 premium credits at $9.99 (Standard) and 4,000 at $29.99 (Pro), with no monthly rollover. The expiry choice is itself a margin lever. Adobe's no-rollover policy means unused credits are pure revenue.

Expiry, grants, and revenue recognition. When credits expire, when promotional grants stack, when refunds get issued, these are accounting events as much as product events. The accounting question is when a sold credit becomes recognized revenue. For background on the rules, see the article on credit revenue recognition.

Designing one in practice: five considerations

If you are building a credit system into an AI product, five decisions matter more than the rest.

Pick the right unit. Token-derived credits (1 credit equals $0.01 at GitHub Copilot) keep the math transparent for technical buyers. Action-derived credits (1 image equals 1 credit) keep the math legible for consumer users. Mixing them is hard.
Set the credit-to-cost ratio with explicit margin headroom. Assume model prices will drift. Build in enough headroom that a 20% provider price increase does not flip gross margin negative.
Authorize per usage, before the model call. A balance check that runs after the request is a logging system. A balance check that runs before the request is a control. Invoice-based credit systems fail outright the first time a heavy user hits them.
Plan for grants, expiry, refunds, and promo stacking on day one. Retrofitting these later means rebuilding the wallet. Stacked grants (included, promotional, purchased) are the structure most credit systems converge on.
Use one balance for credits, USD top-ups, and any custom asset. If the customer sees credits in one place and a separate USD balance in another, they will be confused. A single wallet that holds heterogeneous assets keeps the surface area simple.

The last item is where most homemade credit systems struggle. A wallet that holds one type of value is straightforward. A wallet that holds three (credits, dollars, and an asset like tokens or GPU minutes) with grant stacking and per-usage authorization is a multi-month engineering investment.
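To give a sense of where the complexity comes from, here is a minimal sketch of grant stacking on a single credit balance. The `Grant` type and both functions are hypothetical, and the consumption order (soonest-expiring first, non-expiring last) is an assumption chosen for illustration; a real system would make that order configurable.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import List, Optional


@dataclass
class Grant:
    kind: str                         # "included", "promotional", or "purchased"
    remaining: Decimal
    expires_on: Optional[date] = None  # None means the grant never expires


def live_grants(grants: List[Grant], today: date) -> List[Grant]:
    """Grants that have not expired as of `today`."""
    return [g for g in grants if g.expires_on is None or g.expires_on >= today]


def available(grants: List[Grant], today: date) -> Decimal:
    return sum((g.remaining for g in live_grants(grants, today)), Decimal("0"))


def debit(grants: List[Grant], cost: Decimal, today: date) -> None:
    """Spend `cost` credits across grants, soonest-expiring first."""
    if available(grants, today) < cost:
        raise ValueError("insufficient credits")
    # Assumed order: expiring grants by date, then non-expiring grants.
    ordered = sorted(
        live_grants(grants, today),
        key=lambda g: (g.expires_on is None, g.expires_on or date.max),
    )
    for grant in ordered:
        take = min(grant.remaining, cost)
        grant.remaining -= take
        cost -= take
        if cost == 0:
            break
```

Even this toy version has to decide what counts as expired, which grant burns first, and what happens on a shortfall. Refunds, multiple asset types, and concurrent debits each add another layer.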

How Credyt handles credit-based pricing

Credyt is built to run credit systems for AI products. Each customer wallet holds credits as a native asset alongside USD and any custom asset the product defines (tokens, GPU minutes, message units). The platform checks the customer's balance via Credyt's Wallet APIs before the model call and authorizes or blocks the action. If authorized, Credyt prices the usage event and debits the credits from the wallet in real time. By default, Credyt allows a wallet to go negative; the platform can activate wallet controls to hard-block at zero.

Grants stack natively. Included credits, promotional grants, and purchased top-ups co-exist on the same wallet with their own expiry rules and consumption order. Credyt also captures the underlying vendor cost per usage event, so the product sees real gross margin on every credit spent and can manage the credit-to-cost ratio against live data, not last-quarter's pricing.

The branded billing portal gives customers self-serve top-ups and a live balance view without the product team building any of it. See Credyt's billing platform.

Frequently asked questions

How is credit-based pricing different from usage-based billing?

Usage-based billing is the umbrella concept. It is pricing that scales with consumption. Credit-based pricing is one structure inside it, where consumption is measured in a product-defined unit rather than a raw provider unit like a token or an API call. The credit is the abstraction; usage-based is the category.

When should I use credits instead of a subscription?

Credits fit when per-request cost varies enough that a flat subscription cannot survive heavy users. If your heaviest user costs at least 10x as much to serve as your lightest user, a flat plan will either subsidize heavy users at the expense of margin or pad the price enough that light users churn. Credits let the price track the cost.

How do I set the credit-to-cost ratio?

Start from the most expensive action the customer can take, not the average. Build in enough headroom to absorb a 20% provider price increase without flipping margin negative. Walk through worked margin math on the worst-case user, not the median.
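A hedged sketch of that headroom check, using the 20% figure above and the $0.04 and $0.40 action costs from the pricing section. The function names and the 60% target margin are illustrative assumptions, not recommendations.

```python
from decimal import Decimal


def margin_after_increase(price: Decimal, cost: Decimal, increase: Decimal) -> Decimal:
    """Gross margin on an action after provider cost rises by `increase`."""
    new_cost = cost * (Decimal(1) + increase)
    return (price - new_cost) / price


def min_price(worst_cost: Decimal, increase: Decimal, target_margin: Decimal) -> Decimal:
    """Lowest retail price for the worst-case action that still holds
    `target_margin` after provider cost rises by `increase`."""
    if target_margin >= 1:
        raise ValueError("target_margin must be below 1")
    return worst_cost * (Decimal(1) + increase) / (Decimal(1) - target_margin)


increase = Decimal("0.20")

# A $0.04 action sold for $0.10: margin after a 20% cost increase.
typical_after = margin_after_increase(Decimal("0.10"), Decimal("0.04"), increase)

# Worst-case $0.40 action: break-even price after the increase.
break_even = min_price(Decimal("0.40"), increase, Decimal("0"))

# Worst-case action: price that still holds a 60% margin after the increase.
with_headroom = min_price(Decimal("0.40"), increase, Decimal("0.60"))
```

Running the check on the worst-case action, not the typical one, is what keeps the heaviest user from being the one who flips the margin negative.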

What happens when a customer's credit balance hits zero?

It depends on what the platform does. In Credyt the default is to let the wallet go negative, and the platform decides whether to permit or deny the action by checking the balance via the Wallet APIs first.

At the product-design level there are three options. Hard-block the next action and surface a top-up prompt. Soft-cap into a paywall or warning state. Or burst-then-bill in arrears for the overage. The first protects margin most directly. The third is the closest to invoice-based billing and reintroduces the failure mode credits are supposed to prevent.
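The three options can be sketched as a policy switch. Everything here is hypothetical naming. The soft cap is modeled as a small grace allowance below zero with a warning, which is one possible reading of a warning state and not the only one.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum


class ZeroBalancePolicy(Enum):
    HARD_BLOCK = "hard_block"
    SOFT_CAP = "soft_cap"
    BURST_THEN_BILL = "burst_then_bill"


@dataclass
class Decision:
    allowed: bool
    warn: bool        # surface a top-up prompt or warning to the customer
    overage: Decimal  # credits to bill in arrears, if any


def decide(
    balance: Decimal,
    cost: Decimal,
    policy: ZeroBalancePolicy,
    grace: Decimal = Decimal("0"),
) -> Decision:
    zero = Decimal("0")
    if balance >= cost:
        return Decision(True, False, zero)
    if policy is ZeroBalancePolicy.HARD_BLOCK:
        # Decline and prompt for a top-up; no cost is incurred.
        return Decision(False, True, zero)
    if policy is ZeroBalancePolicy.SOFT_CAP:
        # Assumed behavior: allow while the balance stays within `grace`
        # credits below zero, and always warn the customer.
        within_grace = (balance - cost) >= -grace
        return Decision(within_grace, True, zero)
    # BURST_THEN_BILL: allow and record the uncovered part as overage.
    return Decision(True, False, cost - max(balance, zero))
```

Only the hard block guarantees that no unfunded model call runs. The other two policies trade some margin protection for a smoother customer experience.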

Do credits need to be authorized in real time?

Yes. A credit balance that is reconciled only after the model call has incurred the cost is a billing report, not a credit system. The architectural value of credits is the pre-call gate. Reconcile-after credit systems fail the first time a heavy user spikes usage.
