Prepaid Credits, Wallets, and Top-Ups: Why You Shouldn't Build Your Own

Q: Can different credit grants expire at different rates?

Yes. Each credit grant has its own expire_in_days setting. A 30-day signup bonus and a 90-day enterprise top-up can coexist in the same wallet. The debit algorithm always consumes the soonest-expiring credits first, so the 30-day bonus burns before the 90-day credit, without any configuration required on your end. See prepaid and promotional credits.

Q: What happens if an auto-recharge invoice goes unpaid?

When invoicing is set to true, the top-up creates a transaction in pending status, and the credits are not added to the wallet balance until the invoice is paid. While the invoice stays unpaid, the credits stay pending and the customer's balance doesn't increase. A second auto-recharge won't trigger for the same threshold crossing because the pending transaction is already in flight. Once the invoice is paid (or marked succeeded), the transaction moves to completed and the credits become available.

Q: Can a customer have two wallets in the same currency?

Yes, with scoped charge types. You can have a USAGE wallet and a FIXED wallet both in USD; they serve different purposes and Flexprice routes charges to the right one automatically. What you can't do is have two identical ALL wallets in the same currency for the same customer, since that creates ambiguous routing.

Q: How does priority work when a customer has both expiring and non-expiring credits?

The debit algorithm applies two ordering rules in sequence. First, it filters out expired credits. Then, among valid credits, it consumes expiry-soonest-first, regardless of priority. Priority only applies when two credits have the same expiry date (or both have no expiry). In practice, this means expiry date always trumps priority, which is the behavior customers expect: their free trial credits burn before their purchased credits, not the other way around.
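The two ordering rules can be expressed as a single sort key. The sketch below is a minimal illustration, not Flexprice's implementation: the Credit class, its field names, and the choice to treat a credit as still valid on its expiry date are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Credit:                      # hypothetical record, for illustration only
    name: str
    expires_on: Optional[date]     # None means the credit never expires
    priority: int                  # lower number wins when expiry dates tie

def burn_order(credits: list[Credit], today: date) -> list[Credit]:
    """Drop expired credits, then sort by expiry (soonest first), then priority."""
    # Assumption: a credit is still usable on the day it expires.
    valid = [c for c in credits if c.expires_on is None or c.expires_on >= today]
    # Non-expiring credits sort after every dated credit; priority only breaks ties.
    return sorted(
        valid,
        key=lambda c: (c.expires_on is None, c.expires_on or date.max, c.priority),
    )

credits = [
    Credit("purchased", None, 1),
    Credit("trial", date(2026, 5, 31), 5),
    Credit("old_promo", date(2026, 4, 1), 1),   # already expired on May 2
    Credit("referral", date(2026, 5, 31), 2),
]
order = [c.name for c in burn_order(credits, today=date(2026, 5, 2))]
print(order)  # ['referral', 'trial', 'purchased']
```

Note that "purchased" has the best priority number but still burns last, because it has no expiry date.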

Q: What's the difference between a credit grant and a wallet top-up?

A credit grant is a rule attached to a plan or subscription that automatically issues credits on a schedule. A top-up is a one-time manual or triggered credit addition to a specific wallet. Grants are configuration; top-ups are transactions. An enterprise plan that includes $500 of credits per month uses a recurring grant. A customer adding $200 to their account because they're running low uses a top-up (manual or auto-triggered). Both write to the same wallet ledger. The distinction is in how they originate and whether they repeat.

May 2, 2026

• 7 min read

Aanchal Parmar

Product Marketing Manager, Flexprice

Almost every AI product we talk to wants to offer credits. It's the right call: customers want prepaid spend control, usage visibility, and the ability to top up on their own terms. The question teams usually get wrong isn't whether to offer credits. It's whether to build the system themselves.

The build estimate is almost always two to three weeks. The actual timeline is four to six months. Not because teams are slow, but because a credit system is not a counter. It's a ledger with concurrent write requirements, expiry ordering, idempotency guarantees, multi-pool routing logic, and an audit trail that finance will ask about in month nine when you close your first enterprise deal.

By the time most teams figure this out, they've already shipped. And they've already inherited the maintenance.

Here's exactly what makes it harder than it looks, and what a purpose-built wallet system handles instead.

The six problems that eat your engineering time

Concurrent deductions. Race conditions on a wallet balance aren't rare. Any product with real API traffic will hit concurrent writes to the same wallet. The correct fix requires advisory locks at the database level and serializable transaction isolation rather than application-level mutexes, which fail across multiple service instances. Handling 20,000 debits per minute without corruption also requires careful index design and query tuning, which most teams discover only after hitting production.
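To make the principle concrete, here is a minimal sketch using SQLite from Python's standard library. It is deliberately not the advisory-lock approach described above. It uses a simpler technique, a single conditional UPDATE, to show the core idea that the balance check and the write must happen as one atomic database operation rather than as a read followed by a write in application code. The table and column names are hypothetical.

```python
import sqlite3

def debit(conn: sqlite3.Connection, wallet_id: str, amount_cents: int) -> bool:
    """Debit only if funds suffice; the check and the write are one statement."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE wallets SET balance_cents = balance_cents - ? "
            "WHERE id = ? AND balance_cents >= ?",
            (amount_cents, wallet_id, amount_cents),
        )
    # rowcount is 0 when the WHERE clause rejected the debit
    return cur.rowcount == 1

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wallets (id TEXT PRIMARY KEY, balance_cents INTEGER NOT NULL)"
)
conn.execute("INSERT INTO wallets VALUES ('w1', 1000)")
conn.commit()

# Two $6 debits against a $10 balance: only the first may succeed.
results = [debit(conn, "w1", 600), debit(conn, "w1", 600)]
balance = conn.execute(
    "SELECT balance_cents FROM wallets WHERE id = 'w1'"
).fetchone()[0]
print(results, balance)  # [True, False] 400
```

A read-then-write version (SELECT the balance, compare in Python, then UPDATE) would let two concurrent requests both pass the check. A single-row conditional update stops being enough once a debit has to touch several credit tranches, which is where locking and isolation levels come in.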

Expiry-first deduction ordering. Credits are not fungible. A customer might have $100 from a March onboarding bonus expiring April 30, and $200 in purchased credits with no expiry. When they spend $50, which credits go first? The correct answer is the ones expiring soonest. If you burn purchased credits first, the customer loses their onboarding bonus at the end of the month without realizing it, and blames you for it.

Implementing this requires storing credits_available per credit tranche, not just a running total. Every debit query has to order by expiry date and consume across multiple tranches if needed. A $100 debit might split across four credit records. Building this correctly, with proper atomicity so a failed mid-debit doesn't leave the ledger inconsistent, takes significantly longer than the initial estimate.
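A rough sketch of a tranche-splitting debit, using the onboarding-bonus example above, might look like the following. The Tranche class and function names are hypothetical, the year in the dates is made up for the example, and the in-memory "plan first, apply second" split only stands in for what would be a single database transaction in a real system.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Tranche:                       # hypothetical per-grant credit record
    id: str
    credits_available: int           # in cents
    expires_on: Optional[date]       # None = no expiry

def plan_debit(tranches: list[Tranche], amount: int, today: date) -> list[tuple[str, int]]:
    """Return (tranche_id, amount) splits, soonest expiry first; raise if short."""
    usable = [
        t for t in tranches
        if t.credits_available > 0 and (t.expires_on is None or t.expires_on >= today)
    ]
    usable.sort(key=lambda t: (t.expires_on is None, t.expires_on or date.max))
    splits, remaining = [], amount
    for t in usable:
        if remaining == 0:
            break
        take = min(t.credits_available, remaining)
        splits.append((t.id, take))
        remaining -= take
    if remaining > 0:
        # Nothing has been mutated yet, so a failed debit leaves the ledger intact.
        raise ValueError("insufficient credits")
    return splits

def apply_debit(tranches: list[Tranche], splits: list[tuple[str, int]]) -> None:
    by_id = {t.id: t for t in tranches}
    for tranche_id, take in splits:
        by_id[tranche_id].credits_available -= take

today = date(2026, 4, 10)
wallet = [
    Tranche("purchased", 20_000, None),
    Tranche("march_bonus", 10_000, date(2026, 4, 30)),
]
first = plan_debit(wallet, 5_000, today)     # $50 comes entirely from the bonus
apply_debit(wallet, first)
second = plan_debit(wallet, 12_000, today)   # $120 spans both tranches
apply_debit(wallet, second)
print(first, second)
```

The first debit returns [('march_bonus', 5000)] and the second returns [('march_bonus', 5000), ('purchased', 7000)], so the expiring bonus is fully used before any purchased credit is touched.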

Idempotency on every write. Network timeouts happen. Clients retry. If your top-up endpoint isn't idempotent, a client that retries a timed-out request creates a duplicate credit. The fix is an idempotency_key on every wallet operation: a unique key the client provides that the database uses to deduplicate. This sounds simple. Implementing it correctly across credit operations, debit operations, and auto-recharge triggers, with proper key expiry, is a week of careful work.
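One common way to get the deduplication is to let a uniqueness constraint do the work. The sketch below assumes SQLite and hypothetical table and column names. It covers only the top-up path and leaves out key expiry, which is part of what makes the full version a week of work.

```python
import sqlite3

def top_up(conn: sqlite3.Connection, wallet_id: str, amount_cents: int,
           idempotency_key: str) -> bool:
    """Apply a top-up once per key. Returns False when the key was already used."""
    try:
        with conn:  # the ledger row and the balance change commit together
            conn.execute(
                "INSERT INTO wallet_transactions "
                "(idempotency_key, wallet_id, amount_cents) VALUES (?, ?, ?)",
                (idempotency_key, wallet_id, amount_cents),
            )
            conn.execute(
                "UPDATE wallets SET balance_cents = balance_cents + ? WHERE id = ?",
                (amount_cents, wallet_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The key already exists: the transaction rolled back, the retry is a no-op.
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE wallets (id TEXT PRIMARY KEY, balance_cents INTEGER NOT NULL)"
)
conn.execute(
    "CREATE TABLE wallet_transactions ("
    "idempotency_key TEXT PRIMARY KEY, wallet_id TEXT NOT NULL, "
    "amount_cents INTEGER NOT NULL)"
)
conn.execute("INSERT INTO wallets VALUES ('w1', 0)")
conn.commit()

first = top_up(conn, "w1", 10_000, "req-42")
retry = top_up(conn, "w1", 10_000, "req-42")   # client retried after a timeout
balance = conn.execute(
    "SELECT balance_cents FROM wallets WHERE id = 'w1'"
).fetchone()[0]
print(first, retry, balance)  # True False 10000
```

The insert goes first on purpose: if the key is a duplicate, the constraint fails before the balance is touched.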

Auto-recharge threshold logic. When balance drops below $20, add $100. The implementation question is: when do you check? Checking on every debit is expensive. Running a cron is laggy: a customer can exhaust their balance between runs. Event-driven threshold detection is accurate but requires pub/sub infrastructure that most teams haven't already wired to their wallet service.

Then there's the invoicing question. If a customer's auto-recharge generates an invoice (rather than charging a card immediately), the credits should stay pending until the invoice is paid. Your balance check then has to account for both confirmed and pending balances, or a customer with a pending top-up might trigger a second auto-recharge before the first one clears.
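A minimal sketch of that check, assuming a hypothetical in-memory Wallet with separate confirmed and pending amounts, and using the $20 threshold and $100 top-up from the example above. For simplicity it checks on every debit, which is exactly the expensive option discussed earlier.

```python
from dataclasses import dataclass

@dataclass
class Wallet:                  # hypothetical model, amounts in cents
    balance: int               # confirmed credits
    pending: int = 0           # invoiced top-ups awaiting payment
    threshold: int = 2_000     # recharge when the effective balance drops below $20
    topup: int = 10_000        # add $100 per recharge

def debit(w: Wallet, amount: int) -> bool:
    """Debit, then check the threshold. Returns True if a recharge was triggered."""
    w.balance -= amount
    # Count pending credits too, so an in-flight top-up suppresses a second one.
    if w.balance + w.pending < w.threshold:
        w.pending += w.topup   # invoiced recharge: credits wait for payment
        return True
    return False

def invoice_paid(w: Wallet) -> None:
    """Move pending credits into the confirmed balance."""
    w.balance += w.pending
    w.pending = 0

w = Wallet(balance=2_500)
triggered = [debit(w, 1_000), debit(w, 500)]
invoice_paid(w)
print(triggered, w.balance, w.pending)  # [True, False] 11000 0
```

The second debit leaves the confirmed balance at $10, well under the threshold, but no second recharge fires because the pending $100 is counted in the check.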

Multi-pool credit isolation. Promotional credits, purchased credits, and plan-included credits are three different things. Promotional credits (signup bonuses, referral rewards) shouldn't cover subscription fees; they should offset usage only. Plan-included credits should burn before purchased credits. Purchased credits should carry a higher priority than promotional ones that cost you money.

Getting the routing wrong means your accounting doesn't match your product promise. "Your plan includes 10,000 free API calls per month" stops being true the moment purchased credits accidentally cover the plan's subscription cost.

Revenue recognition audit trail. Finance needs to know: was that $200 credit earned revenue or deferred? Was that $50 debit a refund or a usage charge? Every wallet transaction needs a type, a reason code, a status, and a timestamp. Free credits have different accounting treatment than invoiced credits. Pending credits (invoice not yet paid) can't be recognized the same way as completed ones. Building a ledger that your CFO and your auditors can both read takes longer than building the credits themselves.
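As a data-structure sketch, a ledger entry carrying those four fields could look like this. The enum values and reason codes are hypothetical placeholders, not Flexprice's schema, and completed_balance only shows the mechanical part (pending entries are excluded); the accounting treatment itself is a question for finance.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum

class TxType(Enum):
    CREDIT = "credit"
    DEBIT = "debit"

class TxStatus(Enum):
    PENDING = "pending"
    COMPLETED = "completed"

@dataclass(frozen=True)      # ledger rows are append-only, so keep them immutable
class LedgerEntry:
    wallet_id: str
    tx_type: TxType
    reason: str              # hypothetical reason code, e.g. "PURCHASED_CREDIT"
    status: TxStatus
    amount_cents: int
    created_at: datetime

def completed_balance(entries: list[LedgerEntry]) -> int:
    """Balance from completed entries only; pending credits are excluded."""
    total = 0
    for e in entries:
        if e.status is not TxStatus.COMPLETED:
            continue
        total += e.amount_cents if e.tx_type is TxType.CREDIT else -e.amount_cents
    return total

now = datetime.now(timezone.utc)
ledger = [
    LedgerEntry("w1", TxType.CREDIT, "FREE_CREDIT", TxStatus.COMPLETED, 5_000, now),
    LedgerEntry("w1", TxType.CREDIT, "PURCHASED_CREDIT", TxStatus.PENDING, 20_000, now),
    LedgerEntry("w1", TxType.DEBIT, "USAGE_CHARGE", TxStatus.COMPLETED, 1_200, now),
]
balance = completed_balance(ledger)
print(balance)  # 3800
```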

Three wallet models and the business each one fits

Not every credit system is the same. The model that works for an AI SaaS product will frustrate the customers of an infrastructure platform. There are three distinct patterns, each suited to a different kind of business.

Model 1: Prepaid with auto-recharge.

A customer buys $100 of credits upfront. They use those credits for API calls, completions, or whatever unit your product charges for. When their balance drops below $20, the system automatically adds another $100.

This is the right model for consumer AI apps, developer API products, and any product where customers want real-time visibility into their spend and the ability to set hard limits. The customer is always in control: they know they won't receive an unexpected invoice, and they can pause or cancel their recharge at any time.

The two variants matter. Direct recharge adds credits immediately and the payment happens in the background. Invoiced recharge creates an invoice first and holds the credits pending until payment clears. The invoiced variant works better for enterprise customers or markets where credit card payments aren't standard.

Model 2: Plan-based credit grants with expiry.

A customer subscribes to a plan. The plan includes a credit grant: say, 50,000 API tokens per month, recurring. When the billing period resets, the grant refreshes. Credits from the previous period that went unused expire.

This model fits B2B SaaS products and subscription API platforms where "included usage" is part of the plan value. The expiry matters: credits that don't expire reduce the urgency to use the product. Expiring credits drive engagement and create natural upsell moments when customers consistently burn through their grant before the month ends.

One-time grants work differently. A $50 signup bonus with a 30-day expiry is a conversion tool. A quarterly enterprise credit grant with a longer window is a retention mechanism. Both use the same underlying structure (cadence: ONETIME versus cadence: RECURRING, with expire_in_days set accordingly) but serve different commercial purposes.

Priority is the underappreciated field here. When a customer has multiple active grants (a plan grant, an enterprise top-up, and a referral bonus), credits that expire sooner always burn first; among grants that share an expiry date, or that all have no expiry, priority determines which burns first. A lower priority number means consumed first. Get this wrong and customers burn through expensive purchased credits before their free plan allocation.

Model 3: Charge-type isolated wallets.

An IoT platform charges a $200/month platform fee and a variable per-device usage rate. A customer wants to prepay for usage but keep the platform fee on card. A single wallet can't represent that distinction cleanly.

The right model is two wallets scoped to different charge types: one wallet configured to cover USAGE charges only, one configured for FIXED charges only (or a combined ALL wallet alongside a usage-specific one). When an invoice is generated with both fixed and usage line items, each wallet pays what it's configured to cover. The card covers any shortfall.
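The routing rule can be sketched in a few lines. This is an illustration under assumptions, not Flexprice's routing logic: the Wallet class is hypothetical, matching wallets are tried in list order, and the $75 usage charge and $50 wallet balance are made-up numbers added to the $200 platform fee from the example.

```python
from dataclasses import dataclass

@dataclass
class Wallet:            # hypothetical model, amounts in cents
    name: str
    scope: str           # "USAGE", "FIXED", or "ALL"
    balance: int

def settle_invoice(line_items: list[tuple[str, int]],
                   wallets: list[Wallet]) -> dict[str, int]:
    """Pay each line item from wallets whose scope covers it; card takes the rest."""
    paid = {"card": 0}
    for charge_type, amount in line_items:
        remaining = amount
        for w in wallets:
            if remaining == 0 or w.scope not in (charge_type, "ALL"):
                continue
            take = min(w.balance, remaining)
            w.balance -= take
            remaining -= take
            paid[w.name] = paid.get(w.name, 0) + take
        paid["card"] += remaining   # shortfall falls through to the card
    return paid

wallets = [Wallet("usage_wallet", "USAGE", 5_000)]
invoice = [("FIXED", 20_000), ("USAGE", 7_500)]   # $200 platform fee, $75 usage
paid = settle_invoice(invoice, wallets)
print(paid)  # {'card': 22500, 'usage_wallet': 5000}
```

The prepaid usage wallet never touches the platform fee: the card pays the full $200 plus the $25 of usage the wallet couldn't cover.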

This model suits infrastructure billing, IoT platforms, and any product where customers want to separate prepaid consumption budgets from recurring commitment costs. It's also useful for multi-tenant platforms where a reseller prepays usage credits for their end customers but keeps their own subscription on a different payment method.

How Flexprice implements each model

Prepaid with auto-recharge (wallet docs, auto top-up):

Create the wallet and set the auto-recharge configuration in one step:

bash

POST /v1/wallets
{
  "customer_id": "cust_ABC123",
  "currency": "usd",
  "conversion_rate": 1,
  "wallet_type": "PRE_PAID",
  "auto_topup": {
    "amount": 100,
    "enabled": true,
    "invoicing": true,
    "threshold": 20
  }
}

invoicing: false means credits land immediately when the threshold triggers. Set invoicing: true and the system creates an invoice first; credits stay in pending status until the invoice is marked paid. The AutoCompletePurchasedCreditTransaction setting on the tenant lets you skip the pending step for customers on pre-authorized payment plans.

Balance alerts fire automatically as the balance moves through states: ok ↔ info ↔ warning ↔ in_alarm.

Here’s what each state means:

OK: Balance is healthy, no action needed
Info: Balance crossed info threshold, informational tracking only
Warning: Balance crossed warning threshold, attention needed
In Alarm: Balance crossed critical threshold, immediate action required

Each state change sends a webhook. You configure the thresholds per wallet: three independent levels, each with a numeric threshold and a direction (above or below). One API call replaces an alerting pipeline that most teams spend two to three weeks building.
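For a sense of what that pipeline has to do, here is a small sketch of the state logic. It assumes all three thresholds use the "below" direction and that reaching a threshold exactly counts as crossing it. The function names, the thresholds dictionary, and the event payload are hypothetical and do not reflect the actual webhook format.

```python
from typing import Optional

def alert_state(balance: float, thresholds: dict[str, float]) -> str:
    """Return the most severe level whose 'below' threshold has been crossed."""
    for level in ("in_alarm", "warning", "info"):   # check most severe first
        if balance <= thresholds[level]:
            return level
    return "ok"

def on_balance_change(prev_state: str, balance: float,
                      thresholds: dict[str, float]) -> tuple[str, Optional[dict]]:
    """Return the new state and, if the state changed, an event to send."""
    new_state = alert_state(balance, thresholds)
    event = None
    if new_state != prev_state:
        event = {"from": prev_state, "to": new_state, "balance": balance}
    return new_state, event

thresholds = {"info": 50, "warning": 20, "in_alarm": 5}
state, events = "ok", []
for balance in (80, 40, 35, 3, 100):
    state, event = on_balance_change(state, balance, thresholds)
    if event is not None:
        events.append((event["from"], event["to"]))
print(events)  # [('ok', 'info'), ('info', 'in_alarm'), ('in_alarm', 'ok')]
```

The move from 40 to 35 produces no event because the wallet stays in the info state; only transitions fire.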

Plan-based credit grants with expiry (credit grant docs):

bash

POST /v1/credit-grants
{
  "name": "Starter Plan Monthly Grant",
  "scope": "PLAN",
  "plan_id": "plan_starter",
  "credit_amount": 50000,
  "currency": "USD",
  "cadence": "RECURRING",
  "period": "monthly",
  "expire_in_days": 30,
  "priority": 1
}

For one-time grants (signup bonus, referral credit), set cadence: "ONETIME". The credit applies immediately on subscription creation and doesn't repeat.

The priority field controls burn order among grants that share an expiry date; grants that expire sooner always burn first. A customer with a plan grant (priority 1) and a referral bonus (priority 2) burns plan credits first. If you want the referral bonus to burn first instead, swap the two priorities.

The build vs. buy calculation

Two mid-level engineers building a production-grade credit system typically take three to four months to ship something that holds up under real traffic. At $15,000/month per engineer, that's $90,000 to $120,000 in engineering cost before the first production bug.

After launch, the maintenance tax doesn't disappear. The Singapore timezone bug. The retry idempotency fix. The new requirement from finance to track credit reason codes for revenue recognition. The enterprise customer asking for a per-wallet credit type breakdown in their portal.

Based on what we hear from teams that have been through this, one engineer spends 20-30% of their time on the credit system after launch, permanently.

That's $3,000 to $4,500 per month in ongoing engineering cost, plus the opportunity cost of features those engineers didn't build.

Flexprice Starter is $500/month. The math is straightforward from month one.
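The arithmetic behind those figures, using only the numbers quoted above, works out as follows. The variable names are just labels for this calculation.

```python
ENGINEER_MONTHLY = 15_000      # fully loaded cost per engineer, per month
FLEXPRICE_STARTER = 500        # per month

# Build: two engineers for three to four months
build_low = 2 * ENGINEER_MONTHLY * 3
build_high = 2 * ENGINEER_MONTHLY * 4

# Maintenance: one engineer at 20-30% of their time (integer math avoids float noise)
maint_low = ENGINEER_MONTHLY * 20 // 100
maint_high = ENGINEER_MONTHLY * 30 // 100

print(f"Build: ${build_low:,} to ${build_high:,}")
print(f"Maintenance: ${maint_low:,} to ${maint_high:,} per month")
print(f"Maintenance vs. Starter: {maint_low // FLEXPRICE_STARTER}x "
      f"to {maint_high // FLEXPRICE_STARTER}x")
```

Even the low end of the maintenance estimate alone is six times the Starter price, before counting any of the build cost.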

The deeper cost is harder to quantify but easier to feel: a credit system built under a three-week estimate will cut corners the team doesn't discover until month six. The concurrent deduction race condition that only appears under load. The expiry ordering bug that only surfaces when a customer has multiple active grants. The audit log gap that becomes a problem during your first enterprise security review.

Building your own is the right call when you have requirements that no platform can meet: specific regulatory constraints, deeply custom credit semantics, or scale that puts you in a tier where first-party infrastructure is cost-justified. For most AI SaaS and API companies, that threshold is well beyond where they are when they make the build decision.

Aanchal Parmar

Aanchal Parmar heads content marketing at Flexprice.io. She has worked in content for seven years across SaaS, Web3, and now AI infra. When she’s not writing about monetization, she’s either signing up for a new dance class or testing a recipe that’s definitely too ambitious for a weeknight.
