
Koshima Satija
Co-founder & COO, Flexprice
What happens when customers start asking questions
When your customer's invoice shows a line item like "Deepgram STT: 4,231 minutes at $0.0077/min," they start asking questions. Why Deepgram? Could we switch to AssemblyAI? We saw that Whisper is free. Can we use that instead?
Suddenly, your billing system isn't just tracking costs. It's exposing your supply chain. Your customers know which providers you use and what those providers charge, and they can easily build a comparison sheet of alternatives.
In a bundled model, the customer would see "4,231 minutes of Voice AI at $0.08/min." They don't know, and don't need to know, which STT provider sits underneath. Your provider choices become an internal optimization, not a customer negotiation.
This matters because the Voice AI infrastructure market is changing fast. Your billing system needs to handle provider switches without affecting customer invoices.
In a pass-through model, every provider swap is a customer-facing billing event.
In a bundled model, it is invisible.
The reconciliation nightmare
At the end of every month, pass-through platforms face a reconciliation challenge that most billing systems aren't designed to handle.
You billed your customer based on your internal metering of their usage. But each provider has its own metering, its own rounding, and its own invoice. The numbers almost never match exactly.
Deepgram might report 4,238 minutes where your system tracked 4,231. The difference looks small, but multiplied across hundreds of customers and four providers, it adds up to meaningful discrepancies. Someone on your team has to figure out whether the gap is a rounding issue, a timezone boundary issue, a metering lag, or an actual billing error.
This reconciliation problem is unique to pass-through. Bundled platforms only reconcile their own metering against their own invoices. That means one system of record, not five.
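The comparison described above can be sketched as a simple tolerance check. This is a minimal illustration, not a production reconciler: the `reconcile` function, the `ReconciliationResult` fields, and the 0.1% tolerance are all assumptions made for the example. Only the 4,231 and 4,238 minute figures come from the text.

```python
from dataclasses import dataclass


@dataclass
class ReconciliationResult:
    provider: str
    internal_minutes: float
    provider_minutes: float
    delta_minutes: float
    delta_pct: float
    needs_review: bool


def reconcile(provider, internal_minutes, provider_minutes, tolerance_pct=0.1):
    """Compare internally metered minutes with the provider's reported total.

    tolerance_pct is a hypothetical threshold; pick one that matches the
    rounding behaviour you actually observe with each provider.
    """
    delta = provider_minutes - internal_minutes
    if internal_minutes:
        delta_pct = delta / internal_minutes * 100
        needs_review = abs(delta_pct) > tolerance_pct
    else:
        # Nothing metered internally: any provider-reported usage is suspicious.
        delta_pct = 0.0
        needs_review = delta != 0
    return ReconciliationResult(
        provider, internal_minutes, provider_minutes, delta, delta_pct, needs_review
    )


# The example from the text: the provider reports 7 more minutes than we metered.
result = reconcile("Deepgram", 4231, 4238)
print(f"{result.provider}: {result.delta_minutes:+} min "
      f"({result.delta_pct:.3f}%), review={result.needs_review}")
```

A check like this only flags the gap. Deciding whether it comes from rounding, a timezone boundary, metering lag, or a real billing error still takes a person or more detailed per-call data.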
How the smartest platforms are moving past pass-through
The orchestration platforms that maintain the healthiest margins tend to follow a pattern: they start with pass-through due to its transparency and speed-to-market, then migrate to bundled pricing as they scale.
The transition usually looks like this:
Phase 1: full pass-through
Every provider cost is visible on the customer invoice. This works when you have a small number of customers and you're still optimizing your provider stack.
Phase 2: simplified pass-through
Here, you collapse four line items into one or two. Instead of showing Deepgram, OpenAI, ElevenLabs, and Twilio separately, you show two lines such as "AI Processing" and "Telephony." This lets you absorb small cost fluctuations internally.
Phase 3: bundled pricing
In this phase, you set your own per-minute rate that includes all provider costs. Cost optimization is handled internally, and the customer sees one metric, one price, and one invoice.
Each phase requires a different billing architecture:
Phase 1 needs multi-provider metering and cost aggregation.
Phase 2 needs cost grouping logic and margin buffers.
Phase 3 needs a decoupled billing system that separates customer-facing pricing from internal cost tracking.
The billing infrastructure that can handle all three phases without a rebuild is what separates teams that actually scale from the ones that hit a wall.
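One way to picture "all three phases without a rebuild" is a single set of internal cost records rendered into three different invoice shapes. The sketch below is illustrative only: the `invoice_lines` function, the mode names, and every dollar amount are hypothetical, and it leaves out the margin buffers that Phase 2 would need.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical internal cost records for one customer:
# (provider, customer-facing group, cost in USD). All amounts are made up.
COSTS = [
    ("Deepgram", "AI Processing", Decimal("0.40")),
    ("OpenAI", "AI Processing", Decimal("1.10")),
    ("ElevenLabs", "AI Processing", Decimal("0.90")),
    ("Twilio", "Telephony", Decimal("0.60")),
]


def invoice_lines(costs, mode, minutes=None, bundled_rate=None):
    """Render the same cost records as invoice lines for a given phase."""
    if mode == "pass_through":
        # Phase 1: one line per provider.
        return [(provider, cost) for provider, _, cost in costs]
    if mode == "simplified":
        # Phase 2: collapse providers into customer-facing groups.
        totals = defaultdict(Decimal)
        for _, group, cost in costs:
            totals[group] += cost
        return list(totals.items())
    if mode == "bundled":
        # Phase 3: price is independent of the cost records entirely.
        return [("Voice AI", minutes * bundled_rate)]
    raise ValueError(f"unknown mode: {mode}")


for mode in ("pass_through", "simplified"):
    print(mode, invoice_lines(COSTS, mode))
print("bundled", invoice_lines(COSTS, "bundled", minutes=100,
                               bundled_rate=Decimal("0.08")))
```

Note that the bundled branch never reads `costs`. That is the decoupling Phase 3 depends on: the cost records still exist for internal margin tracking, but they no longer shape what the customer sees.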
What this means for your billing architecture
If you're building a Voice AI platform today, the billing decision isn't really about pass-through versus bundled. It's about whether your billing system can support the transition from one to the other.
The requirements look like this:
Multi-dimensional metering
Track minutes, tokens, characters, and API calls independently, even if you only show one metric to customers.
Provider-agnostic cost tracking
Switching from Deepgram to Whisper, or from ElevenLabs to PlayHT, should not change your customer invoices.
Decoupled pricing
Separate what you pay providers from what you charge customers. These should be independent configuration layers, not hardcoded formulas.
Margin monitoring
Get real-time visibility into per-customer, per-call margins, not just revenue.
Price change isolation
When a provider changes rates, your system should flag the margin impact without automatically changing customer prices.
Most off-the-shelf billing tools handle simple usage-based pricing well, but multi-provider, multi-unit, margin-aware billing for Voice AI platforms is a different problem entirely.
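To make decoupled pricing, margin monitoring, and price change isolation concrete, here is a minimal sketch that keeps provider costs and the customer rate in separate layers. The function names, the 10% margin floor, and all per-minute costs except the $0.0077 STT figure and the $0.08 bundled rate used earlier are assumptions for illustration.

```python
from decimal import Decimal

# Layer 1: what you pay providers, per minute. Only the STT figure appears
# in the article; the other costs are hypothetical.
provider_costs = {
    "stt": Decimal("0.0077"),
    "llm": Decimal("0.02"),
    "tts": Decimal("0.03"),
    "telephony": Decimal("0.01"),
}

# Layer 2: what you charge customers, configured independently.
customer_rate = Decimal("0.08")


def margin_pct(costs, rate):
    """Gross margin per minute as a percentage of the customer rate."""
    total_cost = sum(costs.values(), Decimal("0"))
    return (rate - total_cost) / rate * 100


def apply_provider_rate_change(costs, component, new_cost, rate, floor_pct):
    """Update one provider cost and report the margin impact.

    The customer rate is only read here, never modified: a provider price
    change raises an alert instead of repricing customers automatically.
    """
    updated = {**costs, component: new_cost}
    after = margin_pct(updated, rate)
    report = {
        "before_pct": margin_pct(costs, rate),
        "after_pct": after,
        "alert": after < floor_pct,
    }
    return updated, report


# Hypothetical event: the TTS provider raises its rate from $0.03 to $0.04.
updated_costs, report = apply_provider_rate_change(
    provider_costs, "tts", Decimal("0.04"), customer_rate, floor_pct=Decimal("10")
)
print(report)
```

With these made-up numbers the margin drops from 15.375% to 2.875%, which trips the alert while the customer rate stays at $0.08. Whether to reprice, renegotiate, or switch providers remains a business decision rather than a side effect of the billing code.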
Wrapping up
Pass-through pricing is the fastest way to launch. It's transparent, easy to explain, and low-risk when you're small.
But it trades short-term simplicity for long-term fragility: your margins depend on decisions you don't make, your invoices expose your supply chain, and your reconciliation becomes a monthly headache that grows with scale.
Platforms building billing infrastructure should make sure it supports both pass-through and bundled pricing, so they can move between the two smoothly while keeping margins under control as they scale.
Billing isn’t where you send invoices. It’s where you either protect your margins or slowly lose them without noticing.





























