Why Pass Through Pricing Leaves Voice AI Platforms With Margins They Can't Control

Q: What is pass-through pricing in voice AI and why do most platforms start with it?

Pass-through pricing means you take what your upstream providers charge you for STT, LLM, TTS, and telephony, stack those costs together, add a margin, and pass the total to your customer. Platforms start here because it carries almost zero cost risk and requires no upfront pricing analysis. If a provider raises rates, your customer absorbs the increase. If costs drop, their bill drops too. The tradeoff is that your margins are controlled entirely by your vendors, not by you, and that becomes a problem the moment any provider changes its pricing.

Q: Why do voice AI margins become unpredictable under pass-through pricing at scale?

Because you are aggregating four independent cost structures with different units, different rounding rules, and different pricing trajectories. A 2-cent rounding difference per call is invisible at 1,000 calls a month. At 100,000 calls it is $2,000. When OpenAI cuts token pricing by 90%, as happened with GPT-4o-mini, your pass-through revenue drops overnight even though your platform delivered the same value. Every upstream price change hits your top line directly, and you have no buffer to absorb it or time it on your own terms.

Q: How does pass-through pricing expose your provider stack to customers?

When your invoice shows separate line items for Deepgram, OpenAI, and ElevenLabs with their exact rates, customers can see your entire supply chain. They start asking why you chose one provider over another, whether they could get a cheaper option, or whether free alternatives like Whisper would work. Your provider selection turns from an internal optimization decision into a customer negotiation. Bundled pricing avoids this by showing one price per minute of conversation, keeping your provider choices invisible and your switching flexibility intact.

Q: What is the difference between pass-through and bundled pricing for voice AI platforms?

Pass-through ties your customer price directly to your provider costs. Every upstream rate change flows through to your invoice immediately. Bundled pricing sets a single per-minute rate that includes all provider costs. You absorb cost fluctuations internally and adjust margins on your own schedule. Bundled pricing gives you control over unit economics, simplifies invoices, and makes provider switches invisible to customers. The tradeoff is that you carry cost risk if provider prices spike. Most successful platforms start pass-through and migrate to bundled as they scale and gain confidence in their cost structure.

Q: What billing infrastructure lets a voice AI platform transition from pass-through to bundled pricing?

You need multi-dimensional metering that tracks minutes, tokens, characters, and API calls independently even if customers only see one metric. You need provider-agnostic cost tracking so switching from Deepgram to Whisper or ElevenLabs to PlayHT does not change customer invoices. You need decoupled pricing where what you pay providers and what you charge customers are separate configuration layers. And you need real-time margin monitoring at the per-customer and per-call level so you can see exactly where margin is compressing. Flexprice is built for this kind of multi-provider, multi-unit billing, letting voice AI platforms run pass-through and bundled models simultaneously and transition between them without a rebuild.

Mar 20, 2026

• 9 min read

Koshima Satija

Co-founder & COO, Flexprice

Building a Voice AI platform today means assembling a stack of specialized providers, each handling a different layer of the call. Deepgram or AssemblyAI for speech recognition, OpenAI or Anthropic for the language model, ElevenLabs or PlayHT for voice synthesis, Twilio or Telnyx for telephony.

The technology side of this has become remarkably approachable, but the pricing side has quietly become one of the more consequential decisions a Voice AI founder can make.

Most platforms start by charging customers based on what those upstream providers charge them, adding a margin on top and treating it as a solved problem. It is a reasonable place to begin.

What is less obvious at the start is how much of your margin control you are handing away in the process, and how much that decision compounds as your call volume grows.

In this piece, you will learn why that pricing structure creates problems that only become visible at scale, what it looks like when those problems hit, and how the platforms with the strongest margins have rethought the model from the ground up.

What is a pass-through pricing strategy?

Pass-through pricing is a strategy where you don't set your own price for the product. Instead, you let your input costs define your output price. You take what your vendors charge you, stack it up, add a margin on top, and that becomes what the customer pays.

The strategy behind this choice is risk avoidance. You never lose money on a single transaction because your price always floats above your cost. If Deepgram charges you more this month, your customer pays more. If OpenAI drops its token pricing, your customer's bill drops too. The margin stays constant in percentage terms, and you never get caught holding the bag on a cost spike you didn't see coming.

This is why it's the default for early-stage platforms. Pass-through lets you launch without doing the hard work of pricing.

But here's what makes it a strategy and not just a billing method: by choosing pass-through, you are making a deliberate decision about where margin control lives. And the answer is, it doesn't live with you. It lives with your providers.

Every pricing strategy involves a tradeoff between risk and control:

Fixed pricing means you eat cost fluctuations, but you control your revenue per unit.
Value-based pricing decouples your price from your costs entirely, but requires deep customer understanding.
Pass-through pricing carries almost zero cost risk, but hands control of your own unit economics to your providers.

For a platform passing through one vendor's cost, that tradeoff might be fine. But in Voice AI, you're passing through four independent cost structures with different units, different billing cycles, and different pricing trajectories, all compounding against each other every time any single provider makes a change.

Pass-through isn't a neutral starting point. It's a strategic position that quietly locks your margins to decisions other companies are making.

What pass-through pricing actually looks like in Voice AI

A single Voice AI call typically touches four or more upstream providers. Think about what happens when a customer's AI agent makes a phone call:

Telephony (Twilio, Telnyx, or Vonage) charges per minute for the phone connection.
Speech-to-Text (Deepgram, AssemblyAI, or Google Cloud Speech) charges per second or per minute of audio processed.
Large Language Model (OpenAI, Anthropic, or a fine-tuned model) charges per token for generating the response.
Text-to-Speech (ElevenLabs, PlayHT, or Amazon Polly) charges per character or per second of audio generated.

In a pass-through model, the orchestration platform adds up these four costs for every call and passes the total to the customer, usually with a markup.
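The per-call aggregation described above can be sketched in a few lines of Python. This is a minimal illustration, not a production billing routine: the `CallUsage` and `ProviderRates` types, the field names, and the choice of units (per minute, per second, per 1,000 tokens, per 1,000 characters) are all assumptions made for the example, and real provider units and rates will differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallUsage:
    """Raw usage for one call, in each provider's native unit (hypothetical shape)."""
    telephony_minutes: float
    stt_seconds: float
    llm_input_tokens: int
    llm_output_tokens: int
    tts_characters: int


@dataclass(frozen=True)
class ProviderRates:
    """Upstream rates in USD. Units are assumptions for this sketch."""
    telephony_per_minute: float
    stt_per_second: float
    llm_per_1k_input_tokens: float
    llm_per_1k_output_tokens: float
    tts_per_1k_characters: float


def pass_through_price(usage: CallUsage, rates: ProviderRates, markup: float) -> dict:
    """Sum the four upstream costs for a call and apply a percentage markup."""
    costs = {
        "telephony": usage.telephony_minutes * rates.telephony_per_minute,
        "stt": usage.stt_seconds * rates.stt_per_second,
        "llm": (
            usage.llm_input_tokens / 1000 * rates.llm_per_1k_input_tokens
            + usage.llm_output_tokens / 1000 * rates.llm_per_1k_output_tokens
        ),
        "tts": usage.tts_characters / 1000 * rates.tts_per_1k_characters,
    }
    total_cost = sum(costs.values())
    # The customer price floats directly on top of whatever the providers charge.
    return {"costs": costs, "total_cost": total_cost, "price": total_cost * (1 + markup)}
```

Note that the customer price is a pure function of the provider rates: change any field in `ProviderRates` and the invoice changes with it.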

On the surface, this feels clean and smooth because the customer sees exactly what they're paying for.

But the problem starts when you look underneath the invoice.

Why the billing complexity compounds

Each of those four providers bills differently. Twilio charges per minute in 60-second increments. Deepgram charges per second, with rounding rules that depend on the plan. OpenAI charges per token, with different prices for input and output tokens. ElevenLabs charges per character, with pricing that varies by voice model.

Your billing system now needs to:

Ingest four separate usage streams in real time, each with different units like minutes, seconds, tokens, and characters.
Normalize those units into a single customer-facing metric, which is usually per minute of conversation.
Apply provider-specific rounding rules: some round up, some truncate, some bill in fixed increments.
Reconcile your own records against four different provider invoices at the end of the month.
Handle mid-month price changes from any provider without breaking active customer contracts.

That last point is the painful one. If OpenAI drops the price of GPT-4o by 50%, your pass-through customers see an immediate reduction in their bill. Your revenue drops and your absolute margin shrinks with it, even though the decision was not yours.

When Deepgram introduces a new pricing tier, you need to figure out whether your existing customers automatically move to the new tier or stay on the old one. Your billing system has to support both scenarios simultaneously.
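The rounding rules from the list above are a good example of how small differences creep in. The helper below is a hypothetical sketch (the function name, the `mode` values, and the increments are assumptions, not any provider's documented behavior) showing how the same call duration yields different billable durations depending on the increment and rounding mode.

```python
import math


def billed_seconds(duration_seconds: float, increment_seconds: int, mode: str = "up") -> int:
    """Return the billable seconds for a call under a given increment and rounding mode.

    mode="up" rounds partial increments up; mode="truncate" drops them.
    """
    if increment_seconds <= 0:
        raise ValueError("increment_seconds must be positive")
    units = duration_seconds / increment_seconds
    if mode == "up":
        whole_units = math.ceil(units)
    elif mode == "truncate":
        whole_units = math.floor(units)
    else:
        raise ValueError(f"unknown rounding mode: {mode}")
    return whole_units * increment_seconds


# The same 61-second call under three hypothetical billing schemes:
per_minute = billed_seconds(61, 60)   # 120 billable seconds
per_six = billed_seconds(61, 6)       # 66 billable seconds
per_second = billed_seconds(61, 1)    # 61 billable seconds
```

A call that runs one second past the minute costs nearly twice as much on the 60-second scheme as on the per-second one, which is why the normalization step has to model each provider's rule explicitly.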

The margin problem nobody talks about

The hidden cost of pass-through that many companies miss isn't just billing complexity. It is margin volatility.

Consider a platform that charges $0.08 per minute of Voice AI conversation. Under a pass-through model, the per-minute breakdown looks like this:

Telephony: $0.013/min
Speech to text: $0.006/min
Large language model: $0.015/min
Text to speech: $0.010/min
Platform margin: $0.036/min

This looks healthy: the gross margin is 45%. But here's the bitter truth: none of those input costs is fixed, and you cannot control any of them.

For example, if OpenAI cuts prices again, as it has done three times in 18 months, your pass-through revenue simply drops.

This creates a strange dynamic: good news for the ecosystem (falling LLM costs) becomes bad news for your revenue. A pass-through platform's top line is directly tied to upstream pricing decisions it had no part in making.

A bundled pricing platform, by contrast, sets its own price for a "minute of conversation." It absorbs provider cost fluctuations internally and adjusts margins at its own pace.

When GPT-4o-mini launched at approximately 90% cheaper than GPT-4, bundled platforms quietly pocketed the margin improvement, while pass-through platforms watched their invoices shrink.
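To make the contrast concrete, here is a small worked sketch using the per-minute breakdown from the example above. The 50% LLM price cut is a hypothetical scenario, and the markup is simply whatever percentage reproduces the $0.08 price on day one; the function names are illustrative.

```python
# Per-minute costs in USD, taken from the example breakdown in this article.
COSTS = {"telephony": 0.013, "stt": 0.006, "llm": 0.015, "tts": 0.010}
BUNDLED_PRICE = 0.08  # USD per minute


def total_cost(costs: dict) -> float:
    return sum(costs.values())


def pass_through_margin(costs: dict, markup: float) -> float:
    """Pass-through: margin is a fixed percentage of whatever the costs are."""
    return total_cost(costs) * markup


def bundled_margin(costs: dict, price: float) -> float:
    """Bundled: price is fixed, so margin is whatever is left after costs."""
    return price - total_cost(costs)


# Markup that makes the pass-through price equal $0.08 at today's costs.
markup = BUNDLED_PRICE / total_cost(COSTS) - 1

# Hypothetical scenario: the LLM provider halves its price.
cheaper = {**COSTS, "llm": COSTS["llm"] * 0.5}

before = pass_through_margin(COSTS, markup)
after_pass_through = pass_through_margin(cheaper, markup)
after_bundled = bundled_margin(cheaper, BUNDLED_PRICE)
```

Under these assumptions both models start at $0.036 of margin per minute. After the cut, the pass-through margin falls to roughly $0.030 while the bundled margin rises to $0.0435, from the same upstream event.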

The compounding effect at scale

At 1,000 calls per month, pass-through billing seems manageable: reconciliation takes an afternoon, and small errors do not hurt much.

But when you deal with 100,000 calls per month, every upstream pricing quirk is amplified. A 2-cent rounding difference per call becomes $2,000 per month. A provider that switches from 6-second to 60-second billing increments mid-quarter reshapes your entire margin profile.

And because pass-through requires real-time cost aggregation across multiple providers, every scale milestone introduces a new failure mode, like:

Provider API latency affects cost calculation speed
Rate limit changes affect metering accuracy
Outages at one provider create billing gaps that need to be backfilled

The billing infrastructure for pass-through at scale isn't just metering plus markup. It's a real-time multi-source data pipeline with financial accuracy requirements. Most teams underestimate this until they're already at scale, and by then it has become technical debt in the billing system.

What happens when customers start asking questions

When your customer's invoice shows a line item such as "Deepgram STT: 4,231 minutes at roughly $0.0077/min," they start asking questions. Why Deepgram? Could we switch to AssemblyAI? We also saw that Whisper is free. Can we use that instead?

Suddenly, your billing system isn't just tracking costs. It's exposing your supply chain. Your customers know which providers you use and what those providers charge, and they can easily build a comparison sheet around you.

In a bundled model, the customer sees "4,231 minutes of Voice AI at $0.08/min." They don't know, and don't need to know, which STT provider sits underneath. Your provider choices become an internal optimization, not a customer negotiation.

This matters because the Voice AI infrastructure market is changing fast. Your billing system needs to handle provider switches without affecting customer invoices.

In a pass-through model, every provider swap is a customer-facing billing event.
In a bundled model, it is invisible.

The reconciliation nightmare

At the end of every month, pass-through platforms face a reconciliation challenge that most billing systems aren't designed to handle.

You billed your customer based on your internal metering of their usage. But each provider has its own metering, its own rounding, and its own invoice. The numbers almost never match exactly.

Deepgram might report 4,238 minutes where your system tracked 4,231. The difference looks small, but multiplied across hundreds of customers and four providers it adds up to meaningful discrepancies. Someone on your team has to figure out whether the gap is a rounding issue, a timezone boundary issue, a metering lag, or an actual billing error.

This reconciliation problem is specific to pass-through. Bundled platforms only reconcile their own metering against their own invoices: one system of record, not five.
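A first-pass reconciliation check can be as simple as comparing the two sets of totals and flagging anything outside a tolerance. The sketch below assumes both sides have already been reduced to a quantity per `(customer, provider)` key; the function name, the dictionary shape, and the 0.1% default tolerance are illustrative choices, not a recommended threshold.

```python
def reconcile(internal: dict, provider: dict, tolerance_pct: float = 0.1) -> list:
    """Compare internally metered quantities with provider-invoiced quantities.

    Both inputs map a (customer_id, provider_name) key to a quantity in the
    same unit. Returns the entries whose relative gap exceeds tolerance_pct.
    """
    discrepancies = []
    for key in sorted(set(internal) | set(provider)):
        ours = internal.get(key, 0.0)
        theirs = provider.get(key, 0.0)
        diff = theirs - ours
        base = max(abs(ours), abs(theirs))
        gap_pct = 0.0 if base == 0 else abs(diff) / base * 100
        if gap_pct > tolerance_pct:
            discrepancies.append(
                {"key": key, "internal": ours, "provider": theirs, "diff": diff, "gap_pct": gap_pct}
            )
    return discrepancies


# The 4,231 vs 4,238 minute gap from the text, plus one line that matches.
internal_minutes = {("acme", "deepgram"): 4231.0, ("acme", "twilio"): 4300.0}
provider_minutes = {("acme", "deepgram"): 4238.0, ("acme", "twilio"): 4300.0}
flagged = reconcile(internal_minutes, provider_minutes)
```

This only surfaces the gaps. Deciding whether a flagged line is rounding, a timezone boundary, metering lag, or a genuine error still requires per-provider investigation.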

How the smartest platforms are moving past pass-through

The orchestration platforms that maintain the healthiest margins tend to follow a pattern: they start with pass-through due to its transparency and speed-to-market, then migrate to bundled pricing as they scale.

The transition usually looks like this:

Phase 1: full pass-through

Every provider cost is visible on the customer invoice. This works when you have a small number of customers, and you're still optimizing your provider stack.

Phase 2: simplified pass-through

Here, you collapse four line items into one or two. Instead of showing Deepgram, OpenAI, ElevenLabs, and Twilio separately, you show something like "AI Processing" and "Telephony." This lets you absorb small fluctuations internally.

Phase 3: bundled pricing

This is the phase where you set your own per-minute rate that includes all provider costs. Cost optimization is handled internally, and the customer sees one metric, one price, and one invoice.

Each phase requires a different billing architecture:

Phase 1 needs multi-provider metering and cost aggregation.
Phase 2 needs cost grouping logic and margin buffers.
Phase 3 needs a decoupled billing system that separates customer-facing pricing from internal cost tracking.

The billing infrastructure that can handle all three phases without a rebuild is what separates teams that actually scale from the ones that hit a wall.

What this means for your billing architecture

If you're building a Voice AI platform today, the billing decision isn't really about pass-through versus bundled. It's about whether your billing system can support the transition from one to the other.

The requirements look like this:

Multi-dimensional metering

Track minutes, tokens, characters, and API calls independently, even if you only show one metric to customers.
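One way to picture this is an event stream where every record carries its own dimension, aggregated per customer. The `UsageEvent` shape and the dimension names below are hypothetical, chosen only to illustrate the idea.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    """One metered quantity for one call (hypothetical event shape)."""
    customer_id: str
    call_id: str
    dimension: str  # e.g. "telephony_minutes", "llm_input_tokens", "tts_characters"
    quantity: float


def aggregate(events: list) -> dict:
    """Total usage per customer and per dimension, keeping every unit separate."""
    totals = defaultdict(lambda: defaultdict(float))
    for event in events:
        totals[event.customer_id][event.dimension] += event.quantity
    return {customer: dict(dims) for customer, dims in totals.items()}


events = [
    UsageEvent("acme", "call-1", "telephony_minutes", 2.0),
    UsageEvent("acme", "call-1", "llm_input_tokens", 1200),
    UsageEvent("acme", "call-2", "telephony_minutes", 3.5),
    UsageEvent("acme", "call-2", "tts_characters", 900),
]
usage_by_customer = aggregate(events)
```

The customer-facing invoice might only ever use `telephony_minutes`, but the other dimensions remain available for internal cost and margin calculations.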

Provider-agnostic cost tracking

Switching from Deepgram to Whisper, or from ElevenLabs to PlayHT, should not change your customer invoices.

Decoupled pricing

Separate what you pay providers from what you charge customers. These should be independent configuration layers, not hardcoded formulas.
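As a rough sketch of what "independent configuration layers" can mean, the example below keeps provider costs and customer prices in two separate structures that never reference each other. The per-minute figures reuse this article's example breakdown; the provider assigned to each component, the plan name, and the cost used for the swapped STT provider are assumptions for illustration.

```python
from copy import deepcopy

# Internal cost layer: what you pay. Provider-to-cost mapping is hypothetical.
COST_LAYER = {
    "telephony": {"provider": "twilio", "cost_per_minute": 0.013},
    "stt": {"provider": "deepgram", "cost_per_minute": 0.006},
    "llm": {"provider": "openai", "cost_per_minute": 0.015},
    "tts": {"provider": "elevenlabs", "cost_per_minute": 0.010},
}

# Customer-facing price layer: what you charge. It never references providers.
PRICE_LAYER = {"standard": {"price_per_minute": 0.08}}


def customer_invoice(price_layer: dict, plan: str, minutes: float) -> float:
    """Invoice amount depends only on the price layer."""
    return price_layer[plan]["price_per_minute"] * minutes


def internal_cost(cost_layer: dict, minutes: float) -> float:
    """Cost depends only on the cost layer."""
    return sum(c["cost_per_minute"] for c in cost_layer.values()) * minutes


def swap_provider(cost_layer: dict, component: str, provider: str, cost_per_minute: float) -> dict:
    """Return a new cost layer with one component replaced; the original is untouched."""
    updated = deepcopy(cost_layer)
    updated[component] = {"provider": provider, "cost_per_minute": cost_per_minute}
    return updated


# Hypothetical swap: a cheaper STT provider changes cost, not the invoice.
swapped = swap_provider(COST_LAYER, "stt", "whisper", 0.004)
invoice = customer_invoice(PRICE_LAYER, "standard", 1000)
```

Because `customer_invoice` never reads the cost layer, a provider swap cannot leak into what the customer sees.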

Margin monitoring

Real-time visibility into per-customer, per-call margins, not just revenue.

Price change isolation

When a provider changes rates, your system should flag the margin impact without automatically changing customer prices.
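A minimal version of that flagging logic might look like the following. The function name, the return shape, and the 40% review threshold are hypothetical; the point is that the customer price is an input that is read but never modified.

```python
def margin_impact(
    price_per_minute: float,
    costs: dict,
    component: str,
    new_cost: float,
    min_margin_pct: float,
) -> dict:
    """Report how a provider rate change affects margin, without touching the price."""
    old_total = sum(costs.values())
    new_total = old_total - costs[component] + new_cost
    old_margin_pct = (price_per_minute - old_total) / price_per_minute * 100
    new_margin_pct = (price_per_minute - new_total) / price_per_minute * 100
    return {
        "component": component,
        "old_margin_pct": old_margin_pct,
        "new_margin_pct": new_margin_pct,
        # Flag for human review instead of repricing the customer automatically.
        "needs_review": new_margin_pct < min_margin_pct,
    }


# Hypothetical scenario: the TTS provider doubles its per-minute cost.
costs = {"telephony": 0.013, "stt": 0.006, "llm": 0.015, "tts": 0.010}
report = margin_impact(0.08, costs, "tts", new_cost=0.020, min_margin_pct=40.0)
```

In this scenario the margin drops from 45% to 32.5%, which falls below the assumed threshold and sets `needs_review`, while the $0.08 customer price stays exactly where it was.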

Most off-the-shelf billing tools handle simple usage-based pricing well, but multi-provider, multi-unit, margin-aware billing for Voice AI platforms is a different problem entirely.

Wrapping up

Pass-through pricing is the fastest way to launch. It's transparent, easy to explain, and low-risk when you're small.

But it trades short-term simplicity for long-term fragility because your margins depend on decisions you don't make, your invoices expose your supply chain, and your reconciliation becomes a monthly headache that grows with scale.

The billing infrastructure you build should support both pass-through and bundled pricing, so you can move between the two smoothly while keeping your margins under control as you scale.

Billing isn’t where you send invoices. It’s where you either protect your margins or slowly lose them without noticing.

Koshima Satija

Koshima Satija is the Co-founder of Flexprice, an open-source metering and billing platform built for the AI era. She’s deeply passionate about building products that simplify complex systems and empower teams to move faster with clarity and confidence.
