7 Billing Edge Cases That Have Already Broken Systems at Companies You Use Every Day

• 15 min read

Aanchal Parmar

Product Marketing Manager, Flexprice

7 Billing Edge Cases That Have Broken Companies' Systems

In December 2024, Patreon started double-charging subscribers who changed their tier. Not a handful of users. Everyone who made a tier change after that date got billed twice: once for the old tier, once for the full price of the new one, with no credit applied for the overlap.

Patreon published a statement acknowledging the problem and offered refunds within 60 days. The root cause: Apple forced a billing model change, and the new system didn't handle tier transitions the way the old one did. A platform that processes payments for millions of creators, run by an engineering team that has been doing this for years, got caught by a billing edge case that emerged from a single external dependency change.

That's the thing about billing edge cases. They don't announce themselves in development. They hide in normal code until a specific combination of conditions triggers them. A payment processor policy change. An enterprise customer who upgrades mid-cycle. A renewal batch job that fires within 300 milliseconds of a manual cancellation.

Below are seven billing edge cases we see break homegrown systems. Some have broken systems at companies with engineering teams far larger than most startups. Every incident has a real source. The technical reasons are specific. And none of them are rare once you have enough customers.

These Bugs Don't Show Up in Testing

There's a consistent pattern to how billing edge cases get discovered. A team builds the system, tests the happy path: subscribe, get charged, cancel, stop getting charged. Everything passes. The system ships.

Then the first support ticket arrives. "I upgraded my plan and the invoice looks wrong." Or "I have a credit balance but it's not being applied." Or the quietly terrifying one: "You charged me twice this month."

These aren't bugs that slip through code review. They're bugs that require production conditions to trigger. A mid-cycle upgrade. A timezone boundary. A simultaneous renewal and user-initiated plan change. Development environments almost never reproduce those conditions because you're not running billing cycles for tens of thousands of customers at specific moments in time.

The result is a category of bugs that engineering teams discover reactively, in production, after the wrong invoice has already reached a real customer.

1. Proration on Mid-Cycle Upgrades

In March 2024, a user redeemed 14 Claude Max gift codes worth $200 each. Total expected credit: $2,800. What the account showed: $1,400.

Each gift code redemption triggered Stripe's proration logic. Stripe calculated "unused time" on the current subscription and created a $0.00 invoice for that credit. The problem: Stripe computes proration based on amount paid, not face value. A gift code costs $0 at redemption. So the "unused time" credit Stripe calculated was also $0, and that $0 offset was applied against the gift credit already sitting in the account. Through Stripe's own proration logic, $1,400 evaporated without a single failed charge. GitHub issue #41499 has the full details.

That specific incident involves gift codes. But the underlying failure mode appears in any mid-cycle plan change.

What most homegrown proration implementations get wrong: proration is a three-pass calculation, not one. The first pass calculates time-based credits and charges for the interrupted period. The second pass applies discounts, which must run after proration or the discount affects the wrong base amount. The third pass calculates tax on the revised total. Reverse any two of those passes and the invoice is wrong. Run them correctly but calculate proration at invoice generation time rather than at the moment of plan change and the amount drifts.
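The pass ordering described above can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation: `prorate_upgrade`, `div_half_up`, and the basis-point parameters are hypothetical names, amounts are integer cents, and the sketch only covers upgrades (non-negative amounts). The `changed_at` timestamp is assumed to have been captured at the moment of the plan change rather than at invoice generation.

```python
from datetime import datetime, timezone


def div_half_up(numerator: int, denominator: int) -> int:
    """Integer division rounded half up (non-negative inputs only)."""
    return (2 * numerator + denominator) // (2 * denominator)


def prorate_upgrade(old_cents, new_cents, period_start, period_end,
                    changed_at, discount_bps=0, tax_bps=0):
    """Three passes in a fixed order. Amounts are cents, rates are basis points."""
    total_s = int((period_end - period_start).total_seconds())
    remaining_s = int((period_end - changed_at).total_seconds())

    # Pass 1: time-based credit for the old plan and charge for the new plan,
    # both measured from the moment of the plan change.
    credit = div_half_up(old_cents * remaining_s, total_s)
    charge = div_half_up(new_cents * remaining_s, total_s)
    subtotal = charge - credit

    # Pass 2: the discount applies to the prorated subtotal, not the list price.
    discount = div_half_up(subtotal * discount_bps, 10_000)

    # Pass 3: tax is computed last, on the discounted amount.
    taxable = subtotal - discount
    tax = div_half_up(taxable * tax_bps, 10_000)

    return {"credit": credit, "charge": charge, "discount": discount,
            "tax": tax, "total": taxable + tax}


utc = timezone.utc
invoice = prorate_upgrade(
    old_cents=3_000, new_cents=9_000,
    period_start=datetime(2025, 6, 1, tzinfo=utc),
    period_end=datetime(2025, 7, 1, tzinfo=utc),
    changed_at=datetime(2025, 6, 16, tzinfo=utc),  # stored when the change happened
    discount_bps=1_000,  # 10%
    tax_bps=800,         # 8%
)
print(invoice)
```

Swapping passes 2 and 3 in this sketch would tax the undiscounted subtotal, which is exactly the kind of ordering error that produces a plausible-looking but wrong invoice.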

Google Play ran into their own version of this in June 2025. Subscription upgrades on Android were awarding extra days of service instead of monetary proration credit, meaning customers who upgraded mid-cycle received less value than the plan terms described. Flutter's GitHub issue tracker has this in issue #171365.

The sequential upgrade scenario makes this harder still. A customer who upgrades on day 5, then upgrades again on day 20 of a 30-day cycle creates three distinct billing segments, each requiring its own rate applied to the usage ledger at a precise timestamp. A homegrown system that handles the single upgrade case doesn't automatically handle three.
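One way to handle any number of mid-cycle changes is to treat the period as a list of segments rather than special-casing a single upgrade. The sketch below is an assumption about how that could look: `segment_charges` is a hypothetical helper, prices are integer cents per full period, and each segment is rounded half up on its own for simplicity.

```python
from datetime import datetime, timedelta, timezone


def segment_charges(period_start, period_end, initial_price_cents, changes):
    """Split a billing period at each plan change and price every segment.

    `changes` is a list of (changed_at, new_price_cents) tuples in any order.
    Returns a list of (segment_start, segment_end, price_cents, amount_cents).
    """
    total_s = int((period_end - period_start).total_seconds())
    segments = []
    seg_start, price = period_start, initial_price_cents
    # A sentinel at period_end closes the final segment.
    for changed_at, new_price in sorted(changes) + [(period_end, None)]:
        seg_s = int((changed_at - seg_start).total_seconds())
        amount = (2 * price * seg_s + total_s) // (2 * total_s)  # round half up
        segments.append((seg_start, changed_at, price, amount))
        seg_start, price = changed_at, new_price
    return segments


start = datetime(2025, 6, 1, tzinfo=timezone.utc)
end = start + timedelta(days=30)
segments = segment_charges(
    start, end, 3_000,
    changes=[(start + timedelta(days=20), 9_000),   # second upgrade
             (start + timedelta(days=5), 6_000)],   # first upgrade
)
amounts = [amount for *_, amount in segments]
print(amounts)
```

With upgrades on day 5 and day 20, the 30-day period splits into three segments of 5, 15, and 10 days, each billed at its own rate.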

2. Timezone Handling Across Billing Periods

In September 2024, a Vimeo user was charged $240 on September 2nd for a subscription they were told ended September 3rd. They had planned to downgrade before renewal. The charge fired a full day early. That thread made the front page of Hacker News.

Vimeo never confirmed the root cause publicly. But the symptom is the exact fingerprint of a timezone boundary issue: the billing system's "midnight on September 3rd" is UTC. For most of the United States, that's still September 2nd. The customer's "September 3rd" is wherever they live. Both are real clocks. Only one matches what the customer read when they looked at their renewal date.

This is the structural problem with billing period boundaries for global SaaS products. Every timestamp in the system is UTC, which is correct for storage. But a customer reading "your plan renews on September 3rd" reads that as September 3rd in their timezone. If those two midnights are five hours apart, charges fire at moments customers don't expect.

The version that affects SaaS products most often isn't the subscription renewal date. It's credit and coupon expiry. "This discount expires January 31st" means 11:59pm local time to the customer. If the system expires it at UTC midnight instead, the cutoff lands at a different local moment for every customer. For a customer in Los Angeles, UTC midnight on February 1st is 4pm on January 31st local time. The coupon expired while it was still January 31st where they sit, eight hours before they expected.

The correct approach: billing period boundaries need to be computed in the customer's timezone, not the server's. Customer timezone is billing data, not a display preference. That distinction matters because changing a customer's billing timezone retroactively makes every historical invoice wrong.
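A minimal sketch of computing a boundary in the customer's timezone, using Python's standard `zoneinfo` module (which needs the system timezone database or the `tzdata` package). The function name `end_of_local_day_utc` and the idea of storing an IANA zone name per customer are assumptions for illustration.

```python
from datetime import date, datetime, time, timedelta, timezone
from zoneinfo import ZoneInfo


def end_of_local_day_utc(local_day: date, billing_tz: str) -> datetime:
    """UTC instant at which `local_day` ends in the customer's billing timezone."""
    next_local_midnight = datetime.combine(
        local_day + timedelta(days=1), time.min, tzinfo=ZoneInfo(billing_tz)
    )
    return next_local_midnight.astimezone(timezone.utc)


# "Expires January 31st" resolves to a different UTC instant per customer.
for tz in ("America/Los_Angeles", "Pacific/Auckland"):
    print(tz, end_of_local_day_utc(date(2025, 1, 31), tz).isoformat())
```

The boundary is still stored and compared in UTC. Only the step that turns a calendar date into an instant consults the customer's zone, which is why that zone has to be treated as billing data.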

3. Currency Rounding at Scale

No company publicly acknowledged a currency rounding bug in the past 12 months. That's not because rounding bugs are rare. It's because they're invisible until they're not.

Here's what a rounding bug looks like in practice: every invoice is off by one cent. Not in the same direction every time, but systematically, because the code rounds at the wrong step in the calculation. Finance catches it in month-end reconciliation. The discrepancy is $800 on $400K in revenue. Small enough to investigate slowly. Large enough to be real.

At $1M in MRR with 100,000 customers, a one-cent rounding error per invoice is $1,000 a month in systematic over- or under-collection. At $10M MRR it becomes a material accounting discrepancy. The code has been running this way since day one. Nobody noticed because the individual invoices look correct.

Shopify Engineering wrote about this problem directly in their post on "hanging pennies." Their conclusion after running billing at scale: use integer arithmetic throughout, store amounts in minor currency units (cents, not dollars), and apply a single rounding rule per invoice. The moment you round at the line item level, then sum those rounded amounts, then round the total, you've introduced a discrepancy that compounds silently across every billing cycle.
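The effect of rounding per line item versus once per invoice is easy to reproduce. This sketch is illustrative only: it assumes unit prices are stored as integers in thousandths of a cent, and `round_to_cents` is a hypothetical helper that rounds half up.

```python
def round_to_cents(thousandths_of_cent: int) -> int:
    """Round a non-negative amount in thousandths of a cent to whole cents, half up."""
    return (thousandths_of_cent + 500) // 1000


# (quantity, unit price in thousandths of a cent): each line is exactly 33.5 cents.
lines = [(67, 500), (67, 500), (67, 500)]

# Rounding every line, then summing: three separate half-cent round-ups.
rounded_per_line = sum(round_to_cents(qty * price) for qty, price in lines)

# Summing exactly, then rounding once for the invoice.
rounded_once = round_to_cents(sum(qty * price for qty, price in lines))

print(rounded_per_line, rounded_once)
```

The two totals differ by one cent on a three-line invoice. If the invoice must also display per-line amounts that add up to the total, the leftover cent needs an explicit allocation rule, which this sketch leaves out.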

The zero-decimal currency version is less subtle. Japanese yen has no minor unit, so JPY amounts have zero decimal places. Any billing system built assuming two decimal places will produce wrong totals for every JPY invoice. Salesforce has an open known issue for exactly this in their billing module. Any homegrown system with the same assumption will have the same problem, and it will surface the first time a Japanese enterprise customer signs up.
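A hedged sketch of making the number of decimal places a property of the currency instead of a constant. The exponents follow ISO 4217 (USD 2, JPY 0, KWD 3). The table is deliberately tiny, and `to_minor_units` and `format_minor_units` are hypothetical helper names.

```python
from decimal import Decimal

# Minor-unit exponents per ISO 4217. A real table would cover every supported currency.
CURRENCY_EXPONENT = {"USD": 2, "EUR": 2, "JPY": 0, "KWD": 3}


def to_minor_units(amount: str, currency: str) -> int:
    """Convert a decimal string such as '19.99' into integer minor units."""
    scaled = Decimal(amount).scaleb(CURRENCY_EXPONENT[currency])
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{amount} has too many decimal places for {currency}")
    return int(scaled)


def format_minor_units(minor: int, currency: str) -> str:
    """Render integer minor units with the currency's own number of decimals."""
    return str(Decimal(minor).scaleb(-CURRENCY_EXPONENT[currency]))


print(to_minor_units("19.99", "USD"), to_minor_units("1500", "JPY"))
print(format_minor_units(1999, "USD"), format_minor_units(1500, "JPY"))
```

A system that hardcodes `* 100` would store ¥1,500 as 150000 and display it a hundred times too large. Looking the exponent up per currency removes that assumption.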

4. Dunning Retry Logic

In December 2024 and into January 2025, GitHub Copilot had a billing problem with reactivated subscriptions. Users whose subscriptions had lapsed and been reactivated were still being charged $0.04 per premium request, as if they were on a pay-per-use plan. The monthly usage reset date was also drifting forward instead of resetting. The thread in GitHub Community Discussions ran for weeks.

The pattern underneath that: the system's dunning management sequence, the logic it runs after a payment failure, didn't include a step for "restore correct pricing on successful recovery." It handled "resume access." It didn't handle "recalculate which rate card applies" or "reset the usage counter from the correct date."

Most homegrown dunning implementations have three states: active, past-due, cancelled. A production billing system for any product with plan complexity needs more than that. A customer who updates their payment method mid-dunning needs a different transition than a customer whose retry succeeds automatically. A customer who goes past-due, gets suspended, waits three weeks, and then reactivates needs their billing period recalculated from the reactivation date, not just toggled back to whatever state the account was in before the payment failed.
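One way to make those transitions explicit is a table keyed by state and event, where each entry lists the follow-up work a transition requires. The states, event names, and side-effect labels below are hypothetical. The point of the sketch is only that "reactivated" and "retry succeeded" are different rows with different obligations.

```python
from enum import Enum


class State(Enum):
    ACTIVE = "active"
    PAST_DUE = "past_due"
    SUSPENDED = "suspended"
    CANCELLED = "cancelled"


# (current state, event) -> (next state, side effects the caller must run)
TRANSITIONS = {
    (State.ACTIVE, "payment_failed"): (State.PAST_DUE, ["schedule_retry"]),
    (State.PAST_DUE, "retry_succeeded"): (State.ACTIVE, ["restore_rate_card"]),
    (State.PAST_DUE, "payment_method_updated"): (State.PAST_DUE, ["retry_now"]),
    (State.PAST_DUE, "grace_period_ended"): (State.SUSPENDED, ["revoke_access"]),
    (State.SUSPENDED, "reactivated"): (
        State.ACTIVE,
        ["restore_rate_card", "start_new_billing_period", "reset_usage_counter"],
    ),
    (State.SUSPENDED, "retention_window_ended"): (State.CANCELLED, []),
}


def apply_event(state: State, event: str):
    """Return (next_state, side_effects), rejecting transitions nobody designed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition for {event!r} in state {state.value}") from None


print(apply_event(State.SUSPENDED, "reactivated"))
```

Raising on an unknown pair is a design choice: an unplanned transition becomes a loud error instead of an account silently toggled back to its previous state.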

The payment network layer adds a constraint most teams don't know about until they hit it. Starting January 2025, Mastercard charges $0.50 per retry attempt after the 35th retry in a 30-day window. Visa's retry limit is 15 attempts in 30 days. A homegrown system with a fixed retry schedule that doesn't track per-customer network attempt counts isn't just recovering failed payments poorly. It's paying network fees for every retry that exceeds those limits.

The real gap in most homegrown dunning implementations isn't the retry frequency. It's decline code classification. "Insufficient funds" means the card exists and the customer might have money later: retry in a few days. "Card revoked" means the card is gone: stop immediately, route to manual collection, don't retry again. "Do not honor" is ambiguous. Systems that treat all three the same will exhaust retries on unrecoverable failures and miss the timing window on recoverable ones.
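The classification step can be sketched as a lookup from decline code to a retry policy, with the per-network attempt cap checked first. Everything named here is an assumption: the code strings are placeholders rather than any processor's real codes, the delays are arbitrary, and `network_cap` is passed in by the caller.

```python
from datetime import timedelta
from enum import Enum
from typing import Optional


class Kind(Enum):
    RECOVERABLE = "recoverable"      # card exists, funds may arrive later
    UNRECOVERABLE = "unrecoverable"  # card is gone, retrying cannot succeed
    AMBIGUOUS = "ambiguous"          # issuer gave no usable reason


# Hypothetical decline labels; map your processor's actual codes here.
KIND_BY_CODE = {
    "insufficient_funds": Kind.RECOVERABLE,
    "card_revoked": Kind.UNRECOVERABLE,
    "do_not_honor": Kind.AMBIGUOUS,
}
MAX_AMBIGUOUS_ATTEMPTS = 2  # illustrative limit for retries on unclear declines


def plan_retry(decline_code: str, attempts_in_window: int,
               network_cap: int) -> Optional[timedelta]:
    """Return the delay before the next retry, or None to stop and escalate."""
    if attempts_in_window >= network_cap:
        return None  # never exceed the card network's retry limit
    kind = KIND_BY_CODE.get(decline_code, Kind.AMBIGUOUS)
    if kind is Kind.UNRECOVERABLE:
        return None
    if kind is Kind.AMBIGUOUS:
        if attempts_in_window >= MAX_AMBIGUOUS_ATTEMPTS:
            return None
        return timedelta(days=1)
    return timedelta(days=3)


print(plan_retry("insufficient_funds", 1, network_cap=15))
print(plan_retry("card_revoked", 0, network_cap=15))
```

Unknown codes fall into the ambiguous bucket here, so a new code from the processor gets a cautious retry budget rather than the full schedule.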

Stripe's own data puts recoverable failed payments at 20-40% of total failures with the right retry logic. Most homegrown systems recover 5-10% because they use a fixed schedule regardless of decline type.

5. Partial Refunds on Usage-Based Plans

In September 2024, a developer posted on Hacker News about receiving an $85,000 bill from Google Cloud for unexpected usage. The support ticket had been open for weeks with no resolution.

That incident isn't specifically about partial refund mechanics. But it shows what happens when usage-based billing goes wrong at scale: the correction process matters as much as the original charge. And for most homegrown billing systems, the correction process doesn't exist as a designed concept.

Partial refunds on metered billing plans are structurally different from partial refunds on flat subscriptions. Refunding a flat subscription is straightforward: remaining days divided by billing period, multiplied by rate, issue credit. Refunding usage requires answering a different set of questions: which specific events are being refunded, at what rate were they originally charged, do they belong to the current invoice or a prior one, and does the refund need to reverse the usage ledger or only issue a credit?

Most homegrown systems only implement the credit. They issue a credit note without touching the usage ledger. The money looks correct on the next invoice. But the customer's remaining balance, their usage history, and any quota enforcement are all working from stale data. A customer who gets a 50% usage refund still shows 100% consumed in the ledger, which means their next billing cycle calculates tier pricing and carry-forward from the wrong baseline.

The tiered pricing version of this is harder. A plan charging $0.01 per unit for the first 1,000 units and $0.005 per unit after that means the refund amount depends on which tier the refunded units were in. If a customer used 1,200 units and wants a refund on 300, the refund could be $2.00 or $3.00 depending on which tier those units came from: $2.00 if the last 300 units consumed are reversed (200 at $0.005 plus 100 at $0.01), $3.00 if all 300 are treated as first-tier units. Most refund flows don't track tier position. They apply a flat average rate and accept the inaccuracy.
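The arithmetic for the 1,200-unit example can be checked with a short sketch. It assumes prices are stored in mills (tenths of a cent) so that $0.005 is an integer, and it compares two hypothetical refund policies: reversing the most recently consumed units, and applying a flat average rate.

```python
# (tier size in units, price per unit in mills); None means "no upper bound".
# 10 mills = $0.01, 5 mills = $0.005.
TIERS = [(1_000, 10), (None, 5)]


def charge_mills(units: int) -> int:
    """Total charge for `units` under graduated tier pricing."""
    total, left = 0, units
    for size, price in TIERS:
        in_tier = left if size is None else min(left, size)
        total += in_tier * price
        left -= in_tier
        if left == 0:
            break
    return total


def refund_last_units(used: int, refunded: int) -> int:
    """Reverse the most recently consumed units, respecting tier position."""
    return charge_mills(used) - charge_mills(used - refunded)


def refund_flat_average(used: int, refunded: int) -> int:
    """Refund at the blended average rate, ignoring tier position."""
    return charge_mills(used) * refunded // used


print(refund_last_units(1_200, 300), refund_flat_average(1_200, 300))
```

Reversing the last 300 units gives 2,000 mills ($2.00), while the flat average gives 2,750 mills ($2.75). The gap between them is the inaccuracy a refund flow accepts when it does not track tier position.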

6. Credit Expiry With Rollover

The most documented billing edge case of the past 12 months isn't a system failure. It's a system that worked exactly as designed and still produced outcomes that felt like bugs.

In 2024 and 2025, dozens of OpenAI API users reported losing gifted credits with no warning. Individual cases ranged from $60 to over $900. In each case, the credits expired after one year, which is in OpenAI's terms of service. But the system sent no notification before expiry: no email, no dashboard indicator, no countdown, no alert. The credits were present, and then they were gone.

Multiple threads in OpenAI's developer community forums captured this, with users posting screenshots of their balances dropping to zero overnight.

The technical problem underneath this is credit pool management. A single flat credit balance is easy to implement and easy to break. When a customer has credits from multiple sources, a referral bonus expiring in 45 days, a promotional grant expiring in 90 days, and a monthly subscription rollover expiring in 13 months, a flat balance doesn't tell you which pool to draw from or when each portion expires.

The correct consumption order is expiry-first: always apply the soonest-to-expire credits before longer-lived ones. A system that processes credits FIFO (oldest issued first) may consume the 13-month rollover before the 45-day referral bonus, leaving the referral credits to expire unused. That's money the customer legitimately earned or paid for, lost to a consumption ordering decision that was never explicitly designed.
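Expiry-first consumption amounts to sorting live grants by expiry date before drawing from them. The sketch below assumes credits are tracked as separate grants rather than one flat balance. `CreditGrant` and `consume` are hypothetical names, and amounts are integer cents.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class CreditGrant:
    source: str
    remaining_cents: int
    issued_on: date
    expires_on: date


def consume(grants, amount_cents, today):
    """Draw `amount_cents` from the soonest-to-expire grants first.

    Returns (list of (source, cents_taken), cents still unpaid).
    """
    live = [g for g in grants if g.expires_on > today and g.remaining_cents > 0]
    applied = []
    # Ties on expiry fall back to issue date, so older grants go first.
    for grant in sorted(live, key=lambda g: (g.expires_on, g.issued_on)):
        if amount_cents == 0:
            break
        take = min(grant.remaining_cents, amount_cents)
        grant.remaining_cents -= take
        amount_cents -= take
        applied.append((grant.source, take))
    return applied, amount_cents


grants = [
    CreditGrant("rollover", 5_000, date(2025, 1, 1), date(2026, 2, 1)),  # issued first
    CreditGrant("promo", 2_000, date(2025, 1, 10), date(2025, 4, 10)),
    CreditGrant("referral", 1_000, date(2025, 1, 15), date(2025, 3, 1)),
]
applied, unpaid = consume(grants, 1_500, today=date(2025, 1, 20))
print(applied, unpaid)
```

A FIFO policy would have taken all 1,500 cents from the rollover grant because it was issued first. Sorting by `expires_on` uses the referral and promotional credits while they are still valid.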

Rollover caps add another layer. A plan that allows "unused credits to roll over up to 100% of monthly grant" requires knowing the grant amount, the consumed amount, and the rollover policy at period end, then creating a new credit tranche with a new expiry date from that calculation. If the rollover batch job runs after the expiry job, credits expire before rollover is calculated. That's a job ordering issue, not a logic issue, and it surfaces maybe twice a year when the month-end batch sequence runs in the wrong order.
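One way to remove the ordering hazard is to make rollover and expiry two steps of a single period-close function, so they cannot be scheduled independently. This is a sketch under that assumption: `close_period`, its parameters, and the tranche dictionary are all hypothetical.

```python
from datetime import date, timedelta


def close_period(grant_cents, consumed_cents, rollover_cap_pct,
                 period_end, rollover_ttl_days):
    """Close one billing period: roll over first, expire the remainder second.

    Returns (new_tranche or None, cents forfeited).
    """
    # Step 1: compute the rollover while the unused balance is still visible.
    unused = max(grant_cents - consumed_cents, 0)
    rolled = min(unused, grant_cents * rollover_cap_pct // 100)
    tranche = None
    if rolled > 0:
        tranche = {
            "source": "rollover",
            "cents": rolled,
            "expires_on": period_end + timedelta(days=rollover_ttl_days),
        }
    # Step 2: only now expire whatever did not qualify for rollover.
    forfeited = unused - rolled
    return tranche, forfeited


tranche, forfeited = close_period(
    grant_cents=10_000, consumed_cents=4_000, rollover_cap_pct=50,
    period_end=date(2025, 1, 31), rollover_ttl_days=30,
)
print(tranche, forfeited)
```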

Adobe's community forums documented a version of this in early 2026: a user received a bonus code in April 2025 and was told by support to wait before redeeming. The code expired in January 2026, before the user acted on the support advice. Following correct instructions from a support agent led to losing a benefit that had real monetary value.

7. Concurrent Subscription Modifications

This edge case requires the most specific timing to trigger, which is why it survives in codebases the longest without being detected.

In October 2024, a ChatGPT Workspace subscriber was charged $600 for duplicate billing. The user had subscribed via Apple's in-app purchase system and then through chatgpt.com. Both subscriptions were processed successfully. OpenAI now publishes a dedicated help article titled "How do I avoid being charged twice if I subscribe to ChatGPT on iOS/Android and the web." Publishing that article is its own kind of acknowledgment that this happens regularly.

The underlying failure is a race condition. Two subscription creation requests arrive close enough together that both read the same account state (no active subscription), both pass the validation check, and both charge the card. By the time either request writes its result to the database, the idempotency check has already passed for both.

A developer documented the exact same failure mode in a payment processor's open-source library in January 2025. Two concurrent events both found no existing payment record, both processed the charge, both created separate transactions. The fix required a row-level database lock on the account record before checking for existing payments, not an application-level check that could be bypassed by timing.
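The idea of moving the check into the database can be illustrated with SQLite from the Python standard library. SQLite has no row-level `SELECT ... FOR UPDATE`, so this sketch swaps in a different database-level guard: a partial unique index that lets only one request claim the account before any charge is made. The table layout, the `subscribe` function, and the `charge` callback are assumptions for illustration.

```python
import sqlite3


def open_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE subscriptions ("
        " id INTEGER PRIMARY KEY, account_id TEXT NOT NULL,"
        " plan TEXT NOT NULL, status TEXT NOT NULL)"
    )
    # At most one non-cancelled subscription per account, enforced by the database.
    conn.execute(
        "CREATE UNIQUE INDEX one_live_subscription"
        " ON subscriptions(account_id) WHERE status != 'cancelled'"
    )
    return conn


def subscribe(conn, account_id, plan, charge):
    """Claim the slot first; charge the card only if the claim succeeded."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO subscriptions (account_id, plan, status)"
                " VALUES (?, ?, 'pending')",
                (account_id, plan),
            )
    except sqlite3.IntegrityError:
        return False  # another request already holds the slot
    charge(account_id, plan)
    with conn:
        conn.execute(
            "UPDATE subscriptions SET status = 'active'"
            " WHERE account_id = ? AND status = 'pending'",
            (account_id,),
        )
    return True


conn = open_db()
charges = []
# Two requests that both believed the account had no subscription.
results = [subscribe(conn, "acct_1", "pro", lambda a, p: charges.append((a, p)))
           for _ in range(2)]
print(results, charges)
```

The second request fails at the insert, before it reaches the charge step. In a database that supports row locks, locking the account row before the existence check serves the same purpose.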

The harder version involves different event types racing each other. A customer clicks "upgrade" at the same moment the annual renewal batch job fires. The renewal charges the old plan price. The upgrade creates a proration invoice for the new plan. Both succeed. The customer has now been billed for a full renewal at the old price and a proration charge for the upgrade, simultaneously. One of those is wrong, and the system has no automatic mechanism to detect it.
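A common way to detect this kind of cross-event race is optimistic concurrency: every writer states which version of the subscription it read, and the write is rejected if the row has changed since. The sketch below uses SQLite and a hypothetical `version` column. It is one possible detection mechanism, not a description of how any of the companies above handle it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE subscriptions (id INTEGER PRIMARY KEY, plan TEXT, version INTEGER)"
)
conn.execute("INSERT INTO subscriptions VALUES (1, 'basic', 1)")
conn.commit()


def read(conn, sub_id):
    """Return (plan, version) as currently stored."""
    return conn.execute(
        "SELECT plan, version FROM subscriptions WHERE id = ?", (sub_id,)
    ).fetchone()


def commit_change(conn, sub_id, seen_version, new_plan):
    """Write only if nobody else has modified the row since it was read."""
    with conn:
        cur = conn.execute(
            "UPDATE subscriptions SET plan = ?, version = version + 1"
            " WHERE id = ? AND version = ?",
            (new_plan, sub_id, seen_version),
        )
    return cur.rowcount == 1


# The upgrade request and the renewal job both read version 1.
_, version_seen_by_upgrade = read(conn, 1)
plan_seen_by_renewal, version_seen_by_renewal = read(conn, 1)

upgrade_ok = commit_change(conn, 1, version_seen_by_upgrade, "pro")
# The renewal re-writes the plan it saw, standing in for "mark period renewed".
renewal_ok = commit_change(conn, 1, version_seen_by_renewal, plan_seen_by_renewal)
print(upgrade_ok, renewal_ok, read(conn, 1))
```

The renewal's write is rejected, which tells the job it priced the period from stale state and must re-read the subscription before charging.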

Patreon's December 2024 incident had elements of this. The migration from Apple's old billing model to the new one created a window where tier changes and billing cycle events conflicted. The company acknowledged it and opened a 60-day refund window.

The reason this bug survives code review is that it requires timing to reproduce. In testing, requests don't arrive within 300 milliseconds of each other. Renewal jobs don't fire at the same moment a user is mid-plan-change. The bug lives in the gap between "works in staging" and "fails in production at 2am when the batch job runs."

What Happens When These Stack

Each of the seven edge cases above is individually fixable. Correct proration handling: two to three engineering sprints. Timezone-aware billing boundaries: two sprints. A proper credit expiry ledger: two sprints. Handle all seven correctly and you've added roughly a quarter of engineering work to your billing system.

Here's the part that never makes the sprint estimate: these edge cases interact.

A customer in Tokyo who upgrades their annual plan at 11:58pm on the last day of their billing period is triggering proration and timezone handling at the same time. The proration calculation needs to know precisely when the billing period ends, which requires knowing the customer's timezone. If those two subsystems were built separately by different engineers at different times, they may not agree on what time "now" is for that customer.

Add a credit wallet. The upgrade should consume existing credits before applying the new charge, but the credits have a different expiry date than the new billing period. Which runs first, the credit application or the proration calculation? The answer changes the invoice amount.

Add a concurrent modification because the upgrade request arrives 200 milliseconds before the monthly batch renewal job. Now there's a race condition on top of a proration-and-timezone calculation, with credits involved.

Any single one of these, handled correctly, is a few sprints of work. All four, integrated with each other so they share the same assumptions about time, state, and ordering, is a sustained quarter of engineering. Not because the individual problems are complex, but because the interactions between them require every part of the billing system to make the same decisions about what a "billing moment" means.
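One hedged way to force that agreement is to resolve the moment once per request and pass the same immutable value to every subsystem, instead of letting each one call the clock. `BillingMoment` and the two helper functions below are hypothetical names used only to illustrate the idea.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class BillingMoment:
    """A single agreed instant and period boundary for one billing operation."""
    at_utc: datetime
    period_end_utc: datetime  # already resolved from the customer's timezone


def seconds_left_in_period(moment: BillingMoment) -> int:
    """Input for proration: time remaining, measured from the shared instant."""
    return int((moment.period_end_utc - moment.at_utc).total_seconds())


def credit_is_live(moment: BillingMoment, credit_expires_utc: datetime) -> bool:
    """Input for the credit wallet: judged against the same shared instant."""
    return moment.at_utc < credit_expires_utc


utc = timezone.utc
# 11:58pm in Tokyo on June 30th is 14:58 UTC; the period ends at Tokyo midnight.
moment = BillingMoment(
    at_utc=datetime(2025, 6, 30, 14, 58, tzinfo=utc),
    period_end_utc=datetime(2025, 6, 30, 15, 0, tzinfo=utc),
)
print(seconds_left_in_period(moment))
print(credit_is_live(moment, datetime(2025, 6, 30, 15, 0, tzinfo=utc)))
```

Because the dataclass is frozen, neither subsystem can adjust the instant for its own purposes, so proration and credit application cannot disagree about what "now" was.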

The Pattern Worth Noticing

None of the incidents in this post happened at startups cutting corners on billing. Anthropic, Patreon, OpenAI, GitHub, Google, Vimeo, Adobe. These are engineering organizations with dedicated teams and years of production experience.

They got caught by billing edge cases because billing edge cases are specifically the kind of problem that hides in working code until a precise combination of conditions triggers it. The code wasn't wrong in the sense that it failed tests. It was wrong in the sense that it didn't account for the scenario.

Teams building billing in-house don't set out to build systems that fail on these. They build the visible 20%, ship it, and find the other 80% in production. The question isn't whether these edge cases will surface. It's whether you want to be debugging proration interactions with concurrent modifications at 2am, or building the product that generates the revenue you're billing for.

If you're pressure-testing a homegrown billing system against these scenarios before they surface, we're happy to walk through the architecture. No pitch, just an honest look at where the gaps usually are.

Frequently Asked Questions

What causes proration bugs in subscription billing systems?

Why does dunning management fail in homegrown billing systems?

How do concurrent subscription changes cause double billing?

Why do API and subscription credits expire without warning?

What is the most common billing edge case that breaks homegrown systems at scale?

Aanchal Parmar

Aanchal Parmar heads content marketing at Flexprice.io. She has worked in content for seven years across SaaS, Web3, and now AI infra. When she’s not writing about monetization, she’s either signing up for a new dance class or testing a recipe that’s definitely too ambitious for a weeknight.
