MCP billing patterns: which one fits your tool
Flat subscription, per-call, tiered with hard quotas, tiered with overage, per-seat: the five pricing patterns we see across paid MCP servers, what each one is good and bad at, and how to pick.
Most pricing advice on the internet is written for SaaS products that humans log into and use. MCP servers are not that. The "user" is an LLM acting on behalf of a human, the marginal cost per call varies wildly, and the demand curve is a power law with a heavy tail.
This post catalogs the five pricing patterns we see across paid MCP servers, what each is good at, and when each one breaks.
1. Flat subscription
Shape: $X per month, unlimited or near-unlimited calls.
When it works:
- Your tool's marginal cost per call is near zero. A static knowledge base, a small computation, a cached lookup.
- The variance in usage across users is low. (Hobby project + small business + agency all use roughly the same number of calls per month.)
- Your audience is non-technical and would prefer a predictable bill over a "fair" bill.
When it breaks:
- A single power user makes 100x as many calls as the median user. Now your unit economics are broken on the heavy users and the light users are subsidizing them.
- Your costs scale with usage (paid API behind your tool, GPU time, expensive database). The first whale will wipe out your margin.
Verdict: Good default for cheap-to-run tools serving consistent users. Bad for anything with high cost variance.
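To see how quickly one whale breaks a flat plan, it helps to run the arithmetic. The sketch below uses made-up numbers (a $20/mo plan and $0.002 of marginal cost per call are assumptions, not benchmarks) and compares a median user with one making 100x as many calls.

```python
def monthly_margin(price: float, calls: int, cost_per_call: float) -> float:
    """Margin on one flat-rate subscriber for one month, in dollars."""
    return price - calls * cost_per_call


def break_even_calls(price: float, cost_per_call: float) -> float:
    """Number of calls at which a flat-rate subscriber stops being profitable."""
    return price / cost_per_call


PRICE = 20.00          # assumed flat monthly price
COST_PER_CALL = 0.002  # assumed marginal cost per call

median_user = monthly_margin(PRICE, calls=2_000, cost_per_call=COST_PER_CALL)
power_user = monthly_margin(PRICE, calls=200_000, cost_per_call=COST_PER_CALL)

print(f"median user margin: ${median_user:,.2f}")
print(f"power user margin:  ${power_user:,.2f}")
print(f"break-even at {break_even_calls(PRICE, COST_PER_CALL):,.0f} calls/month")
```

With these numbers the median user leaves you about $16 of margin and the power user costs you about $380, so it takes roughly two dozen median users to pay for one whale.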
2. Per-call usage billing
Shape: $0.0X per tool call, billed at month end via Stripe Meters or your own metering.
When it works:
- Each call has real, measurable cost.
- Your users are technical and understand metered billing intuitively (developers, ops teams).
- You want revenue to scale linearly with usage.
When it breaks:
- Non-technical users get a surprise bill and churn angry.
- LLM clients aggressively retry on errors. A bug that returns 500s could ring up real money on a user's card before you notice. (You will get a chargeback.)
- The cognitive overhead of "how much will this cost?" depresses your top-of-funnel adoption.
Verdict: Right for expensive tools serving sophisticated users. Wrong for consumer-grade tools or for anything where the per-call cost is unpredictable.
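The retry problem is the one worth guarding against in code. A minimal sketch, assuming the caller attaches a stable request ID to each logical call (so that a retry reuses the same ID) and that you only want to bill calls that succeeded. `UsageLedger` is a hypothetical in-memory stand-in, not Stripe's API; a real setup would forward the billed events to your metering backend.

```python
from dataclasses import dataclass, field


@dataclass
class UsageLedger:
    """In-memory stand-in for a metering backend (hypothetical, for illustration)."""

    price_cents_per_call: int = 2
    _billed: set = field(default_factory=set)

    def record(self, request_id: str, succeeded: bool) -> bool:
        """Bill a call at most once, and only if it succeeded.

        Returns True if this call was added to the bill.
        """
        if not succeeded:
            return False  # never charge for an error the client will retry
        if request_id in self._billed:
            return False  # a retry of a request that was already billed
        self._billed.add(request_id)
        return True

    def total_cents(self) -> int:
        return len(self._billed) * self.price_cents_per_call


ledger = UsageLedger()
ledger.record("req-1", succeeded=False)  # server returned a 500: not billed
ledger.record("req-1", succeeded=True)   # retry succeeded: billed once
ledger.record("req-1", succeeded=True)   # duplicate delivery: ignored
```

After those three calls the ledger holds exactly one billable event, so a burst of 500s and retries costs the user nothing extra.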
3. Tiered subscription with hard quotas
Shape: Hobby ($9/mo, 1k calls), Pro ($29/mo, 10k calls), Team ($99/mo, 100k calls). Calls beyond the quota are blocked.
When it works:
- You want predictability for users and protection for yourself.
- The quotas can be set such that 95% of users never come close to their cap.
- Crossing the cap signals real value, so users self-select into upgrading.
When it breaks:
- A user hits the cap on a Saturday and can't get work done until Monday. They cancel.
- The jump between tiers is too steep. Going from "1k calls for $9" to "10k calls for $29" roughly triples the price, so a user who needs 1,500 calls ends up paying for 10,000 and feels coerced.
Verdict: The model that converts best in practice for most MCP tools, especially if you build in a soft cap (see pattern 4 below).
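The enforcement side of a hard quota is small. A sketch using the tier limits from the example above; the `QuotaExceeded` exception and `check_quota` function are illustrative names, and where the `calls_used` counter lives (database, Redis, your billing provider) is left open.

```python
# Monthly call limits per tier, taken from the example tiers above.
TIER_LIMITS = {"hobby": 1_000, "pro": 10_000, "team": 100_000}


class QuotaExceeded(Exception):
    """Raised when a user has used up their monthly call allowance."""


def check_quota(tier: str, calls_used: int) -> int:
    """Return the number of calls remaining, or raise QuotaExceeded at the cap."""
    limit = TIER_LIMITS[tier]
    if calls_used >= limit:
        raise QuotaExceeded(f"{tier} plan limit of {limit:,} calls reached")
    return limit - calls_used


remaining = check_quota("hobby", calls_used=250)
```

Call `check_quota` before doing the expensive work, not after, so a blocked call costs you nothing.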
4. Tiered subscription with overage
Shape: Same as #3, but calls beyond the cap are billed per-call at a marked-up rate.
When it works:
- Heavy users get to keep using your tool when they spike.
- You capture more revenue from your top decile without forcing them onto enterprise contracts.
- Soft caps make the model forgiving — the user can't get "stuck" mid-task.
When it breaks:
- Overage billing makes invoices unpredictable. Users hate this even if they signed up for it.
- Disputes ("I didn't realize my agent was looping") are real and you'll eat refunds.
Verdict: Probably the best model if you're willing to handle the support load. Set a hard maximum overage per month so users feel safe.
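The capped-overage invoice can be sketched in a few lines. The Pro price and quota come from the tiers above; the overage rate of 1 cent per call and the $50 monthly overage cap are assumptions chosen for illustration. Amounts are in integer cents to avoid floating-point rounding.

```python
def invoice_cents(
    base_cents: int,
    included_calls: int,
    calls_used: int,
    overage_cents_per_call: int,
    max_overage_cents: int,
) -> int:
    """Monthly invoice: base price plus per-call overage, capped at a maximum."""
    extra_calls = max(0, calls_used - included_calls)
    overage = min(extra_calls * overage_cents_per_call, max_overage_cents)
    return base_cents + overage


PRO = dict(
    base_cents=2_900,          # $29/mo, from the tier example
    included_calls=10_000,     # from the tier example
    overage_cents_per_call=1,  # assumed marked-up rate
    max_overage_cents=5_000,   # assumed $50 cap on overage per month
)

quiet_month = invoice_cents(calls_used=8_000, **PRO)
spiky_month = invoice_cents(calls_used=12_000, **PRO)
runaway_agent = invoice_cents(calls_used=50_000, **PRO)
```

Under these assumptions the quiet month bills $29, the spiky month $49, and the runaway agent tops out at $79 instead of $429. What happens once the cap is reached (block, throttle, or keep serving) is a separate decision the sketch does not make for you.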
5. Per-seat / team licensing
Shape: $X per user per month, with a minimum.
When it works:
- Your tool is a productivity multiplier (a code assistant, a research tool, a writing aid) and your customers are companies.
- The buying decision is made by a manager, not by the end user.
- You sell into companies large enough that procurement matters.
When it breaks:
- Your buyer is the end user, not a manager. Per-seat reads as "complicated" and they bounce.
- Calls are made by agents on the user's behalf, and "how many seats do we have?" is ambiguous when one human runs three concurrent Claude tabs.
Verdict: Right model for enterprise sales motion, wrong for prosumer / dev-tool sales motion.
How to pick
Three questions, in order:
- Is my marginal cost per call material? If yes, you cannot use flat (#1). Skip to #2, #3, or #4.
- Are my users technical / sophisticated buyers? If no, avoid #2. Stick to #3 or #4. Predictable bills are non-negotiable for consumer/prosumer.
- Is my buyer the user or a manager? If a manager, consider #5. If the user, you want #3 or #4.
For the median MCP tool — non-trivial cost per call, prosumer audience, user is also the buyer — the answer is #4: tiered subscription with overage. Three tiers, generous quotas, capped overage to prevent runaway bills.
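If it helps to see the three questions as a filter, here is one literal reading of them as code. The function name and the exact encoding are assumptions: each answer only removes or adds candidates, and the final choice among what is left is still a judgment call.

```python
def candidate_models(
    cost_is_material: bool,
    users_are_technical: bool,
    buyer_is_manager: bool,
) -> list:
    """Apply the three questions in order and return the surviving pattern numbers."""
    candidates = {1, 2, 3, 4}
    if cost_is_material:
        candidates.discard(1)  # flat pricing cannot absorb usage-driven cost
    if not users_are_technical:
        candidates.discard(2)  # pure per-call billing surprises these users
    if buyer_is_manager:
        candidates.add(5)      # per-seat only makes sense with a manager buying
    return sorted(candidates)


# The "median MCP tool": material cost, prosumer audience, user is the buyer.
median_tool = candidate_models(
    cost_is_material=True,
    users_are_technical=False,
    buyer_is_manager=False,
)
```

For the median tool this leaves patterns 3 and 4, and the soft-cap argument above is what tips it toward 4.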
The detail nobody talks about: the upgrade path
The pricing model is half the story. The upgrade path — what happens when a user hits their cap — is the other half, and most people don't think about it at all.
A user installs your MCP server. They make 1,000 calls in their first week (this is normal for an excited new user). They hit the free-tier cap. What happens next determines whether you have a customer or a one-week tourist.
Bad path: Tool returns "403 Forbidden." LLM says "I encountered an error." User shrugs and moves on. You never hear from them.
Better path: Tool returns an error containing an upgrade URL. LLM pastes the URL. User clicks. The URL goes to a page that says "You've used 1,000 of 1,000 free calls — upgrade to Pro for $29/mo." User decides.
Best path: Same as above, but the URL is pre-authenticated. User clicks → Stripe Checkout opens → user pays → returns to the LLM and the next tool call works. No re-login, no friction, no opportunity to lose the user to a bug in your auth flow.
That last pattern is what we built Tollwicket to do by default — the decorator returns a pre-signed upgrade URL on quota exceeded.
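To make the "best path" concrete, here is a minimal sketch of a pre-signed upgrade URL built with an HMAC signature and an expiry. This is an illustration of the general technique, not Tollwicket's actual implementation. The secret, the `example.com` endpoint, the parameter names, and the error shape are all assumptions.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"replace-with-a-real-secret"        # assumption: server-side signing key
UPGRADE_BASE = "https://example.com/upgrade"  # hypothetical upgrade endpoint


def _sign(user_id: str, expires: int) -> str:
    message = f"{user_id}:{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def presigned_upgrade_url(
    user_id: str, ttl_seconds: int = 900, now: Optional[float] = None
) -> str:
    """Build an upgrade link that identifies the user without a login step."""
    issued_at = time.time() if now is None else now
    expires = int(issued_at + ttl_seconds)
    query = urlencode(
        {"user": user_id, "expires": expires, "sig": _sign(user_id, expires)}
    )
    return f"{UPGRADE_BASE}?{query}"


def verify_upgrade_url(url: str, now: Optional[float] = None) -> Optional[str]:
    """Return the user ID if the link is authentic and unexpired, else None."""
    params = parse_qs(urlparse(url).query)
    try:
        user_id = params["user"][0]
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return None
    expected = _sign(user_id, expires)
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None  # tampered or forged link
    current = time.time() if now is None else now
    if current > expires:
        return None  # link has expired
    return user_id


def quota_exceeded_error(user_id: str, used: int, limit: int) -> dict:
    """Structured error the tool returns so the LLM can surface the link."""
    url = presigned_upgrade_url(user_id)
    return {
        "error": "quota_exceeded",
        "message": f"You've used {used:,} of {limit:,} calls. Upgrade here: {url}",
        "upgrade_url": url,
    }
```

Putting the URL in the human-readable message as well as in its own field is deliberate: the LLM is more likely to relay it to the user if it appears in the text it is summarizing. The upgrade page would call `verify_upgrade_url` and, on success, start checkout for that user.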
The other detail nobody talks about: downgrade
What happens when a user's plan goes from "active" to "past_due" because their card expired? Most billing systems treat that as a hard cut. The user makes a tool call and gets an error. The user emails support. You spend an evening figuring out it was an expired card.
Better: when a subscription enters past_due, give the user 7–14 days of continued access while Stripe retries the card. The user gets an email. If the card isn't fixed by day 14, hard-block. This converts dramatically better than a hard cut, and Stripe's retry logic is genuinely good: most past_due subs recover on their own.
The third detail nobody talks about: refunds
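The grace-period rule is a short access check. A sketch, assuming you store the time the subscription first went past_due (here `past_due_since`, a field you would track yourself, for example from a webhook) and that you settle on the 14-day end of the range.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

GRACE_PERIOD = timedelta(days=14)  # assumed: the long end of the 7-14 day range


def has_access(
    status: str, past_due_since: Optional[datetime], now: datetime
) -> bool:
    """Allow active subscriptions, and past_due ones still inside the grace period."""
    if status == "active":
        return True
    if status == "past_due" and past_due_since is not None:
        return now - past_due_since <= GRACE_PERIOD
    return False  # canceled, unpaid, or anything else: blocked


went_past_due = datetime(2024, 1, 1, tzinfo=timezone.utc)
day_10 = went_past_due + timedelta(days=10)
day_15 = went_past_due + timedelta(days=15)
```

Here `has_access("past_due", went_past_due, day_10)` is true and the same check on `day_15` is false, which is the hard-block point.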
You will refund people. Some will be honest mistakes (signed up for the wrong tier), some will be agent-loop-induced overcharges, some will be users who just changed their mind. Decide your refund policy before the first one and write it down.
Our recommendation: full refund within 14 days, no questions asked. Pro-rate after that for the unused portion of the month. The number of fraudulent refund requests is tiny; the goodwill of a generous policy is large.
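Written as code, that policy looks like the sketch below. It assumes a 30-day billing period and rounds the pro-rated amount down to the whole cent; both are choices you would want to state in the written policy.

```python
FULL_REFUND_DAYS = 14


def refund_cents(
    amount_paid_cents: int, days_since_charge: int, days_in_period: int = 30
) -> int:
    """Full refund within 14 days, then pro-rated for the unused days."""
    if days_since_charge <= FULL_REFUND_DAYS:
        return amount_paid_cents
    unused_days = max(0, days_in_period - days_since_charge)
    return amount_paid_cents * unused_days // days_in_period


early = refund_cents(2_900, days_since_charge=3)   # inside the 14-day window
late = refund_cents(2_900, days_since_charge=20)   # 10 of 30 days unused
```

For a $29 Pro charge, a request on day 3 gets the full $29.00 back and a request on day 20 gets $9.66.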
What to do next
- Pick a model (probably #4).
- Set quotas that 90% of users will never hit and 10% will gracefully cross into paying.
- Build the upgrade path: structured errors, pre-signed URLs, fast checkout.
- Plan for past_due and refunds before you take your first dollar.
If the implementation feels like a lot of work — it is. Tollwicket packages all of the above (model #4, soft caps, pre-signed upgrade URLs, graceful past_due) behind one Python decorator.