May 7, 2026

Usage-based vs. subscription billing for AI tools: how to choose

The tradeoffs between usage-based and subscription pricing for MCP servers and AI tools, with a decision framework and three real-world scenarios.

If you've ever read an AWS bill or wired up Stripe Meters, you know usage-based billing. If you've ever subscribed to Spotify, you know subscription billing. Most discussions about which is "better" for AI products miss that the right answer depends on your cost structure, your buyer, and how much support load you can absorb.

This post is the short version of how to choose.

The two models, plainly

Subscription: Customer pays a fixed amount per time period for a defined scope of access. ($29/mo for up to 10,000 calls.)

Usage-based: Customer pays per unit consumed. ($0.005 per call, billed at end of month.)

Many real products are hybrids: a base subscription plus overage, or a small free tier plus pure usage above it. We'll get to hybrids.
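
To make the two definitions concrete, here is a minimal Python sketch using the example prices above. The function names are ours, for illustration only, and the sketch does not model what happens once a subscriber passes the 10,000-call limit.

```python
from decimal import Decimal

SUBSCRIPTION_PRICE = Decimal("29")   # $/month, from the subscription example
PRICE_PER_CALL = Decimal("0.005")    # $/call, from the usage-based example


def subscription_bill(calls: int) -> Decimal:
    """Flat price: the call count does not change the bill."""
    return SUBSCRIPTION_PRICE


def usage_bill(calls: int) -> Decimal:
    """Pay per unit consumed."""
    return PRICE_PER_CALL * calls


def break_even_calls() -> int:
    """Monthly call volume at which both models charge the same amount."""
    return int(SUBSCRIPTION_PRICE / PRICE_PER_CALL)
```

At these prices the two bills cross at 5,800 calls a month. Below that the usage-based customer pays less; above it the subscriber does.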

What each model is good at

Subscription is good at:

Predictability for the user. The user knows what their bill will be. This matters enormously for non-finance buyers.
Smoothing your revenue. You collect $29 from a user whether they make 100 calls or 9,000 calls. Your monthly revenue is far less volatile.
Conversion rate. Users sign up faster when the bill is bounded. "How much will this cost me?" has a one-line answer.
Low support load. Bill disputes are rare. Refund requests are predictable.

Usage-based is good at:

Fairness. A user making 100 calls pays 1/90th of what a user making 9,000 calls pays. Heavy users are not subsidizing light users.
Capturing variance. If your top decile uses 50x as much as the median, usage-based captures that revenue. Subscription with quotas captures only the tier difference.
Cost alignment. If your underlying costs are purely variable (a paid LLM behind your tool, a paid DB), usage-based keeps your margin constant.
Top-line growth scaling with adoption. As a user uses more, you earn more, without needing them to "upgrade."
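
The cost-alignment point is easy to check with arithmetic. The sketch below compares gross margin under the two example prices from earlier ($29/mo and $0.005 per call). The $0.002 cost per call is a made-up number for illustration, and the function names are ours.

```python
from decimal import Decimal

COST_PER_CALL = Decimal("0.002")     # hypothetical variable cost, not from any real product
USAGE_PRICE = Decimal("0.005")       # $/call, from the usage-based example
SUBSCRIPTION_PRICE = Decimal("29")   # $/month, from the subscription example


def usage_margin(calls: int) -> Decimal:
    """Gross margin under usage-based pricing; requires calls > 0."""
    revenue = USAGE_PRICE * calls
    cost = COST_PER_CALL * calls
    return (revenue - cost) / revenue


def subscription_margin(calls: int) -> Decimal:
    """Gross margin under a flat price with no quota enforced."""
    cost = COST_PER_CALL * calls
    return (SUBSCRIPTION_PRICE - cost) / SUBSCRIPTION_PRICE
```

Under these assumptions the usage-based margin is 60% at any volume. The subscription margin starts above 99% at 100 calls, falls to roughly 38% at 9,000 calls, and goes negative past 14,500 calls.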

What each is bad at

Subscription is bad at:

Heavy users. They eat your margin. Without quotas, enough usage can flip a single customer from profitable to underwater.
Cost variance. If your tool's cost per call swings 100x (cheap lookups vs. expensive computations), subscription gives one price for both — neither feels right.

Usage-based is bad at:

Surprise bills. This is the killer. A user runs an agent overnight, the agent loops, and they wake up to a $400 charge. They will dispute it. You will lose the dispute. You will refund.
Adoption friction. "How much will this cost?" has no clean answer. Users hesitate to even try.
Support load. Every line item is a potential support ticket.

The decision framework

Three questions:

Question 1: Is your marginal cost per call material?

If yes (you pay for a downstream API, GPU time, expensive DB queries), you cannot reasonably offer subscription with unlimited usage. You either need usage-based, or subscription with hard quotas.

If no (cached lookups, cheap computations, mostly storage costs), pure subscription is on the table.
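
If you go the hard-quota route, one rough way to size the quota is to work backward from the margin you want to keep. The sketch below does that arithmetic. The helper name `max_quota`, the $0.002 cost per call, and the 50% target margin are all hypothetical, and it assumes the worst case where a user consumes the entire quota.

```python
from decimal import Decimal, ROUND_FLOOR


def max_quota(price: Decimal, cost_per_call: Decimal, target_margin: Decimal) -> int:
    """Largest quota that still leaves target_margin if the user uses all of it."""
    # Portion of the subscription price we can afford to spend on serving calls.
    cost_budget = price * (1 - target_margin)
    calls = (cost_budget / cost_per_call).to_integral_value(rounding=ROUND_FLOOR)
    return int(calls)
```

With a $29 plan, a $0.002 cost per call, and a 50% target margin, `max_quota(Decimal("29"), Decimal("0.002"), Decimal("0.5"))` gives 7,250 calls. Setting the target margin to zero gives the break-even volume of 14,500 calls.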

Question 2: How sophisticated is your buyer?

Developers and ops teams understand metered billing intuitively. They live in AWS, GCP, Datadog — all metered. Selling them usage-based pricing is fine.

Consumers and prosumers don't. Selling them usage-based pricing is selling them anxiety. They will under-use your tool because they're afraid of the bill.

If your buyer is a developer at a real company, usage-based is fine. If your buyer is a side-project hacker or a non-technical end user, subscription is the only honest choice.

Question 3: Can you absorb support load?

Usage-based billing creates support load proportional to your user count. Bill disputes, refund requests, "why was this call $1.20," etc. This is real ongoing engineering and CX effort.

If you're a solo founder, this load will eat you. Don't choose pure usage-based unless you have the team for it.

The hybrid: subscription with capped overage

For most AI tools, the right model is hybrid: a tiered subscription with a generous quota, plus overage billing for users who blow past their tier, with a hard cap on overage per month.

Concretely:

Hobby: $9/mo, 1,000 calls, $0.02/call overage, capped at $30/mo overage.
Pro: $29/mo, 10,000 calls, $0.01/call overage, capped at $100/mo overage.
Team: $99/mo, 100,000 calls, $0.005/call overage, capped at $500/mo overage.

This gives you:

The conversion-friendliness of subscription (the user knows their floor).
The fairness of usage-based (heavy users pay more).
Bounded downside (the cap means no surprise $5,000 bill).
Bounded support load (most bills are predictable subscription amounts).

The overage cap is the secret. Without it, hybrid is just usage-based with extra steps.
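
The billing rule for the three tiers above fits in a few lines. This is a minimal sketch, not a reference implementation: the `Tier` class and `monthly_bill` function are names we made up for the example, and it leaves open what happens to calls made after the cap is hit (keep serving, throttle, or block), which is a product decision.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Tier:
    name: str
    base_price: Decimal      # $/month
    included_calls: int      # quota covered by the base price
    overage_rate: Decimal    # $/call beyond the quota
    overage_cap: Decimal     # maximum overage charge per month


# Numbers taken from the tier list above.
TIERS = {
    "hobby": Tier("Hobby", Decimal("9"), 1_000, Decimal("0.02"), Decimal("30")),
    "pro": Tier("Pro", Decimal("29"), 10_000, Decimal("0.01"), Decimal("100")),
    "team": Tier("Team", Decimal("99"), 100_000, Decimal("0.005"), Decimal("500")),
}


def monthly_bill(tier: Tier, calls: int) -> Decimal:
    """Base price plus overage, with the overage charge clamped at the cap."""
    extra_calls = max(0, calls - tier.included_calls)
    overage = min(tier.overage_rate * extra_calls, tier.overage_cap)
    return tier.base_price + overage
```

A Pro user at 12,000 calls pays $29 plus $20 of overage, or $49. The same user at 50,000 calls would owe $400 of uncapped overage, but the cap holds the bill at $129.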

Three scenarios

Scenario A: A research-paper search tool

Marginal cost per call: low (cached + small DB).
Buyer: academic / researcher (prosumer).
Solo founder, can't take much support load.

Verdict: Pure tiered subscription. No overage. The cost variance doesn't justify it. Size the free plan generously enough that hobbyists never pay, and price the paid tiers for active researchers.

Scenario B: An MCP server that runs heavy data analysis on user-uploaded files

Marginal cost per call: high (CPU time, sometimes GPU).
Buyer: dev or data analyst at a company (sophisticated).
Small team, can take some support load.

Verdict: Hybrid with capped overage. Three tiers, generous quotas, marked-up overage above the quota, and a monthly overage ceiling. This captures heavy-user revenue without scaring off light users.

Scenario C: A code-completion-style MCP server hitting a paid LLM provider

Marginal cost per call: very high (downstream LLM).
Buyer: developer (technical).
VC-backed startup, can take a lot of support load.

Verdict: Pure usage-based or hybrid with high cap. Cost is mostly variable, buyer is technical. Bill at a 2-3x markup on your LLM cost. Provide a dashboard showing real-time usage so users can self-police.
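
As a rough illustration of the markup arithmetic in Scenario C, the sketch below assumes "2-3x markup" means the price is two to three times your downstream cost for the call. The function names and the four-decimal rounding are our assumptions, not any provider's billing rules.

```python
from decimal import Decimal, ROUND_HALF_UP


def price_for_call(llm_cost: Decimal, multiplier: Decimal = Decimal("2.5")) -> Decimal:
    """Customer price for one call: downstream cost times the multiplier."""
    price = llm_cost * multiplier
    # Round to a hundredth of a cent so tiny per-call amounts are not lost.
    return price.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)


def gross_margin(multiplier: Decimal) -> Decimal:
    """Share of each dollar of revenue left after paying the LLM provider."""
    return 1 - 1 / multiplier
```

Under that reading, a 2x multiplier leaves a 50% gross margin and a 3x multiplier leaves about 67%, before any other costs.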

A note on free tiers

Whichever model you pick, the free tier sets the conversion ceiling. A free tier that lets users do real work converts much better than a 7-day trial, because users build a habit before they pay. Make the free tier generous enough to be useful, restrictive enough that "real" usage requires upgrading.

For our pricing at Tollwicket, the free tier is the entire SDK, fully functional, until the customer sends $500/mo through it. That's an unusual choice — we only start charging the MCP author once their tool is making real money. It works because our buyer is the author, and the author doesn't want to pay until they have customers.

What to do next

Answer the three questions above honestly.
Pick subscription, hybrid, or usage-based.
If hybrid, set the overage cap on day one. Don't promise to add it "later" — you won't.
Instrument: track free-to-paid conversion, ARPU by cohort, churn rate. Without these numbers you can't iterate.
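
For the instrumentation step, a first pass at the three numbers can be as simple as the sketch below. The `UserMonth` record and its fields are hypothetical, the sample data is made up, and the metric definitions are one common choice among several. It also assumes the input lists are non-empty.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class UserMonth:
    user_id: str
    cohort: str        # signup month, e.g. "2026-03"
    paid: bool         # on a paid plan during this month
    revenue: Decimal   # amount billed this month
    churned: bool      # paying user who cancelled during this month


def conversion_rate(users: list[UserMonth]) -> float:
    """Share of all users who are on a paid plan."""
    return sum(u.paid for u in users) / len(users)


def arpu_by_cohort(users: list[UserMonth]) -> dict[str, Decimal]:
    """Average revenue per user (free and paid) within each signup cohort."""
    revenue: dict[str, Decimal] = defaultdict(Decimal)
    count: dict[str, int] = defaultdict(int)
    for u in users:
        revenue[u.cohort] += u.revenue
        count[u.cohort] += 1
    return {cohort: revenue[cohort] / count[cohort] for cohort in revenue}


def churn_rate(users: list[UserMonth]) -> float:
    """Share of paying users who cancelled this month."""
    paying = [u for u in users if u.paid]
    return sum(u.churned for u in paying) / len(paying)


# Made-up sample data, for illustration only.
SAMPLE = [
    UserMonth("a", "2026-03", True, Decimal("29"), False),
    UserMonth("b", "2026-03", False, Decimal("0"), False),
    UserMonth("c", "2026-04", True, Decimal("9"), True),
    UserMonth("d", "2026-04", False, Decimal("0"), False),
]
```

On the sample data, half the users are paying, half of the paying users churned, and ARPU is $14.50 for the March cohort against $4.50 for April.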

If you want to skip the implementation work for any of these models, Tollwicket supports tiered subscription, hybrid with overage, and pure usage-based out of the box. Switching between them is a config change, not a re-integration.