Tollwicket

How to monetize an MCP server: a practical guide

Step-by-step playbook for charging money for MCP tools — pricing models, auth, quotas, billing infrastructure, and the failure modes most authors hit.

You built an MCP server. People use it. Now you want to get paid.

This post is the guide we wished existed when we started building Tollwicket. It is opinionated — there are several ways to monetize an MCP server, and we will spend most of the post explaining when each approach makes sense and where it breaks.

What "monetizing an MCP server" actually means

A Model Context Protocol (MCP) server exposes tools that LLM clients (Claude, Cursor, Goose, VS Code, Continue) can call on behalf of a user. When you charge for that server, you are charging the end user of the LLM — not the LLM vendor.

That distinction matters. It rules out a few approaches that intuitively seem reasonable:

  • You cannot bill Anthropic / OpenAI / Cursor. They do not have payment rails for tool authors and have shown no signs of building any. Trying to insert yourself into their billing is a dead end.
  • You cannot reliably charge "per token" the way an LLM provider does. MCP does not expose the LLM's token usage to your server. You can charge per tool call, per unit of work your tool produces, or per subscription seat, but not per LLM token.
  • You cannot assume a stable user identity from the client alone. MCP clients pass a transport-level connection, not an authenticated user. You have to do auth yourself.

Once you accept those three constraints, the problem narrows down: you need to (1) identify the user, (2) decide what to charge for, (3) gate access on payment, and (4) collect the money. The rest of this post walks through each.

Step 1: Identify the user

The most common mistake is skipping this step and trying to bolt auth on later. Don't. Build identity first.

The cleanest pattern we've seen is OAuth-on-first-use:

  1. User installs your MCP server in their Claude / Cursor / etc. config.
  2. First tool call returns an error containing a one-time URL.
  3. User clicks the URL, signs in with Google or GitHub, and the page issues a credential that they add to their MCP server config.
  4. Subsequent tool calls carry that credential as a header or argument.

This works because MCP clients support arbitrary headers and the user has a browser at hand. It does not work for fully headless setups (CI, agents), but those are a small minority of real users and you can add machine tokens later.
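
The four steps above can be sketched in a few lines. This is a minimal illustration, not a production flow: the `AuthStore` class, the `handle_tool_call` helper, and the `auth.example.com` URL are all hypothetical names, and an in-memory dict stands in for a real database and a real OAuth provider.

```python
import secrets
from dataclasses import dataclass, field
from typing import Optional

SIGN_IN_BASE = "https://auth.example.com/connect"  # hypothetical sign-in page


@dataclass
class AuthStore:
    """In-memory stand-in for a real database (assumption for this sketch)."""
    pending: dict = field(default_factory=dict)      # one-time code -> install id
    credentials: dict = field(default_factory=dict)  # credential -> user id

    def start_sign_in(self, install_id: str) -> str:
        """Step 2: mint a one-time URL to hand back in the tool error."""
        code = secrets.token_urlsafe(16)
        self.pending[code] = install_id
        return f"{SIGN_IN_BASE}?code={code}"

    def finish_sign_in(self, code: str, user_id: str) -> str:
        """Step 3: called by the sign-in page once Google/GitHub login succeeds."""
        self.pending.pop(code)  # raises KeyError if the code was already used
        credential = secrets.token_urlsafe(32)
        self.credentials[credential] = user_id
        return credential

    def identify(self, credential: Optional[str]) -> Optional[str]:
        """Step 4: map the credential on a tool call back to a user."""
        return self.credentials.get(credential) if credential else None


def handle_tool_call(store: AuthStore, install_id: str,
                     credential: Optional[str]) -> dict:
    """Return the user for a known credential, or a sign-in prompt otherwise."""
    user_id = store.identify(credential)
    if user_id is None:
        url = store.start_sign_in(install_id)
        return {"ok": False,
                "error": f"Sign-in required. Open {url} to connect your account."}
    return {"ok": True, "user_id": user_id}
```

A real implementation would also expire unused one-time codes and let users revoke credentials; both are left out here to keep the shape of the flow visible.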

If you do not want to build the OAuth flow yourself, this is exactly the kind of thing where a billing SDK like Tollwicket can save you a week of work — the SDK ships an auth.yourdomain.com flow you point users at.

Step 2: Decide what to charge for

Pricing is the part most engineers underthink. There are four common models:

A. Flat monthly subscription

Simplest. User pays $9/mo, gets unlimited (or large-quota) access. Best when:

  • Your tool's marginal cost per call is near zero (a cached lookup, a small computation).
  • Your audience is non-technical and prefers predictable bills.
  • You want to maximize conversion rate over revenue per user.

B. Per-call usage

User is metered per tool call. Best when:

  • The unit of work is expensive (paid API behind your tool, GPU inference, large database queries).
  • Power users would dramatically out-consume average users on a flat plan.
  • You can absorb the support overhead of "why is my bill $42 this month?"

C. Tiered subscription with quotas

Hybrid: flat fee, with a hard or soft cap. E.g., "$15/mo for 10,000 calls, then 1¢ per call." This is the model that converts best for most MCP tools in practice. It gives users predictability on the low end and a graceful upgrade path on the high end.
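
To make the arithmetic concrete, here is a small sketch of the bill under the example above ($15/mo for 10,000 calls, then 1¢ per call). The function name and parameters are ours for illustration; amounts are kept in integer cents to avoid floating-point rounding.

```python
def monthly_bill_cents(calls: int,
                       base_cents: int = 1500,
                       included_calls: int = 10_000,
                       overage_cents_per_call: int = 1) -> int:
    """Flat fee plus per-call overage beyond the included quota."""
    overage_calls = max(0, calls - included_calls)
    return base_cents + overage_calls * overage_cents_per_call


print(monthly_bill_cents(4_000))    # 1500 -> $15.00, under the quota
print(monthly_bill_cents(12_500))   # 4000 -> $40.00, 2,500 overage calls
```

A user who stays under the quota always pays $15, and the bill grows linearly only past 10,000 calls.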

D. One-time / lifetime

Pay $49 once, use forever. Rarely the right choice for an MCP server — server costs are recurring, but revenue is not. Use this only if your "server" is actually a static dataset or a one-shot computation.

Our recommendation for most MCP authors: start with model (C), tiered subscription with quotas. Pick three tiers, set the free tier generous enough that hobbyists never pay, and make the second tier cheap enough that any real user upgrades to it.

Step 3: Gate access on payment

This is the technical heart of the problem. Every tool call needs to check:

  1. Is the user authenticated?
  2. Does their plan allow this tool?
  3. Do they have quota remaining?
  4. If not, what error do we return?

The fourth question is more important than it looks. The error you return is the LLM's only signal to ask the user to upgrade. If your error is generic ("Internal error"), the LLM will typically retry or make something up. If it is specific ("Quota exceeded for tool search_papers. Upgrade at https://you.com/upgrade"), the LLM can relay the URL to the user, who can then click it.
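
As an illustration of a specific error, the sketch below builds a result shaped loosely like an MCP tool result (an `isError` flag plus text content). The helper name and the `code`, `used`, and `limit` fields are assumptions made for this example, not part of any SDK.

```python
import json


def quota_exceeded_error(tool: str, upgrade_url: str, used: int, limit: int) -> dict:
    """Build an error the LLM can read and relay, plus machine-readable details."""
    message = (f"Quota exceeded for tool {tool} ({used}/{limit} calls used). "
               f"Upgrade at {upgrade_url}")
    details = {
        "code": "quota_exceeded",   # hypothetical error code
        "tool": tool,
        "used": used,
        "limit": limit,
        "upgrade_url": upgrade_url,
    }
    return {
        "isError": True,
        "content": [
            {"type": "text", "text": message},              # for the LLM and user
            {"type": "text", "text": json.dumps(details)},  # for programmatic clients
        ],
    }
```

The human-readable sentence comes first because that is what the LLM is most likely to quote back to the user.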

The pattern we use in our SDK looks like this:

```python
from mcp_billing import billed_tool

@billed_tool(
    plan="pro",
    price_per_call=0.01,
    daily_quota_free=100,
)
async def search_papers(query: str) -> list[dict]:
    # _do_the_search is a placeholder for your own search implementation.
    return await _do_the_search(query)
```

The decorator handles the four checks above and returns a structured upgrade URL when the user is out of quota. You can build the same thing yourself; the work is straightforward, just tedious.
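
If you go the build-it-yourself route, a hand-rolled version might look roughly like the sketch below. It is not the Tollwicket implementation: `gated_tool`, `BillingError`, the plan table, and the in-memory counter are all hypothetical, and a real server would keep plans and usage in a database and reset counters daily.

```python
import asyncio
import functools
from collections import defaultdict

PLAN_RANK = {"free": 0, "pro": 1}             # hypothetical plan ladder
USER_PLANS = {"alice": "pro", "bob": "free"}  # stand-in for an entitlements table
DAILY_QUOTA = {"free": 2, "pro": 1000}        # hypothetical per-plan daily limits
_calls_today = defaultdict(int)               # user id -> calls made today


class BillingError(Exception):
    """Raised with a message the LLM can relay to the user."""


def gated_tool(min_plan: str, upgrade_url: str = "https://example.com/upgrade"):
    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(user_id, *args, **kwargs):
            # Check 1: is the user authenticated?
            plan = USER_PLANS.get(user_id) if user_id else None
            if plan is None:
                raise BillingError("Sign-in required.")
            # Check 2: does their plan allow this tool?
            if PLAN_RANK[plan] < PLAN_RANK[min_plan]:
                raise BillingError(
                    f"{fn.__name__} requires the {min_plan} plan. "
                    f"Upgrade at {upgrade_url}")
            # Checks 3 and 4: is quota left, and if not, what do we say?
            if _calls_today[user_id] >= DAILY_QUOTA[plan]:
                raise BillingError(
                    f"Quota exceeded for tool {fn.__name__}. "
                    f"Upgrade at {upgrade_url}")
            _calls_today[user_id] += 1
            return await fn(*args, **kwargs)
        return wrapper
    return decorator


@gated_tool(min_plan="free")
async def search_papers(query: str) -> list[dict]:
    return [{"title": f"Result for {query}"}]  # placeholder result
```

Calling `asyncio.run(search_papers("bob", "mcp"))` succeeds twice and then raises `BillingError` on the third call, because the free plan in this sketch allows two calls per day.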

Step 4: Collect the money

Three choices here, in order of how much work they are:

Use a billing SDK (least work)

Tollwicket, Stigg, Orb, and a few others provide drop-in billing layers. You bring your own Stripe account; they handle plans, quotas, webhooks, and the upgrade UX. Cost: a SaaS fee (often free below some revenue threshold).

Build directly on Stripe (medium work)

Create products and prices in your Stripe dashboard, build a Checkout flow, listen to webhooks to sync subscription state, store entitlements in your own database, and write the quota-check code yourself. Cost: ~2 weeks of engineering plus ongoing maintenance every time Stripe changes their API.
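
The webhook-to-entitlements part of that work can be sketched as follows. The event type names and the `data.object` layout follow Stripe's subscription events, but the Price IDs, the `entitlements` dict, and the function name are hypothetical. The sketch takes an already-parsed event dict; a real handler should first verify the webhook signature (for example with `stripe.Webhook.construct_event`).

```python
# Hypothetical mapping from your Stripe Price IDs to plan names.
PRICE_TO_PLAN = {"price_pro_monthly": "pro", "price_team_monthly": "team"}
PAID_STATUSES = {"active", "trialing"}
HANDLED_EVENTS = {
    "customer.subscription.created",
    "customer.subscription.updated",
    "customer.subscription.deleted",
}

entitlements = {}  # Stripe customer id -> plan name (stand-in for a database)


def sync_entitlement(event: dict) -> None:
    """Update the stored plan for a customer from one subscription event."""
    if event["type"] not in HANDLED_EVENTS:
        return  # ignore events this sketch does not care about
    subscription = event["data"]["object"]
    customer_id = subscription["customer"]
    cancelled = event["type"] == "customer.subscription.deleted"
    if cancelled or subscription["status"] not in PAID_STATUSES:
        # Simplification: any non-paying state drops straight to the free plan.
        entitlements[customer_id] = "free"
        return
    price_id = subscription["items"]["data"][0]["price"]["id"]
    entitlements[customer_id] = PRICE_TO_PLAN.get(price_id, "free")
```

Your quota-check code then reads from `entitlements` (or its database equivalent) instead of calling Stripe on every tool call.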

We wrote a full guide on adding Stripe directly to an MCP server if this is the path you want to take.

Build a custom billing stack (most work)

Roll your own payments, store cards yourself, handle PCI compliance. Almost never the right call. Mentioned only for completeness.

The failure modes nobody warns you about

We've watched dozens of MCP authors ship paid servers. The patterns of failure are remarkably consistent:

Failure mode 1: Conflating identity and billing. A user installs your server on two machines. Are those two installs the "same user" for quota purposes? You need to decide before users start asking. The simplest answer is: identity is the OAuth account, and quotas roll up to identity. Devices are irrelevant to billing.
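
In code, "quotas roll up to identity" just means the usage counter is keyed by account, never by device. A tiny sketch, with hypothetical names:

```python
from collections import Counter

calls_by_account = Counter()  # account id -> calls this period


def record_call(account_id: str, device_id: str) -> int:
    """Count a call against the account; device_id is useful for logs only."""
    calls_by_account[account_id] += 1
    return calls_by_account[account_id]


record_call("github:alice", "laptop")
print(record_call("github:alice", "desktop"))  # 2: both installs share one quota
```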

Failure mode 2: Returning unhelpful errors. "403 Forbidden" tells an LLM nothing. Always return structured errors with an upgrade URL.

Failure mode 3: Pricing too low. MCP tools have power-law usage distributions. A few users will hammer your tool. If your pricing assumes uniform usage, the heavy users will eat your margin. Tier accordingly.
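
A quick back-of-the-envelope check shows how this plays out. All the numbers below are invented for illustration: a $9/mo flat plan, an upstream cost of $2 per 1,000 calls, and a month where 95 users make 500 calls each while 5 users make 100,000 each.

```python
PLAN_PRICE_CENTS = 900            # the $9/mo flat plan
COST_CENTS_PER_1000_CALLS = 200   # hypothetical upstream cost: $2 per 1,000 calls

# Hypothetical month: (number of users, calls made by each of them)
USAGE = [(95, 500), (5, 100_000)]


def monthly_margin_cents(usage, price_cents, cost_per_1000_cents):
    """Revenue minus serving cost for one month, in integer cents."""
    users = sum(count for count, _ in usage)
    calls = sum(count * calls_each for count, calls_each in usage)
    revenue = users * price_cents
    cost = calls * cost_per_1000_cents // 1000
    return revenue - cost


print(monthly_margin_cents(USAGE, PLAN_PRICE_CENTS, COST_CENTS_PER_1000_CALLS))
# -19500 -> a $195 loss on $900 of revenue
print(monthly_margin_cents(USAGE[:1], PLAN_PRICE_CENTS, COST_CENTS_PER_1000_CALLS))
# 76000 -> the 95 light users alone would net $760
```

In this made-up month the five heavy users cost $1,000 to serve while paying $45 between them, which is the case that quotas or metered tiers are meant to cover.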

Failure mode 4: No grace period. A user crosses their quota at 11pm on a Saturday. Your server returns "upgrade required" and the user, mid-task with Claude, gets frustrated. Build a soft cap: warn at 80% of quota, allow a burst of up to 10% over it, and hard-block only after a 24-hour grace period.
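
One possible reading of that soft-cap policy, sketched as a pure function. The function name, the four state labels, and the rule that a call past quota must be both inside the 10% burst and inside the 24-hour window are our assumptions; adjust them to the policy you actually want.

```python
from datetime import datetime, timedelta
from typing import Optional


def soft_cap_decision(used: int, quota: int,
                      crossed_at: Optional[datetime], now: datetime,
                      warn_pct: int = 80, burst_pct: int = 10,
                      grace: timedelta = timedelta(hours=24)) -> str:
    """Classify the next call given `used` calls already made this period.

    `crossed_at` is when the user first hit their quota (None if never).
    Integer percentages keep the threshold comparisons exact.
    """
    if used * 100 < quota * warn_pct:
        return "ok"
    if used < quota:
        return "warn"      # serve the call, but mention the approaching limit
    within_burst = used * 100 < quota * (100 + burst_pct)
    within_grace = crossed_at is None or now - crossed_at < grace
    if within_burst and within_grace:
        return "grace"     # serve the call, and prompt the user to upgrade
    return "blocked"       # return the upgrade error


now = datetime(2025, 1, 4, 23, 0)  # a Saturday at 11pm
print(soft_cap_decision(1000, 1000, now, now))  # grace
```

The tool wrapper would attach a warning to successful results in the "warn" and "grace" states and return the upgrade error only in "blocked".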

Failure mode 5: No way to leave. If a user has to email you to cancel, they will leave anyway and tell their friends not to use your tool. Self-serve cancellation is non-negotiable.

A 30-day plan if you're starting from zero

If you have a working unpaid MCP server and want to start charging within 30 days:

  • Week 1: Add OAuth-on-first-use. Build the upgrade page. Decide on three pricing tiers and write them down on the page. No Stripe yet.
  • Week 2: Wire up Stripe Checkout. Three Products, three Prices, three webhook handlers. Test in test mode with your own card.
  • Week 3: Add quota-checking to every tool. Pick a soft-cap policy. Write the upgrade-prompt error.
  • Week 4: Soft launch to your existing users. Watch the signups, watch the support questions, and iterate.

By day 30 you will have paying users — usually fewer than you hoped, but enough to tell you whether the product has legs.

What to do next

If you want to skip the two weeks of Stripe wiring, Tollwicket is a one-decorator drop-in that does steps 1–4 above. You bring your own Stripe account, we bring the auth, quotas, checkout, and webhook machinery. Free until your MCP server is making $500/mo in customer revenue.
