ai / mcp
I built a Model Context Protocol server that lets our team query an internal tool from Claude.
MCP is a protocol for giving LLMs access to tools. Our server exposes ~20 tools over JSON-RPC 2.0 via Streamable HTTP. Once connected, users access it from any Claude surface: web, desktop, mobile, Chrome extension, or Slack.
Deployment
IT admins add the MCP server as a Claude connector. Individual users enable it in their Claude accounts. One integration, many surfaces.
Auth
We use WorkOS AuthKit when authenticating from Claude.
The flow:
- Claude starts OAuth 2.1 + PKCE with AuthKit
- AuthKit redirects to our login page with an external_auth_id
- User authenticates via our existing SSO
- SSO callback calls the AuthKit completion API with user info
- AuthKit issues tokens and redirects back to Claude
- Claude sends a Bearer token on each POST /mcp request
- Our server verifies the JWT (expiry, issuer, audience) via JWKS
If the user is already logged in when Claude starts the flow, we skip re-authentication and complete immediately.
AuthKit issues short-lived access tokens (~5 min). Claude refreshes them automatically. When a user is deactivated, they can't obtain new tokens. An in-flight token stays valid until expiry, but standard offboarding (deactivate IdP + Claude + app) covers that window.
Security
POST /mcp is restricted to Anthropic's outbound IPs via a Cloudflare WAF rule.
No other traffic reaches the endpoint.
The SSO callback validates the AuthKit redirect_uri,
allowing only *.workos.com and *.authkit.app domains,
as defense-in-depth against open redirects.
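That host check can be sketched as a suffix match against an allow-list. The helper name and exact suffixes below are illustrative, not our production code:

```ruby
require "uri"

# Allow only AuthKit-owned hosts as OAuth redirect targets.
# Leading dot prevents lookalikes such as "evilworkos.com".
ALLOWED_SUFFIXES = [".workos.com", ".authkit.app"].freeze

def allowed_redirect_uri?(raw)
  uri = URI.parse(raw)
  return false unless uri.is_a?(URI::HTTPS) && uri.host
  ALLOWED_SUFFIXES.any? { |suffix| uri.host.end_with?(suffix) }
rescue URI::InvalidURIError
  false
end
```

Note that requiring HTTPS and matching on the full suffix (including the dot) rejects both plain-HTTP callbacks and hosts that merely contain the allowed domain.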
All handlers are written using our web framework.
Stateless
The server is stateless.
Each POST /mcp is independent: no session state between requests.
Authentication is per-request via Bearer token.
Claude manages conversation history on its side.
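Per-request auth then reduces to pulling the token out of the Authorization header on every POST. A minimal Rack-style sketch (helper name is illustrative):

```ruby
# Extract the Bearer token from a Rack env hash; no session state is
# consulted, so any instance can serve any request.
def bearer_token(env)
  header = env["HTTP_AUTHORIZATION"].to_s
  match = header.match(/\ABearer (\S+)\z/)
  match && match[1]
end
```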
The MCP spec supports stateful servers
via Mcp-Session-Id headers and GET /mcp SSE streams
for multi-step workflows or server-push notifications.
Stateful servers are harder to deploy:
session affinity, in-memory state lost on restart, horizontal scaling.
None of that is needed here.
Anthropic's client re-runs the full handshake
(initialize → notifications/initialized → tools/list)
before each tool call.
This is consistent with stateless usage.
The overhead is milliseconds.
Deploying tool changes
Tool definitions are code. They only change when we deploy. The deploy cycle is the notification mechanism:
- Old container stops, all client connections drop
- New container starts with updated tool definitions
- Claude detects disconnect, reconnects automatically
- Reconnect triggers initialize → tools/list → fresh definitions
The MCP spec defines notifications/tools/list_changed
for servers whose tools change at runtime without a restart.
That doesn't apply here, so we don't advertise it.
JSON-RPC dispatcher
The server handles four JSON-RPC 2.0 methods:
- initialize: returns protocol version and capabilities
- notifications/initialized: acknowledged, no response
- tools/list: returns definitions for all registered tools
- tools/call: dispatches to the named tool
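That dispatch can be sketched as one case statement. Helper names and the stubbed results below are illustrative, not our actual implementation:

```ruby
# JSON-RPC 2.0 envelope helpers.
def success(id, result)
  {jsonrpc: "2.0", id: id, result: result}
end

def error(id, code, message)
  {jsonrpc: "2.0", id: id, error: {code: code, message: message}}
end

# Top-level dispatch over the four supported methods.
def dispatch(request)
  id = request["id"]
  case request["method"]
  when "initialize"
    success(id, {protocolVersion: "2025-03-26", capabilities: {tools: {}}})
  when "notifications/initialized"
    nil # notification: acknowledged, no response body
  when "tools/list"
    success(id, {tools: []}) # real server: TOOLS.values.map(&:definition)
  when "tools/call"
    success(id, {content: []}) # real server: delegate to call_tool
  else
    error(id, -32601, "Method not found")
  end
end
```

Unknown methods get the standard JSON-RPC -32601 error; notifications return nothing, per the spec.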
Tools are registered in a hash:
TOOLS = {
  "search" => Tools::Search,
  "docs" => Tools::Docs,
  # ~20 tools
}.freeze
Each tools/call looks up the class, instantiates it with db and user_id,
and calls it. An unknown tool name returns a JSON-RPC error instead of raising:
def call_tool(id, params, user_id: nil)
  name = params["name"]
  args = params["arguments"] || {}
  klass = TOOLS[name]
  return error(id, -32602, "Unknown tool: #{name}") unless klass

  result = klass.new(@db, user_id).call(args)
  success(id, {
    content: [{type: "text", text: JSON.generate(result)}]
  })
end
Tool pattern
Each tool is a class with two methods:
- self.definition: returns the MCP tool schema (name, description, input JSON Schema)
- call(args): runs the tool and returns a result hash
class Search
  def self.definition
    {
      name: "search",
      description: "Full-text search across records.",
      inputSchema: {
        type: "object",
        properties: {
          query: {type: "string", description: "Search query."},
          page: {type: "integer", description: "Page number (default 1)."}
        },
        required: %w(query)
      }
    }
  end

  def initialize(db, user_id = nil)
    @db = db
    @user_id = user_id
  end

  def call(args)
    # query database, return {rows: [...], next_page: 2}
  end
end
Adding a tool: create the class, add to the TOOLS hash, write tests.
Tool descriptions are where you teach the LLM
how to chain tools together.
For example, the users tool says
"Use this to resolve a person's name or initials to their user ID,
which is required by tools like network_by_user."
Claude reads these descriptions and learns the composition order.
Docs tool
One tool we like is docs.
It runs no database queries;
it's just a hash of topic names to markdown strings
that describe how the app works.
Calling it with no topic returns an index of all available topics
so Claude can discover what documentation exists before fetching one.
Claude calls docs when it needs to understand the domain
before answering a question.
It's a lightweight way to embed institutional knowledge
in the LLM without fine-tuning or RAG.
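A minimal sketch of such a tool; the topic names and content here are invented for illustration:

```ruby
# A documentation tool backed by a plain hash: topic => markdown.
class Docs
  TOPICS = {
    "glossary" => "## Glossary\nTerms used across the app.",
    "permissions" => "## Permissions\nWho can see what, and why."
  }.freeze

  def call(args)
    topic = args["topic"]
    if topic.nil? || topic.empty?
      # No topic: return an index so Claude can discover what exists.
      {topics: TOPICS.keys}
    else
      {topic: topic, content: TOPICS.fetch(topic, "Unknown topic")}
    end
  end
end
```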
JWT verification
JWTs are verified against the AuthKit JWKS endpoint:
- Algorithm: RS256
- Validates: issuer, audience, expiration
- JWKS are fetched lazily and cached
- On kid_not_found (key rotation), re-fetches JWKS, but at most once per 5 minutes to prevent cache-busting attacks from tokens with random kid values
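The throttled re-fetch can be sketched like this, with an injected fetcher and clock standing in for the HTTP call and wall time (class and method names are illustrative):

```ruby
# Cache of JWKS keys by kid. An unknown kid triggers a re-fetch
# (possible rotation), but at most once per cooldown window, so an
# attacker sending random kid values can't hammer the JWKS endpoint.
class JwksCache
  COOLDOWN = 300 # seconds

  def initialize(fetcher, clock: -> { Time.now.to_i })
    @fetcher = fetcher # -> { {kid => key} }
    @clock = clock
    @keys = nil
    @last_fetch = nil
  end

  def resolve(kid)
    @keys ||= fetch
    key = @keys[kid]
    return key if key

    # Unknown kid: maybe rotation, maybe an attack. Refetch, throttled.
    @keys = fetch if @last_fetch.nil? || @clock.call - @last_fetch >= COOLDOWN
    @keys[kid]
  end

  private

  def fetch
    @last_fetch = @clock.call
    @fetcher.call
  end
end
```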
JWT.decode(
  token, nil, true,
  algorithms: ["RS256"],
  jwks: jwks_resolver,
  iss: @issuer, verify_iss: true,
  aud: @audience, verify_aud: true,
  verify_expiration: true
)
Pagination
Tools that return lists paginate with PAGE_SIZE + 1:
fetch one extra row to know if there's a next page
without a separate count query.
module Paginate
  PAGE_SIZE = 50

  def self.call(rows, page)
    offset = (page - 1) * PAGE_SIZE
    sliced = rows[offset, PAGE_SIZE + 1] || []
    has_next = sliced.size > PAGE_SIZE
    {rows: sliced.first(PAGE_SIZE), next_page: has_next ? page + 1 : nil}
  end
end
Claude sends next_page from one response as the page argument
in the next request to paginate through results.
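To make the off-by-one trick concrete, here is a standalone walk through a 120-row list. The logic mirrors Paginate above, repeated here so the snippet runs on its own:

```ruby
PAGE_SIZE = 50

# Fetch PAGE_SIZE + 1 rows; the extra row's presence means there is a
# next page, without a separate count query.
def paginate(rows, page)
  offset = (page - 1) * PAGE_SIZE
  sliced = rows[offset, PAGE_SIZE + 1] || []
  {rows: sliced.first(PAGE_SIZE),
   next_page: sliced.size > PAGE_SIZE ? page + 1 : nil}
end

rows = (1..120).to_a
p1 = paginate(rows, 1)               # 50 rows, next_page: 2
p2 = paginate(rows, p1[:next_page])  # 50 rows, next_page: 3
p3 = paginate(rows, p2[:next_page])  # 20 rows, next_page: nil (done)
```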