ai / mcp

I built a Model Context Protocol server that lets our team query an internal tool from Claude.

MCP is a protocol for giving LLMs access to tools. Our server exposes ~20 tools over JSON-RPC 2.0 via Streamable HTTP. Once connected, users access it from any Claude surface: web, desktop, mobile, Chrome extension, or Slack.

Deployment

IT admins add the MCP server as a Claude connector. Individual users enable it in their Claude accounts. One integration, many surfaces.

Auth

We use WorkOS AuthKit to authenticate requests coming from Claude.

The flow:

  1. Claude starts OAuth 2.1 + PKCE with AuthKit
  2. AuthKit redirects to our login page with an external_auth_id
  3. User authenticates via our existing SSO
  4. SSO callback calls the AuthKit completion API with user info
  5. AuthKit issues tokens and redirects back to Claude
  6. Claude sends a Bearer token on each POST /mcp request
  7. Our server verifies the JWT (expiry, issuer, audience) via JWKS

If the user is already logged in when Claude starts the flow, we skip re-authentication and complete immediately.

AuthKit issues short-lived access tokens (~5 min). Claude refreshes them automatically. When a user is deactivated, they can't obtain new tokens. An in-flight token stays valid until expiry, but standard offboarding (deactivate IdP + Claude + app) covers that window.

Security

POST /mcp is restricted to Anthropic's outbound IPs via a Cloudflare WAF rule. No other traffic reaches the endpoint.

The SSO callback validates the AuthKit redirect_uri, allowing only *.workos.com and *.authkit.app hosts, as defense in depth against open redirects.
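The host check can be sketched like this (the helper name and suffix list layout are illustrative; the real validation lives in our SSO callback):

```ruby
require "uri"

# Suffix allowlist for the AuthKit redirect_uri (hypothetical helper).
ALLOWED_REDIRECT_SUFFIXES = %w[.workos.com .authkit.app].freeze

def allowed_redirect?(redirect_uri)
  host = URI.parse(redirect_uri).host
  return false unless host

  ALLOWED_REDIRECT_SUFFIXES.any? { |suffix| host.end_with?(suffix) }
rescue URI::InvalidURIError
  false
end
```

Matching on the leading dot matters: "evilworkos.com" ends with "workos.com" but not ".workos.com", so it is rejected.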

All handlers are written using our web framework.

Stateless

The server is stateless. Each POST /mcp is independent: no session state between requests. Authentication is per-request via Bearer token. Claude manages conversation history on its side.

The MCP spec supports stateful servers via Mcp-Session-Id headers and GET /mcp SSE streams for multi-step workflows or server-push notifications. Stateful servers are harder to deploy: they need session affinity, hold in-memory state that's lost on restart, and complicate horizontal scaling. None of that is needed here.

Anthropic's client re-runs the full handshake (initialize → notifications/initialized → tools/list) before each tool call. This is consistent with stateless usage. The overhead is milliseconds.

Deploying tool changes

Tool definitions are code. They only change when we deploy. The deploy cycle is the notification mechanism:

  1. Old container stops, all client connections drop
  2. New container starts with updated tool definitions
  3. Claude detects disconnect, reconnects automatically
  4. Reconnect triggers initialize → tools/list → fresh definitions

The MCP spec defines notifications/tools/list_changed for servers whose tools change at runtime without a restart. That doesn't apply here, so we don't advertise it.

JSON-RPC dispatcher

The server handles four JSON-RPC 2.0 methods: initialize, notifications/initialized, tools/list, and tools/call.
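A sketch of the dispatch over these methods (server name, protocol version string, and helper wiring are illustrative, not our exact code):

```ruby
# Illustrative dispatch over the four MCP methods. success() and
# call_tool() are the helpers shown below; TOOLS is the tool registry.
def dispatch(rpc, user_id: nil)
  id = rpc["id"]
  case rpc["method"]
  when "initialize"
    success(id, {protocolVersion: "2025-03-26",   # an MCP spec revision
                 capabilities: {tools: {}},
                 serverInfo: {name: "internal-mcp", version: "1.0"}})
  when "notifications/initialized"
    nil  # a notification: no id, no response body
  when "tools/list"
    success(id, {tools: TOOLS.values.map(&:definition)})
  when "tools/call"
    call_tool(id, rpc["params"], user_id: user_id)
  else
    {jsonrpc: "2.0", id: id, error: {code: -32601, message: "Method not found"}}
  end
end
```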

Tools are registered in a hash:

TOOLS = {
  "search" => Tools::Search,
  "docs" => Tools::Docs,
  # ~20 tools
}.freeze

Each tools/call looks up the class, instantiates with db and user_id, and calls it:

def call_tool(id, params, user_id: nil)
  name = params["name"]
  args = params["arguments"] || {}
  klass = TOOLS[name]
  # Guard against unknown tool names instead of raising NoMethodError on nil;
  # error() builds a JSON-RPC error response.
  return error(id, -32602, "Unknown tool: #{name}") unless klass

  result = klass.new(@db, user_id).call(args)

  success(id, {
    content: [{type: "text", text: JSON.generate(result)}]
  })
end

Tool pattern

Each tool is a class with two methods:

class Search
  def self.definition
    {
      name: "search",
      description: "Full-text search across records.",
      inputSchema: {
        type: "object",
        properties: {
          query: {type: "string", description: "Search query."},
          page: {type: "integer", description: "Page number (default 1)."}
        },
        required: %w(query)
      }
    }
  end

  def initialize(db, user_id = nil)
    @db = db
    @user_id = user_id
  end

  def call(args)
    # query database, return {rows: [...], next_page: 2}
  end
end

Adding a tool: create the class, add to the TOOLS hash, write tests.

Tool descriptions are where you teach the LLM how to chain tools together. For example, the users tool says "Use this to resolve a person's name or initials to their user ID, which is required by tools like network_by_user." Claude reads these descriptions and learns the composition order.
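As an illustration, that chaining hint could live in a definition like this (the schema fields are assumptions; only the description text comes from the example above):

```ruby
# Hypothetical users tool definition showing a chaining hint in the
# description so Claude learns to call it before network_by_user.
class Users
  def self.definition
    {
      name: "users",
      description: "Use this to resolve a person's name or initials " \
                   "to their user ID, which is required by tools like " \
                   "network_by_user.",
      inputSchema: {
        type: "object",
        properties: {
          name: {type: "string", description: "Name or initials to resolve."}
        },
        required: %w(name)
      }
    }
  end
end
```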

Docs tool

One tool we like is docs. It has no database queries; just a hash of topic names to markdown strings that describe how the app works. Calling it with no topic returns an index of all available topics so Claude can discover what documentation exists before fetching one.
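A minimal sketch of that shape (topic names and markdown bodies are placeholders; the db/user_id constructor is omitted since docs needs neither):

```ruby
# Sketch of a docs tool: a frozen hash of topic => markdown, no database.
class Docs
  TOPICS = {
    "billing"  => "# Billing\nHow invoices are generated...",
    "accounts" => "# Accounts\nHow user accounts relate to orgs..."
  }.freeze

  def call(args)
    topic = args["topic"]
    return {topics: TOPICS.keys} unless topic  # no topic: return the index

    {topic: topic, markdown: TOPICS.fetch(topic) { "Unknown topic" }}
  end
end
```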

Claude calls docs when it needs to understand the domain before answering a question. A lightweight way to embed institutional knowledge into the LLM without fine-tuning or RAG.

JWT verification

JWTs are verified against the AuthKit JWKS endpoint:

JWT.decode(
  token, nil, true,
  algorithms: ["RS256"],
  jwks: jwks_resolver,
  iss: @issuer, verify_iss: true,
  aud: @audience, verify_aud: true,
  verify_expiration: true
)

Pagination

Tools that return lists paginate with PAGE_SIZE + 1: fetch one extra row to know if there's a next page without a separate count query.

module Paginate
  PAGE_SIZE = 50

  def self.call(rows, page)
    offset = (page - 1) * PAGE_SIZE
    sliced = rows[offset, PAGE_SIZE + 1] || []
    has_next = sliced.size > PAGE_SIZE
    {rows: sliced.first(PAGE_SIZE), next_page: has_next ? page + 1 : nil}
  end
end

Claude sends next_page from one response as the page argument in the next request to paginate through results.
