Python SDK Reference
Complete API reference for `mcp-gateway-sdk` (v0.1.0), the PRECINCT Python SDK for integrating agents with the security gateway. It is built on `httpx`, requires Python >=3.10, and works with any agent framework or raw HTTP.
Start with the Integration Guide to register your agent's SPIFFE identity and configure gateway policies before using the SDK. The SDKs overview page covers both the Python and Go SDKs side by side.
Overview
mcp-gateway-sdk provides a single GatewayClient
class that handles all communication with the PRECINCT gateway. It
constructs MCP JSON-RPC envelopes, injects SPIFFE identity headers,
maps gateway error responses to structured Python exceptions, and retries
on transient 503 failures with exponential backoff.
The SDK is framework-independent and works with:
- PydanticAI, via tool wrapper functions
- DSPy, via the `build_dspy_gateway_lm()` runtime helper
- LangGraph / LangChain, via tool functions or raw calls
- Raw httpx, for custom or minimal integrations
Public API exports (from mcp_gateway_sdk):
from mcp_gateway_sdk import (
GatewayClient,
GatewayError,
build_dspy_gateway_lm,
build_spike_token_ref,
configure_dspy_gateway_lms,
load_dotenv,
normalize_model_name,
resolve_model_api_key_ref,
setup_observability,
)
Installation
Core install
# With pip (from local checkout)
cd POC/sdk/python
pip install -e .
# With uv
uv pip install -e POC/sdk/python
Optional dependency groups
The SDK defines three optional dependency groups in
pyproject.toml. Install them with bracket syntax:
| Group | Install command | Packages | Purpose |
|---|---|---|---|
| `env` | `pip install -e ".[env]"` | `python-dotenv>=1.0.1` | Enables the `load_dotenv()` helper |
| `otel` | `pip install -e ".[otel]"` | `opentelemetry-api>=1.39.0` | Enables `setup_observability()` and tracer wiring |
| `dev` | `pip install -e ".[dev]"` | `pytest>=9.0.0`, `httpx>=0.28.0` | Development and testing |
# Install all optional groups at once
pip install -e ".[env,otel,dev]"
The SDK is part of the PRECINCT proof-of-concept and is versioned alongside the main repository (currently v0.1.0). For production use, pin to a specific commit or tag.
Quick Start
The GatewayClient supports both context-manager and
manual lifecycle patterns. The context manager ensures the underlying
httpx.Client is properly closed:
from mcp_gateway_sdk import GatewayClient, GatewayError
with GatewayClient(
url="http://localhost:9090",
spiffe_id="spiffe://poc.local/agents/mcp-client/dspy-researcher/dev",
) as client:
try:
result = client.call("tavily_search", query="AI security", max_results=5)
print(result) # raw MCP JSON-RPC result dict
except GatewayError as e:
print(f"Denied: {e.code} - {e.message}")
print(f" Middleware: {e.middleware} (step {e.step})")
print(f" Remediation: {e.remediation}")
Manual lifecycle (equivalent, but you must call close()
yourself):
client = GatewayClient(
url="http://localhost:9090",
spiffe_id="spiffe://poc.local/agents/mcp-client/dspy-researcher/dev",
)
try:
result = client.call("tavily_search", query="AI security")
print(result)
except GatewayError as e:
print(f"Denied: {e.code}: {e.remediation}")
finally:
client.close()
GatewayClient API Reference
mcp_gateway_sdk.GatewayClient is the primary class for
interacting with the gateway. It is a synchronous, thread-safe client
built on httpx.Client.
Constructor
GatewayClient(
url: str,
spiffe_id: str,
*,
session_id: str | None = None,
tracer: Any = None,
timeout: float = 30.0,
max_retries: int = 3,
backoff_base: float = 1.0,
)
| Parameter | Type | Default | Description |
|---|---|---|---|
| `url` | `str` | (required) | Gateway base URL, e.g. `"http://localhost:9090"`. |
| `spiffe_id` | `str` | (required) | SPIFFE identity string sent as the `X-SPIFFE-ID` header on every request. |
| `session_id` | `str \| None` | `None` | Session identifier for `X-Session-ID`. Auto-generated UUID if omitted. |
| `tracer` | `Any` | `None` | Optional OpenTelemetry `Tracer` for span creation around tool calls. |
| `timeout` | `float` | `30.0` | HTTP request timeout in seconds. Passed to the underlying `httpx.Client`. |
| `max_retries` | `int` | `3` | Maximum retry attempts for 503 responses. Total attempts = `max_retries + 1`. |
| `backoff_base` | `float` | `1.0` | Base value for exponential backoff (seconds). Delay = `backoff_base * 2^attempt`. |
Methods
call(tool_name, **params) -> Any
Call an MCP tool through the gateway using the tools/call
JSON-RPC method. This is the primary method for tool invocation.
# Call a search tool
result = client.call("tavily_search", query="AI security", max_results=5)
# Call a file-read tool
content = client.call("read", path="/etc/hostname")
- Args: `tool_name` (str), the MCP tool name; `**params`, keyword arguments passed as `params.arguments` in the JSON-RPC envelope.
- Returns: the `result` field from the JSON-RPC response (dict or value).
- Raises: `GatewayError` on 4xx/5xx responses or JSON-RPC errors; `httpx.ConnectError` if the gateway is unreachable.
When a tracer is configured, call() creates an
OTel span named gateway.tool_call.<tool_name> with
attributes for mcp.method, mcp.tool.name,
spiffe.id, and session.id.
call_rpc(method, params=None) -> Any
Call a raw MCP JSON-RPC method through the gateway. Use this for
protocol-level methods like tools/list or
resources/read. For tool invocations, prefer
call().
# List available tools
tools = client.call_rpc("tools/list")
# Read a resource
resource = client.call_rpc("resources/read", {"uri": "file:///data/config.yaml"})
call_model_chat(...) -> Any
Call the gateway's OpenAI-compatible model egress endpoint. This keeps model calls behind the gateway's model-plane controls (DLP, rate limiting, deep scan) while providing a simple SDK interface.
response = client.call_model_chat(
model="llama-3.3-70b-versatile",
messages=[{"role": "user", "content": "Summarize this document."}],
provider="groq",
api_key_ref="Bearer $SPIKE{ref:secrets/groq-api-key,exp:3600}",
)
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | `str` | (required) | Model identifier (e.g. `"llama-3.3-70b-versatile"`). |
| `messages` | `list[dict]` | (required) | OpenAI-format messages array. |
| `provider` | `str` | `"groq"` | Model provider, sent as the `X-Model-Provider` header. |
| `api_key_ref` | `str \| None` | `None` | API key reference (typically a SPIKE token ref). Sent as the `Authorization` header. |
| `api_key_header` | `str` | `"Authorization"` | Header name for the API key. |
| `endpoint` | `str` | `"/openai/v1/chat/completions"` | Gateway model egress path. |
| `residency_intent` | `str` | `"us"` | Data residency intent, sent as `X-Residency-Intent`. |
| `budget_profile` | `str` | `"standard"` | Budget profile, sent as `X-Budget-Profile`. |
| `extra_headers` | `dict \| None` | `None` | Additional headers merged into the request. |
| `**extra_payload` | `Any` | -- | Additional keys merged into the JSON request body (e.g. `temperature`, `max_tokens`). |
close() -> None
Close the underlying httpx.Client. Called automatically when
using the context manager (with statement).
Property: session_id
The session identifier sent as X-Session-ID on every request.
Set at construction time (auto-generated UUID if not provided). Read-only
after construction. Access via client.session_id.
GatewayError
mcp_gateway_sdk.GatewayError is the structured exception type
raised for all gateway denials. It mirrors the unified JSON error envelope
defined by the gateway (Go struct middleware.GatewayError).
Attributes
| Attribute | Type | Default | Description |
|---|---|---|---|
| `code` | `str` | `""` | Machine-readable error code (e.g. `"authz_policy_denied"`). |
| `message` | `str` | `""` | Human-readable description of the denial. |
| `reason_code` | `str` | `""` | Stable reason identifier for policy or UI handling. |
| `middleware` | `str` | `""` | Which middleware layer rejected the request (e.g. `"opa"`, `"dlp"`). |
| `step` | `int` | `0` | Middleware step number in the chain (maps to `middleware_step` in the JSON envelope). |
| `decision_id` | `str` | `""` | Audit decision ID for cross-referencing with gateway logs. |
| `trace_id` | `str` | `""` | OpenTelemetry trace ID for distributed tracing correlation. |
| `details` | `dict[str, Any]` | `{}` | Optional structured details (risk scores, matched patterns, etc.). |
| `remediation` | `str` | `""` | Optional remediation guidance for the caller. |
| `docs_url` | `str` | `""` | Optional link to documentation for the error. |
| `http_status` | `int` | `0` | HTTP status code from the gateway response. |
Class method: from_response(http_status, body)
Parse a GatewayError from an HTTP response body dict. This
is called internally by GatewayClient when the gateway returns
a 4xx or 5xx status.
# Internal usage (shown for reference):
error = GatewayError.from_response(
http_status=403,
body={
"code": "authz_policy_denied",
"message": "OPA policy denied access to tool 'read'",
"reason_code": "tool_not_permitted",
"middleware": "opa",
"middleware_step": 6,
"decision_id": "d-abc123",
"trace_id": "t-xyz789",
"remediation": "Request access via your team's OPA policy admin.",
"docs_url": "https://precinct.dev/pages/opa.html",
},
)
Error handling patterns
from mcp_gateway_sdk import GatewayClient, GatewayError
with GatewayClient(url="http://localhost:9090",
spiffe_id="spiffe://poc.local/agents/my-agent/dev") as client:
try:
result = client.call("sensitive_tool", data="payload")
except GatewayError as e:
# Branch on the machine-readable error code
if e.code == "authz_policy_denied":
print(f"Policy denied (decision: {e.decision_id})")
elif e.code == "dlp_credentials_detected":
print(f"DLP blocked credentials in request")
elif e.code == "ratelimit_exceeded":
print(f"Rate limited (HTTP {e.http_status})")
elif e.code == "circuit_open":
print(f"Circuit breaker open for backend")
else:
print(f"Gateway error: {e.code}: {e.message}")
# All errors carry the trace ID for correlation
if e.trace_id:
print(f" Trace: {e.trace_id}")
if e.remediation:
print(f" Fix: {e.remediation}")
Error Code Catalog
The gateway defines a fixed set of machine-readable error codes. Each code
identifies the middleware layer that rejected the request and maps to a
specific HTTP status. The GatewayError.code attribute will be
one of these values.
| Error Code | HTTP | Step | Middleware | Description |
|---|---|---|---|---|
| `request_too_large` | 413 | 1 | Request Size | Request body exceeds the configured size limit. |
| `auth_missing_identity` | 401 | 3 | SPIFFE Auth | No `X-SPIFFE-ID` header present. |
| `auth_invalid_identity` | 401 | 3 | SPIFFE Auth | The `X-SPIFFE-ID` value is malformed or unrecognized. |
| `registry_tool_unknown` | 403 | 5 | Tool Registry | The requested tool is not in the gateway's tool registry. |
| `registry_hash_mismatch` | 403 | 5 | Tool Registry | Tool definition hash does not match the registered hash (rug-pull detection). |
| `authz_policy_denied` | 403 | 6 | OPA Policy | OPA policy evaluation denied the request. |
| `authz_no_matching_grant` | 403 | 6 | OPA Policy | No policy grant matches this identity/tool combination. |
| `authz_tool_not_found` | 403 | 6 | OPA Policy | Tool not found during OPA evaluation (distinct from the registry check). |
| `dlp_credentials_detected` | 403 | 7 | DLP | DLP scanner detected credentials (API keys, tokens) in the request. |
| `dlp_injection_blocked` | 403 | 7 | DLP | DLP scanner blocked a prompt injection attempt (policy = block). |
| `dlp_pii_blocked` | 403 | 7 | DLP | DLP scanner blocked PII (policy = block). |
| `dlp_unavailable_fail_closed` | 503 | 7 | DLP | DLP scanner is unavailable and fail-closed policy is active. |
| `exfiltration_detected` | 403 | 8 | Session Context | Data exfiltration pattern detected across session context. |
| `stepup_denied` | 403 | 9 | Step-Up Gating | Step-up verification denied the request. |
| `stepup_approval_required` | 403 | 9 | Step-Up Gating | Human approval is required before this tool call can proceed. |
| `stepup_guard_blocked` | 403 | 9 | Step-Up Gating | LLM guard model blocked the request during step-up evaluation. |
| `stepup_destination_blocked` | 403 | 9 | Step-Up Gating | Request destination is blocked by step-up policy. |
| `stepup_unavailable_fail_closed` | 503 | 9 | Step-Up Gating | Step-up service unavailable and fail-closed policy is active. |
| `deepscan_blocked` | 403 | 10 | Deep Scan | LLM deep content scan blocked the request. |
| `deepscan_unavailable_fail_closed` | 503 | 10 | Deep Scan | Deep scan service unavailable and fail-closed policy is active. |
| `ratelimit_exceeded` | 429 | 11 | Rate Limiting | Request rate limit exceeded for this identity. |
| `circuit_open` | 503 | 12 | Circuit Breaker | Circuit breaker is open due to repeated backend failures. |
| `extension_blocked` | 403 | -- | Extension Slot | A registered extension blocked the request. |
| `extension_unavailable_fail_closed` | 503 | -- | Extension Slot | Extension service unavailable and fail-closed policy is active. |
| `mcp_invalid_request` | 400 | -- | MCP Validation | The MCP JSON-RPC request is malformed or invalid. |
| `mcp_transport_failed` | 502 | -- | MCP Transport | Transport-level failure connecting to the MCP tool server. |
| `mcp_request_failed` | 502 | -- | MCP Transport | MCP server returned a JSON-RPC error. |
| `mcp_invalid_response` | 502 | -- | MCP Transport | Malformed response received from the MCP tool server. |
| `contract_validation_failed` | 400 | -- | Contract | Contract validation failed at the plane entry point. |
In addition to the gateway-defined codes above, the SDK itself may set
code to "unknown" (for unparseable responses),
"invalid_response" (for non-JSON bodies), or
"jsonrpc_error" (for JSON-RPC error objects that lack a
gateway error envelope).
Runtime Helpers
The mcp_gateway_sdk.runtime module provides utility functions
that centralize setup code commonly duplicated across agent implementations.
All are importable from the top-level package.
load_dotenv(path=None, *, override=False) -> bool
Load environment variables from a .env file. Requires the
env optional dependency group (python-dotenv).
from mcp_gateway_sdk import load_dotenv
# Load .env from current directory
loaded = load_dotenv()
# Load from a specific path, overriding existing vars
loaded = load_dotenv("/path/to/.env", override=True)
if not loaded:
print("python-dotenv not installed, skipping .env")
- Returns: `True` if loading was attempted; `False` if `python-dotenv` is not installed.
normalize_model_name(raw_model) -> str
Normalize a model identifier to a provider-agnostic model name by stripping provider prefixes.
from mcp_gateway_sdk import normalize_model_name
normalize_model_name("groq/llama-3.3-70b-versatile")
# => "llama-3.3-70b-versatile"
normalize_model_name("openai:gpt-4o-mini")
# => "gpt-4o-mini"
normalize_model_name("gpt-4o")
# => "gpt-4o"
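The stripping behavior shown above amounts to removing a leading `provider/` or `provider:` prefix. A rough stdlib equivalent (an illustrative reimplementation, not the SDK's source):

```python
def normalize_model_name_sketch(raw_model: str) -> str:
    # Strip a leading "provider/" or "provider:" prefix, if present
    # (illustrative rendering of the documented behavior).
    for sep in ("/", ":"):
        _prefix, found, rest = raw_model.partition(sep)
        if found:
            return rest
    return raw_model

print(normalize_model_name_sketch("groq/llama-3.3-70b-versatile"))  # llama-3.3-70b-versatile
print(normalize_model_name_sketch("gpt-4o"))                        # gpt-4o
```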
build_spike_token_ref(spike_ref, *, exp_seconds=3600) -> str
Build a Bearer SPIKE token reference string for gateway model or tool egress. SPIKE tokens are resolved by the gateway at request time.
from mcp_gateway_sdk import build_spike_token_ref
ref = build_spike_token_ref("secrets/groq-api-key")
# => "Bearer $SPIKE{ref:secrets/groq-api-key,exp:3600}"
ref = build_spike_token_ref("secrets/openai-key", exp_seconds=7200)
# => "Bearer $SPIKE{ref:secrets/openai-key,exp:7200}"
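The output is a fixed template, which the following one-liner reproduces (an illustrative reimplementation, not the SDK's source):

```python
def build_spike_token_ref_sketch(spike_ref: str, *, exp_seconds: int = 3600) -> str:
    # Reproduces the documented "Bearer $SPIKE{ref:...,exp:...}" template.
    return f"Bearer $SPIKE{{ref:{spike_ref},exp:{exp_seconds}}}"

print(build_spike_token_ref_sketch("secrets/groq-api-key"))
# Bearer $SPIKE{ref:secrets/groq-api-key,exp:3600}
```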
resolve_model_api_key_ref(...) -> str
Resolve a model API credential as a full SPIKE Bearer token reference. Checks environment variables first, then falls back to function arguments.
from mcp_gateway_sdk import resolve_model_api_key_ref
# Resolution order:
# 1. MODEL_API_KEY_REF env var (explicit full Bearer token reference)
# 2. GROQ_LM_SPIKE_REF env var (converted to Bearer $SPIKE{...})
# 3. Function arguments (spike_ref or model_api_key_ref)
ref = resolve_model_api_key_ref(spike_ref="secrets/groq-api-key")
# => "Bearer $SPIKE{ref:secrets/groq-api-key,exp:3600}"
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model_api_key_ref` | `str` | `""` | Explicit full Bearer token reference. |
| `spike_ref` | `str` | `""` | SPIKE secret reference (converted via `build_spike_token_ref`). |
| `exp_seconds` | `int` | `3600` | Token expiry in seconds. |
| `env` | `dict \| None` | `None` | Environment dict to read from (defaults to `os.environ`). |
setup_observability(...) -> Tracer
Configure OpenTelemetry tracing and return a Tracer instance.
Requires the otel optional dependency group plus the full
OpenTelemetry SDK (opentelemetry-sdk,
opentelemetry-exporter-otlp-proto-grpc).
from mcp_gateway_sdk import setup_observability
tracer = setup_observability(
service_name="dspy-researcher",
service_version="0.1.0",
spiffe_id="spiffe://poc.local/agents/mcp-client/dspy-researcher/dev",
session_id="sess-abc123",
otel_endpoint="http://localhost:4317",
instrument_dspy=True, # also instrument DSPy via openinference
)
# Pass the tracer to GatewayClient
client = GatewayClient(url="http://localhost:9090",
spiffe_id="...", tracer=tracer)
| Parameter | Type | Default | Description |
|---|---|---|---|
| `service_name` | `str` | (required) | OTel resource `service.name`. |
| `service_version` | `str` | (required) | OTel resource `service.version`. |
| `spiffe_id` | `str` | (required) | SPIFFE ID stored as OTel resource attribute `spiffe.id`. |
| `session_id` | `str` | (required) | Session ID stored as OTel resource attribute `session.id`. |
| `otel_endpoint` | `str` | (required) | OTLP gRPC endpoint (e.g. `"http://localhost:4317"`). |
| `instrument_dspy` | `bool` | `False` | If `True`, also instruments DSPy via `openinference.instrumentation.dspy`. |
build_dspy_gateway_lm(...) -> dspy.LM
Build a DSPy LM object configured for gateway-mediated model
egress via the OpenAI-compatible endpoint.
from mcp_gateway_sdk import build_dspy_gateway_lm
lm = build_dspy_gateway_lm(
llm_model="groq/llama-3.3-70b-versatile",
gateway_url="http://localhost:9090",
model_provider="groq",
spike_ref="secrets/groq-api-key",
)
| Parameter | Type | Default | Description |
|---|---|---|---|
| `llm_model` | `str` | (required) | Model identifier (provider prefix is stripped via `normalize_model_name`). |
| `gateway_url` | `str` | (required) | Gateway base URL. |
| `model_gateway_base_url` | `str \| None` | `None` | Override for the model API base URL. Defaults to `{gateway_url}/openai/v1`. |
| `model_provider` | `str` | `"groq"` | Sent as the `X-Model-Provider` header. |
| `model_api_key_ref` | `str` | `""` | Explicit Bearer token reference for the API key. |
| `spike_ref` | `str` | `""` | SPIKE secret reference (resolved via `resolve_model_api_key_ref`). |
| `compatibility` | `str` | `"openai"` | Compatibility mode. Currently only `"openai"` is supported. |
configure_dspy_gateway_lms(...) -> tuple[LM, LM | None]
Configure DSPy with a primary LM and an optional reasoning LM (RLM).
Calls dspy.configure(lm=lm) automatically.
from mcp_gateway_sdk import configure_dspy_gateway_lms
lm, rlm = configure_dspy_gateway_lms(
llm_model="groq/llama-3.3-70b-versatile",
gateway_url="http://localhost:9090",
model_provider="groq",
spike_ref="secrets/groq-api-key",
# Optional reasoning model
rlm_model="groq/deepseek-r1-distill-llama-70b",
rlm_provider="groq",
rlm_spike_ref="secrets/groq-api-key",
)
# lm is the primary model (already configured via dspy.configure)
# rlm is the reasoning model (None if rlm_model not specified)
Accepts all parameters of build_dspy_gateway_lm() for both
the primary LM and the RLM (prefixed with rlm_). If
rlm_model is empty, the RLM is None.
Framework Recipes
The SDK is framework-agnostic. Below are integration patterns for popular agent frameworks and raw HTTP.
PydanticAI
Wrap GatewayClient.call() in a PydanticAI tool function:
from pydantic_ai import Agent, RunContext
from mcp_gateway_sdk import GatewayClient, GatewayError
client = GatewayClient(
url="http://localhost:9090",
spiffe_id="spiffe://poc.local/agents/mcp-client/pydantic-agent/dev",
)
agent = Agent("openai:gpt-4o-mini", system_prompt="You are a research assistant.")
@agent.tool
def search(ctx: RunContext, query: str) -> str:
"""Search the web for information."""
try:
result = client.call("tavily_search", query=query, max_results=3)
# Extract text content from MCP result
contents = result.get("content", [])
return "\n".join(c.get("text", "") for c in contents)
except GatewayError as e:
return f"Search failed: {e.code}: {e.message}"
DSPy
Use the build_dspy_gateway_lm() helper to route DSPy LM
calls through the gateway:
import dspy
from mcp_gateway_sdk import (
GatewayClient,
configure_dspy_gateway_lms,
load_dotenv,
setup_observability,
)
load_dotenv()
# Configure DSPy LM to use gateway model egress
lm, rlm = configure_dspy_gateway_lms(
llm_model="groq/llama-3.3-70b-versatile",
gateway_url="http://localhost:9090",
model_provider="groq",
spike_ref="secrets/groq-api-key",
)
# Create a gateway client for tool calls
client = GatewayClient(
url="http://localhost:9090",
spiffe_id="spiffe://poc.local/agents/mcp-client/dspy-researcher/dev",
)
# DSPy modules now route through the gateway for both
# model inference and tool invocation
class Researcher(dspy.Module):
def __init__(self):
self.generate = dspy.ChainOfThought("question -> answer")
def forward(self, question):
# Tool calls go through the gateway
search_result = client.call("tavily_search", query=question)
context = str(search_result)
return self.generate(question=f"{question}\nContext: {context}")
LangGraph
Use GatewayClient as a tool provider within LangGraph nodes:
from mcp_gateway_sdk import GatewayClient, GatewayError
client = GatewayClient(
url="http://localhost:9090",
spiffe_id="spiffe://poc.local/agents/mcp-client/langgraph-agent/dev",
)
def search_node(state: dict) -> dict:
"""LangGraph node that calls a gateway tool."""
query = state["query"]
try:
result = client.call("tavily_search", query=query, max_results=5)
return {"search_results": result, "error": None}
except GatewayError as e:
return {"search_results": None, "error": f"{e.code}: {e.message}"}
Raw httpx
For minimal integrations, you can call the gateway directly with
httpx. The SDK's GatewayClient wraps exactly
this pattern:
import httpx
payload = {
"jsonrpc": "2.0",
"method": "tools/call",
"params": {
"name": "tavily_search",
"arguments": {"query": "AI security"}
},
"id": 1,
}
headers = {
"Content-Type": "application/json",
"X-SPIFFE-ID": "spiffe://poc.local/agents/mcp-client/my-agent/dev",
"X-Session-ID": "session-uuid-here",
}
resp = httpx.post("http://localhost:9090", json=payload, headers=headers)
if resp.status_code >= 400:
error = resp.json()
print(f"Denied: {error['code']}: {error['message']}")
else:
result = resp.json()["result"]
print(result)
Wire Format
The gateway speaks JSON-RPC 2.0 over HTTP POST. All tool
calls go to the gateway base URL. Model calls go to the
/openai/v1/chat/completions endpoint.
Required headers
| Header | Value | Description |
|---|---|---|
| `Content-Type` | `application/json` | All requests must be JSON. |
| `X-SPIFFE-ID` | `spiffe://<trust-domain>/<path>` | Caller's SPIFFE identity. Required by the auth middleware (step 3). |
| `X-Session-ID` | UUID string | Session identifier for session-context tracking (step 8). |
JSON-RPC request
{
"jsonrpc": "2.0",
"method": "tools/call",
"params": {
"name": "tavily_search",
"arguments": {
"query": "AI security best practices",
"max_results": 5
}
},
"id": 1
}
JSON-RPC success response
{
"jsonrpc": "2.0",
"result": {
"content": [
{
"type": "text",
"text": "AI security involves..."
}
]
},
"id": 1
}
Gateway error response (HTTP 4xx/5xx)
{
"code": "authz_policy_denied",
"message": "OPA policy denied access to tool 'read' for identity ...",
"reason_code": "tool_not_permitted",
"middleware": "opa",
"middleware_step": 6,
"decision_id": "d-f4a21b3c",
"trace_id": "abc123def456",
"details": {},
"remediation": "Contact your policy administrator to grant access.",
"docs_url": "https://precinct.dev/pages/opa.html"
}
Model egress request headers
Model calls via call_model_chat() send additional headers:
| Header | Description |
|---|---|
| `X-Model-Provider` | Model provider name (e.g. `"groq"`). |
| `X-Residency-Intent` | Data residency intent (e.g. `"us"`). |
| `X-Budget-Profile` | Budget profile (e.g. `"standard"`). |
| `Authorization` | SPIKE token reference (e.g. `Bearer $SPIKE{ref:...,exp:3600}`). |
Retry Behavior
The SDK retries only on HTTP 503 (Service Unavailable) responses. All other error status codes (400, 401, 403, 429, 502, etc.) are raised immediately without retry.
Backoff formula
delay = backoff_base * 2^attempt
With defaults (backoff_base=1.0, max_retries=3):
Attempt 0: immediate (first try)
Attempt 1: 1.0s delay (1.0 * 2^0)
Attempt 2: 2.0s delay (1.0 * 2^1)
Attempt 3: 4.0s delay (1.0 * 2^2)
Total max wait: 7.0s across 4 attempts
After all retries are exhausted, the last GatewayError (with
http_status=503) is raised.
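The schedule follows directly from the formula; this small helper (hypothetical, not part of the SDK) reproduces the delay tables above:

```python
def backoff_delays(backoff_base: float, max_retries: int) -> list[float]:
    # Delay slept before retry attempt n, for n = 1..max_retries;
    # attempt 0 (the first try) is immediate.
    return [backoff_base * 2 ** (n - 1) for n in range(1, max_retries + 1)]

print(backoff_delays(1.0, 3))  # [1.0, 2.0, 4.0] -> 7.0s total across 4 attempts
print(backoff_delays(0.5, 5))  # [0.5, 1.0, 2.0, 4.0, 8.0]
```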
The following gateway error codes produce HTTP 503 and will be retried:
dlp_unavailable_fail_closed,
stepup_unavailable_fail_closed,
deepscan_unavailable_fail_closed,
circuit_open,
extension_unavailable_fail_closed.
Customizing retry behavior
# Aggressive retries for high-availability scenarios
client = GatewayClient(
url="http://localhost:9090",
spiffe_id="spiffe://poc.local/agents/ha-agent/dev",
max_retries=5,
backoff_base=0.5, # start at 0.5s, then 1s, 2s, 4s, 8s
timeout=60.0, # longer timeout per request
)
# No retries (fail immediately on 503)
client = GatewayClient(
url="http://localhost:9090",
spiffe_id="spiffe://poc.local/agents/fast-fail/dev",
max_retries=0,
)
OpenTelemetry
When you pass a tracer to GatewayClient,
every call() invocation creates an OTel span with structured
attributes for distributed trace correlation.
Span details
| Property | Value |
|---|---|
| Span name | gateway.tool_call.<tool_name> |
| `mcp.method` | `"tools/call"` |
| `mcp.tool.name` | The tool name (e.g. `"tavily_search"`) |
| `mcp.tool.arguments` | JSON-serialized arguments |
| `spiffe.id` | Caller's SPIFFE identity |
| `session.id` | Session identifier |
| `mcp.result.success` | `True` or `False` |
| `mcp.error.code` | Error code (on failure only) |
| `mcp.error.http_status` | HTTP status (on failure only) |
Full setup example
from mcp_gateway_sdk import GatewayClient, setup_observability
# Set up OTel tracing
tracer = setup_observability(
service_name="my-agent",
service_version="1.0.0",
spiffe_id="spiffe://poc.local/agents/my-agent/dev",
session_id="sess-123",
otel_endpoint="http://localhost:4317",
)
# All call() invocations now produce OTel spans
with GatewayClient(
url="http://localhost:9090",
spiffe_id="spiffe://poc.local/agents/my-agent/dev",
tracer=tracer,
) as client:
result = client.call("tavily_search", query="test")
OTel span creation is currently implemented only in the
call() method. The call_rpc() and
call_model_chat() methods do not create spans.
Logging
The SDK uses Python's standard logging module with the
logger name "mcp_gateway_sdk".
Log levels
| Level | Event | Description |
|---|---|---|
| `WARNING` | 503 retry | Logged on each retry attempt. Includes RPC name, attempt number, backoff delay, and error code. |
| `ERROR` | 503 exhausted | Logged when all retries are exhausted. Includes RPC name, total attempts, and error code. |
Enabling SDK logs
import logging
# Show all SDK log messages
logging.getLogger("mcp_gateway_sdk").setLevel(logging.DEBUG)
# Or configure via basicConfig
logging.basicConfig(level=logging.WARNING)
# Example output on 503 retry:
# WARNING:mcp_gateway_sdk:RPC tavily_search returned 503 (attempt 1/4).
# Retrying in 1.0s. Code: circuit_open