Python SDK Reference
Complete API reference for `mcp-gateway-sdk` (v0.1.0), the PRECINCT Python SDK for integrating agents with the security gateway. It is built on `httpx`, requires Python >=3.10, and works with any agent framework or raw HTTP.
Start with the Integration Guide to register your agent's SPIFFE identity and configure gateway policies before using the SDK. The SDKs overview page covers both the Python and Go SDKs side by side.
Overview
mcp-gateway-sdk provides a single GatewayClient
class that handles all communication with the PRECINCT gateway. It
constructs MCP JSON-RPC envelopes, injects SPIFFE identity headers,
maps gateway error responses to structured Python exceptions, and retries
on transient 503 failures with exponential backoff.
The SDK is framework-independent and works with:
- PydanticAI, via tool wrapper functions
- DSPy, via the `build_dspy_gateway_lm()` runtime helper
- LangGraph / LangChain, via tool functions or raw calls
- Raw httpx, for custom or minimal integrations
Public API exports (from mcp_gateway_sdk):
from mcp_gateway_sdk import (
GatewayClient,
GatewayError,
build_dspy_gateway_lm,
build_spike_token_ref,
configure_dspy_gateway_lms,
load_dotenv,
normalize_model_name,
resolve_model_api_key_ref,
setup_observability,
)
Installation
Core install
# With pip (from local checkout)
cd POC/sdk/python
pip install -e .
# With uv
uv pip install -e POC/sdk/python
Optional dependency groups
The SDK defines three optional dependency groups in
pyproject.toml. Install them with bracket syntax:
| Group | Install command | Packages | Purpose |
|---|---|---|---|
| `env` | `pip install -e ".[env]"` | `python-dotenv>=1.0.1` | Enables the `load_dotenv()` helper |
| `otel` | `pip install -e ".[otel]"` | `opentelemetry-api>=1.39.0` | Enables `setup_observability()` and tracer wiring |
| `dev` | `pip install -e ".[dev]"` | `pytest>=9.0.0`, `httpx>=0.28.0` | Development and testing |
# Install all optional groups at once
pip install -e ".[env,otel,dev]"
The SDK is part of the PRECINCT proof-of-concept and is versioned alongside the main repository (currently v0.1.0). For production use, pin to a specific commit or tag.
Quick Start
The GatewayClient supports both context-manager and
manual lifecycle patterns. The context manager ensures the underlying
httpx.Client is properly closed:
from mcp_gateway_sdk import GatewayClient, GatewayError
with GatewayClient(
url="http://localhost:9090",
spiffe_id="spiffe://poc.local/agents/mcp-client/dspy-researcher/dev",
) as client:
try:
result = client.call("tavily_search", query="AI security", max_results=5)
print(result) # raw MCP JSON-RPC result dict
except GatewayError as e:
print(f"Denied: {e.code} - {e.message}")
print(f" Middleware: {e.middleware} (step {e.step})")
print(f" Remediation: {e.remediation}")
Manual lifecycle (equivalent, but you must call close()
yourself):
client = GatewayClient(
url="http://localhost:9090",
spiffe_id="spiffe://poc.local/agents/mcp-client/dspy-researcher/dev",
)
try:
result = client.call("tavily_search", query="AI security")
print(result)
except GatewayError as e:
print(f"Denied: {e.code}: {e.remediation}")
finally:
client.close()
GatewayClient API Reference
mcp_gateway_sdk.GatewayClient is the primary class for
interacting with the gateway. It is a synchronous, thread-safe client
built on httpx.Client.
Constructor
GatewayClient(
url: str,
spiffe_id: str,
*,
session_id: str | None = None,
tracer: Any = None,
timeout: float = 30.0,
max_retries: int = 3,
backoff_base: float = 1.0,
)
| Parameter | Type | Default | Description |
|---|---|---|---|
| `url` | `str` | (required) | Gateway base URL, e.g. `"http://localhost:9090"`. |
| `spiffe_id` | `str` | (required) | SPIFFE identity string sent as the `X-SPIFFE-ID` header on every request. |
| `session_id` | `str \| None` | `None` | Session identifier for `X-Session-ID`. Auto-generated UUID if omitted. |
| `tracer` | `Any` | `None` | Optional OpenTelemetry `Tracer` for span creation around tool calls. |
| `timeout` | `float` | `30.0` | HTTP request timeout in seconds. Passed to the underlying `httpx.Client`. |
| `max_retries` | `int` | `3` | Maximum retry attempts for 503 responses. Total attempts = `max_retries + 1`. |
| `backoff_base` | `float` | `1.0` | Base value for exponential backoff (seconds). Delay = `backoff_base * 2^attempt`. |
Methods
call(tool_name, **params) -> Any
Call an MCP tool through the gateway using the tools/call
JSON-RPC method. This is the primary method for tool invocation.
# Call a search tool
result = client.call("tavily_search", query="AI security", max_results=5)
# Call a file-read tool
content = client.call("read", path="/etc/hostname")
- Args: `tool_name` (str), the MCP tool name; `**params`, keyword arguments passed as `params.arguments` in the JSON-RPC envelope.
- Returns: the `result` field from the JSON-RPC response (dict or value).
- Raises: `GatewayError` on 4xx/5xx responses or JSON-RPC errors; `httpx.ConnectError` if the gateway is unreachable.
When a tracer is configured, call() creates an
OTel span named gateway.tool_call.<tool_name> with
attributes for mcp.method, mcp.tool.name,
spiffe.id, and session.id.
call_rpc(method, params=None) -> Any
Call a raw MCP JSON-RPC method through the gateway. Use this for
protocol-level methods like tools/list or
resources/read. For tool invocations, prefer
call().
# List available tools
tools = client.call_rpc("tools/list")
# Read a resource
resource = client.call_rpc("resources/read", {"uri": "file:///data/config.yaml"})
call_model_chat(...) -> Any
Call the gateway's OpenAI-compatible model egress endpoint. This keeps model calls behind the gateway's model-plane controls (DLP, rate limiting, deep scan) while providing a simple SDK interface.
response = client.call_model_chat(
model="llama-3.3-70b-versatile",
messages=[{"role": "user", "content": "Summarize this document."}],
provider="groq",
api_key_ref="Bearer $SPIKE{ref:secrets/groq-api-key,exp:3600}",
)
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model` | `str` | (required) | Model identifier (e.g. `"llama-3.3-70b-versatile"`). |
| `messages` | `list[dict]` | (required) | OpenAI-format messages array. |
| `provider` | `str` | `"groq"` | Model provider, sent as the `X-Model-Provider` header. |
| `api_key_ref` | `str \| None` | `None` | API key reference (typically a SPIKE token ref). Sent as the `Authorization` header. |
| `api_key_header` | `str` | `"Authorization"` | Header name for the API key. |
| `endpoint` | `str` | `"/openai/v1/chat/completions"` | Gateway model egress path. |
| `residency_intent` | `str` | `"us"` | Data residency intent, sent as `X-Residency-Intent`. |
| `budget_profile` | `str` | `"standard"` | Budget profile, sent as `X-Budget-Profile`. |
| `extra_headers` | `dict \| None` | `None` | Additional headers merged into the request. |
| `**extra_payload` | `Any` | -- | Additional keys merged into the JSON request body (e.g. `temperature`, `max_tokens`). |
close() -> None
Close the underlying httpx.Client. Called automatically when
using the context manager (with statement).
Property: session_id
The session identifier sent as X-Session-ID on every request.
Set at construction time (auto-generated UUID if not provided). Read-only
after construction. Access via client.session_id.
GatewayError
mcp_gateway_sdk.GatewayError is the structured exception type
raised for all gateway denials. It mirrors the unified JSON error envelope
defined by the gateway (Go struct middleware.GatewayError).
Attributes
| Attribute | Type | Default | Description |
|---|---|---|---|
| `code` | `str` | `""` | Machine-readable error code (e.g. `"authz_policy_denied"`). |
| `message` | `str` | `""` | Human-readable description of the denial. |
| `reason_code` | `str` | `""` | Stable reason identifier for policy or UI handling. |
| `middleware` | `str` | `""` | Which middleware layer rejected the request (e.g. `"opa"`, `"dlp"`). |
| `step` | `int` | `0` | Middleware step number in the chain (maps to `middleware_step` in the JSON envelope). |
| `decision_id` | `str` | `""` | Audit decision ID for cross-referencing with gateway logs. |
| `trace_id` | `str` | `""` | OpenTelemetry trace ID for distributed tracing correlation. |
| `details` | `dict[str, Any]` | `{}` | Optional structured details (risk scores, matched patterns, etc.). |
| `remediation` | `str` | `""` | Optional remediation guidance for the caller. |
| `docs_url` | `str` | `""` | Optional link to documentation for the error. |
| `http_status` | `int` | `0` | HTTP status code from the gateway response. |
Class method: from_response(http_status, body)
Parse a GatewayError from an HTTP response body dict. This
is called internally by GatewayClient when the gateway returns
a 4xx or 5xx status.
# Internal usage (shown for reference):
error = GatewayError.from_response(
http_status=403,
body={
"code": "authz_policy_denied",
"message": "OPA policy denied access to tool 'read'",
"reason_code": "tool_not_permitted",
"middleware": "opa",
"middleware_step": 6,
"decision_id": "d-abc123",
"trace_id": "t-xyz789",
"remediation": "Request access via your team's OPA policy admin.",
"docs_url": "https://precinct.dev/pages/opa.html",
},
)
Error handling patterns
from mcp_gateway_sdk import GatewayClient, GatewayError
with GatewayClient(url="http://localhost:9090",
spiffe_id="spiffe://poc.local/agents/my-agent/dev") as client:
try:
result = client.call("sensitive_tool", data="payload")
except GatewayError as e:
# Branch on the machine-readable error code
if e.code == "authz_policy_denied":
print(f"Policy denied (decision: {e.decision_id})")
elif e.code == "dlp_credentials_detected":
print(f"DLP blocked credentials in request")
elif e.code == "ratelimit_exceeded":
print(f"Rate limited (HTTP {e.http_status})")
elif e.code == "circuit_open":
print(f"Circuit breaker open for backend")
else:
print(f"Gateway error: {e.code}: {e.message}")
# All errors carry the trace ID for correlation
if e.trace_id:
print(f" Trace: {e.trace_id}")
if e.remediation:
print(f" Fix: {e.remediation}")
Error Code Catalog
The gateway defines a fixed set of machine-readable error codes. Each code
identifies the middleware layer that rejected the request and maps to a
specific HTTP status. The GatewayError.code attribute will be
one of these values.
| Error Code | HTTP | Step | Middleware | Description |
|---|---|---|---|---|
| `request_too_large` | 413 | 1 | Request Size | Request body exceeds the configured size limit. |
| `auth_missing_identity` | 401 | 3 | SPIFFE Auth | No `X-SPIFFE-ID` header present. |
| `auth_invalid_identity` | 401 | 3 | SPIFFE Auth | The `X-SPIFFE-ID` value is malformed or unrecognized. |
| `registry_tool_unknown` | 403 | 5 | Tool Registry | The requested tool is not in the gateway's tool registry. |
| `registry_hash_mismatch` | 403 | 5 | Tool Registry | Tool definition hash does not match the registered hash (rug-pull detection). |
| `authz_policy_denied` | 403 | 6 | OPA Policy | OPA policy evaluation denied the request. |
| `authz_no_matching_grant` | 403 | 6 | OPA Policy | No policy grant matches this identity/tool combination. |
| `authz_tool_not_found` | 403 | 6 | OPA Policy | Tool not found during OPA evaluation (distinct from the registry check). |
| `dlp_credentials_detected` | 403 | 7 | DLP | DLP scanner detected credentials (API keys, tokens) in the request. |
| `dlp_injection_blocked` | 403 | 7 | DLP | DLP scanner blocked a prompt injection attempt (policy = block). |
| `dlp_pii_blocked` | 403 | 7 | DLP | DLP scanner blocked PII (policy = block). |
| `dlp_unavailable_fail_closed` | 503 | 7 | DLP | DLP scanner is unavailable and fail-closed policy is active. |
| `exfiltration_detected` | 403 | 8 | Session Context | Data exfiltration pattern detected across session context. |
| `stepup_denied` | 403 | 9 | Step-Up Gating | Step-up verification denied the request. |
| `stepup_approval_required` | 403 | 9 | Step-Up Gating | Human approval is required before this tool call can proceed. |
| `stepup_guard_blocked` | 403 | 9 | Step-Up Gating | LLM guard model blocked the request during step-up evaluation. |
| `stepup_destination_blocked` | 403 | 9 | Step-Up Gating | Request destination is blocked by step-up policy. |
| `stepup_unavailable_fail_closed` | 503 | 9 | Step-Up Gating | Step-up service unavailable and fail-closed policy is active. |
| `deepscan_blocked` | 403 | 10 | Deep Scan | LLM deep content scan blocked the request. |
| `deepscan_unavailable_fail_closed` | 503 | 10 | Deep Scan | Deep scan service unavailable and fail-closed policy is active. |
| `ratelimit_exceeded` | 429 | 11 | Rate Limiting | Request rate limit exceeded for this identity. |
| `circuit_open` | 503 | 12 | Circuit Breaker | Circuit breaker is open due to repeated backend failures. |
| `extension_blocked` | 403 | -- | Extension Slot | A registered extension blocked the request. |
| `extension_unavailable_fail_closed` | 503 | -- | Extension Slot | Extension service unavailable and fail-closed policy is active. |
| `mcp_invalid_request` | 400 | -- | MCP Validation | The MCP JSON-RPC request is malformed or invalid. |
| `mcp_transport_failed` | 502 | -- | MCP Transport | Transport-level failure connecting to the MCP tool server. |
| `mcp_request_failed` | 502 | -- | MCP Transport | MCP server returned a JSON-RPC error. |
| `mcp_invalid_response` | 502 | -- | MCP Transport | Malformed response received from the MCP tool server. |
| `contract_validation_failed` | 400 | -- | Contract | Contract validation failed at the plane entry point. |
In addition to the gateway-defined codes above, the SDK itself may set
code to "unknown" (for unparseable responses),
"invalid_response" (for non-JSON bodies), or
"jsonrpc_error" (for JSON-RPC error objects that lack a
gateway error envelope).
Runtime Helpers
The mcp_gateway_sdk.runtime module provides utility functions
that centralize setup code commonly duplicated across agent implementations.
All are importable from the top-level package.
load_dotenv(path=None, *, override=False) -> bool
Load environment variables from a .env file. Requires the
env optional dependency group (python-dotenv).
from mcp_gateway_sdk import load_dotenv
# Load .env from current directory
loaded = load_dotenv()
# Load from a specific path, overriding existing vars
loaded = load_dotenv("/path/to/.env", override=True)
if not loaded:
print("python-dotenv not installed, skipping .env")
- Returns: `True` if loading was attempted; `False` if `python-dotenv` is not installed.
normalize_model_name(raw_model) -> str
Normalize a model identifier to a provider-agnostic model name by stripping provider prefixes.
from mcp_gateway_sdk import normalize_model_name
normalize_model_name("groq/llama-3.3-70b-versatile")
# => "llama-3.3-70b-versatile"
normalize_model_name("openai:gpt-4o-mini")
# => "gpt-4o-mini"
normalize_model_name("gpt-4o")
# => "gpt-4o"
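The stripping behavior shown above amounts to removing a leading `provider/` or `provider:` prefix. A rough stdlib equivalent (an illustrative reimplementation, not the SDK's source):

```python
def normalize_model_name_sketch(raw_model: str) -> str:
    # Strip a leading "provider/" or "provider:" prefix, if present
    # (illustrative rendering of the documented behavior).
    for sep in ("/", ":"):
        _prefix, found, rest = raw_model.partition(sep)
        if found:
            return rest
    return raw_model

print(normalize_model_name_sketch("groq/llama-3.3-70b-versatile"))  # llama-3.3-70b-versatile
print(normalize_model_name_sketch("gpt-4o"))                        # gpt-4o
```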
build_spike_token_ref(spike_ref, *, exp_seconds=3600) -> str
Build a Bearer SPIKE token reference string for gateway model or tool egress. SPIKE tokens are resolved by the gateway at request time.
from mcp_gateway_sdk import build_spike_token_ref
ref = build_spike_token_ref("secrets/groq-api-key")
# => "Bearer $SPIKE{ref:secrets/groq-api-key,exp:3600}"
ref = build_spike_token_ref("secrets/openai-key", exp_seconds=7200)
# => "Bearer $SPIKE{ref:secrets/openai-key,exp:7200}"
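The output is a fixed template, which the following one-liner reproduces (an illustrative reimplementation, not the SDK's source):

```python
def build_spike_token_ref_sketch(spike_ref: str, *, exp_seconds: int = 3600) -> str:
    # Reproduces the documented "Bearer $SPIKE{ref:...,exp:...}" template.
    return f"Bearer $SPIKE{{ref:{spike_ref},exp:{exp_seconds}}}"

print(build_spike_token_ref_sketch("secrets/groq-api-key"))
# Bearer $SPIKE{ref:secrets/groq-api-key,exp:3600}
```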
resolve_model_api_key_ref(...) -> str
Resolve a model API credential as a full SPIKE Bearer token reference. Checks environment variables first, then falls back to function arguments.
from mcp_gateway_sdk import resolve_model_api_key_ref
# Resolution order:
# 1. MODEL_API_KEY_REF env var (explicit full Bearer token reference)
# 2. GROQ_LM_SPIKE_REF env var (converted to Bearer $SPIKE{...})
# 3. Function arguments (spike_ref or model_api_key_ref)
ref = resolve_model_api_key_ref(spike_ref="secrets/groq-api-key")
# => "Bearer $SPIKE{ref:secrets/groq-api-key,exp:3600}"
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model_api_key_ref` | `str` | `""` | Explicit full Bearer token reference. |
| `spike_ref` | `str` | `""` | SPIKE secret reference (converted via `build_spike_token_ref`). |
| `exp_seconds` | `int` | `3600` | Token expiry in seconds. |
| `env` | `dict \| None` | `None` | Environment dict to read from (defaults to `os.environ`). |
setup_observability(...) -> Tracer
Configure OpenTelemetry tracing and return a Tracer instance.
Requires the otel optional dependency group plus the full
OpenTelemetry SDK (opentelemetry-sdk,
opentelemetry-exporter-otlp-proto-grpc).
from mcp_gateway_sdk import setup_observability
tracer = setup_observability(
service_name="dspy-researcher",
service_version="0.1.0",
spiffe_id="spiffe://poc.local/agents/mcp-client/dspy-researcher/dev",
session_id="sess-abc123",
otel_endpoint="http://localhost:4317",
instrument_dspy=True, # also instrument DSPy via openinference
)
# Pass the tracer to GatewayClient
client = GatewayClient(url="http://localhost:9090",
spiffe_id="...", tracer=tracer)
| Parameter | Type | Default | Description |
|---|---|---|---|
| `service_name` | `str` | (required) | OTel resource `service.name`. |
| `service_version` | `str` | (required) | OTel resource `service.version`. |
| `spiffe_id` | `str` | (required) | SPIFFE ID stored as OTel resource attribute `spiffe.id`. |
| `session_id` | `str` | (required) | Session ID stored as OTel resource attribute `session.id`. |
| `otel_endpoint` | `str` | (required) | OTLP gRPC endpoint (e.g. `"http://localhost:4317"`). |
| `instrument_dspy` | `bool` | `False` | If `True`, also instruments DSPy via `openinference.instrumentation.dspy`. |
build_dspy_gateway_lm(...) -> dspy.LM
Build a DSPy LM object configured for gateway-mediated model
egress via the OpenAI-compatible endpoint.
from mcp_gateway_sdk import build_dspy_gateway_lm
lm = build_dspy_gateway_lm(
llm_model="groq/llama-3.3-70b-versatile",
gateway_url="http://localhost:9090",
model_provider="groq",
spike_ref="secrets/groq-api-key",
)
| Parameter | Type | Default | Description |
|---|---|---|---|
| `llm_model` | `str` | (required) | Model identifier (provider prefix is stripped via `normalize_model_name`). |
| `gateway_url` | `str` | (required) | Gateway base URL. |
| `model_gateway_base_url` | `str \| None` | `None` | Override for the model API base URL. Defaults to `{gateway_url}/openai/v1`. |
| `model_provider` | `str` | `"groq"` | Sent as the `X-Model-Provider` header. |
| `model_api_key_ref` | `str` | `""` | Explicit Bearer token reference for the API key. |
| `spike_ref` | `str` | `""` | SPIKE secret reference (resolved via `resolve_model_api_key_ref`). |
| `compatibility` | `str` | `"openai"` | Compatibility mode. Currently only `"openai"` is supported. |
configure_dspy_gateway_lms(...) -> tuple[LM, LM | None]
Configure DSPy with a primary LM and an optional reasoning LM (RLM).
Calls dspy.configure(lm=lm) automatically.
from mcp_gateway_sdk import configure_dspy_gateway_lms
lm, rlm = configure_dspy_gateway_lms(
llm_model="groq/llama-3.3-70b-versatile",
gateway_url="http://localhost:9090",
model_provider="groq",
spike_ref="secrets/groq-api-key",
# Optional reasoning model
rlm_model="groq/deepseek-r1-distill-llama-70b",
rlm_provider="groq",
rlm_spike_ref="secrets/groq-api-key",
)
# lm is the primary model (already configured via dspy.configure)
# rlm is the reasoning model (None if rlm_model not specified)
Accepts all parameters of build_dspy_gateway_lm() for both
the primary LM and the RLM (prefixed with rlm_). If
rlm_model is empty, the RLM is None.
Framework Recipes
The SDK is framework-agnostic. Below are integration patterns for popular agent frameworks and raw HTTP.
PydanticAI
Wrap GatewayClient.call() in a PydanticAI tool function:
from pydantic_ai import Agent, RunContext
from mcp_gateway_sdk import GatewayClient, GatewayError
client = GatewayClient(
url="http://localhost:9090",
spiffe_id="spiffe://poc.local/agents/mcp-client/pydantic-agent/dev",
)
agent = Agent("openai:gpt-4o-mini", system_prompt="You are a research assistant.")
@agent.tool
def search(ctx: RunContext, query: str) -> str:
"""Search the web for information."""
try:
result = client.call("tavily_search", query=query, max_results=3)
# Extract text content from MCP result
contents = result.get("content", [])
return "\n".join(c.get("text", "") for c in contents)
except GatewayError as e:
return f"Search failed: {e.code}: {e.message}"
DSPy
Use the build_dspy_gateway_lm() helper to route DSPy LM
calls through the gateway:
import dspy
from mcp_gateway_sdk import (
GatewayClient,
configure_dspy_gateway_lms,
load_dotenv,
setup_observability,
)
load_dotenv()
# Configure DSPy LM to use gateway model egress
lm, rlm = configure_dspy_gateway_lms(
llm_model="groq/llama-3.3-70b-versatile",
gateway_url="http://localhost:9090",
model_provider="groq",
spike_ref="secrets/groq-api-key",
)
# Create a gateway client for tool calls
client = GatewayClient(
url="http://localhost:9090",
spiffe_id="spiffe://poc.local/agents/mcp-client/dspy-researcher/dev",
)
# DSPy modules now route through the gateway for both
# model inference and tool invocation
class Researcher(dspy.Module):
def __init__(self):
self.generate = dspy.ChainOfThought("question -> answer")
def forward(self, question):
# Tool calls go through the gateway
search_result = client.call("tavily_search", query=question)
context = str(search_result)
return self.generate(question=f"{question}\nContext: {context}")
LangGraph
Use GatewayClient as a tool provider within LangGraph nodes:
from mcp_gateway_sdk import GatewayClient, GatewayError
client = GatewayClient(
url="http://localhost:9090",
spiffe_id="spiffe://poc.local/agents/mcp-client/langgraph-agent/dev",
)
def search_node(state: dict) -> dict:
"""LangGraph node that calls a gateway tool."""
query = state["query"]
try:
result = client.call("tavily_search", query=query, max_results=5)
return {"search_results": result, "error": None}
except GatewayError as e:
return {"search_results": None, "error": f"{e.code}: {e.message}"}
Raw httpx
For minimal integrations, you can call the gateway directly with
httpx. The SDK's GatewayClient wraps exactly
this pattern:
import httpx
payload = {
"jsonrpc": "2.0",
"method": "tools/call",
"params": {
"name": "tavily_search",
"arguments": {"query": "AI security"}
},
"id": 1,
}
headers = {
"Content-Type": "application/json",
"X-SPIFFE-ID": "spiffe://poc.local/agents/mcp-client/my-agent/dev",
"X-Session-ID": "session-uuid-here",
}
resp = httpx.post("http://localhost:9090", json=payload, headers=headers)
if resp.status_code >= 400:
error = resp.json()
print(f"Denied: {error['code']}: {error['message']}")
else:
result = resp.json()["result"]
print(result)
Wire Format
The gateway speaks JSON-RPC 2.0 over HTTP POST. All tool
calls go to the gateway base URL. Model calls go to the
/openai/v1/chat/completions endpoint.
Required headers
| Header | Value | Description |
|---|---|---|
| `Content-Type` | `application/json` | All requests must be JSON. |
| `X-SPIFFE-ID` | `spiffe://<trust-domain>/<path>` | Caller's SPIFFE identity. Required by the auth middleware (step 3). |
| `X-Session-ID` | UUID string | Session identifier for session-context tracking (step 8). |
JSON-RPC request
{
"jsonrpc": "2.0",
"method": "tools/call",
"params": {
"name": "tavily_search",
"arguments": {
"query": "AI security best practices",
"max_results": 5
}
},
"id": 1
}
JSON-RPC success response
{
"jsonrpc": "2.0",
"result": {
"content": [
{
"type": "text",
"text": "AI security involves..."
}
]
},
"id": 1
}
Gateway error response (HTTP 4xx/5xx)
{
"code": "authz_policy_denied",
"message": "OPA policy denied access to tool 'read' for identity ...",
"reason_code": "tool_not_permitted",
"middleware": "opa",
"middleware_step": 6,
"decision_id": "d-f4a21b3c",
"trace_id": "abc123def456",
"details": {},
"remediation": "Contact your policy administrator to grant access.",
"docs_url": "https://precinct.dev/pages/opa.html"
}
Model egress request headers
Model calls via call_model_chat() send additional headers:
| Header | Description |
|---|---|
| `X-Model-Provider` | Model provider name (e.g. `"groq"`). |
| `X-Residency-Intent` | Data residency intent (e.g. `"us"`). |
| `X-Budget-Profile` | Budget profile (e.g. `"standard"`). |
| `Authorization` | SPIKE token reference (e.g. `Bearer $SPIKE{ref:...,exp:3600}`). |
Retry Behavior
The SDK retries only on HTTP 503 (Service Unavailable) responses. All other error status codes (400, 401, 403, 429, 502, etc.) are raised immediately without retry.
Backoff formula
delay = backoff_base * 2^attempt
With defaults (backoff_base=1.0, max_retries=3):
Attempt 0: immediate (first try)
Attempt 1: 1.0s delay (1.0 * 2^0)
Attempt 2: 2.0s delay (1.0 * 2^1)
Attempt 3: 4.0s delay (1.0 * 2^2)
Total max wait: 7.0s across 4 attempts
After all retries are exhausted, the last GatewayError (with
http_status=503) is raised.
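The schedule follows directly from the formula; this small helper (hypothetical, not part of the SDK) reproduces the delay tables above:

```python
def backoff_delays(backoff_base: float, max_retries: int) -> list[float]:
    # Delay slept before retry attempt n, for n = 1..max_retries;
    # attempt 0 (the first try) is immediate.
    return [backoff_base * 2 ** (n - 1) for n in range(1, max_retries + 1)]

print(backoff_delays(1.0, 3))  # [1.0, 2.0, 4.0] -> 7.0s total across 4 attempts
print(backoff_delays(0.5, 5))  # [0.5, 1.0, 2.0, 4.0, 8.0]
```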
The following gateway error codes produce HTTP 503 and will be retried:
dlp_unavailable_fail_closed,
stepup_unavailable_fail_closed,
deepscan_unavailable_fail_closed,
circuit_open,
extension_unavailable_fail_closed.
Customizing retry behavior
# Aggressive retries for high-availability scenarios
client = GatewayClient(
url="http://localhost:9090",
spiffe_id="spiffe://poc.local/agents/ha-agent/dev",
max_retries=5,
backoff_base=0.5, # start at 0.5s, then 1s, 2s, 4s, 8s
timeout=60.0, # longer timeout per request
)
# No retries (fail immediately on 503)
client = GatewayClient(
url="http://localhost:9090",
spiffe_id="spiffe://poc.local/agents/fast-fail/dev",
max_retries=0,
)
OpenTelemetry
When you pass a tracer to GatewayClient,
every call() invocation creates an OTel span with structured
attributes for distributed trace correlation.
Span details
| Property | Value |
|---|---|
| Span name | gateway.tool_call.<tool_name> |
| `mcp.method` | `"tools/call"` |
| `mcp.tool.name` | The tool name (e.g. `"tavily_search"`) |
| `mcp.tool.arguments` | JSON-serialized arguments |
| `spiffe.id` | Caller's SPIFFE identity |
| `session.id` | Session identifier |
| `mcp.result.success` | `True` or `False` |
| `mcp.error.code` | Error code (on failure only) |
| `mcp.error.http_status` | HTTP status (on failure only) |
Full setup example
from mcp_gateway_sdk import GatewayClient, setup_observability
# Set up OTel tracing
tracer = setup_observability(
service_name="my-agent",
service_version="1.0.0",
spiffe_id="spiffe://poc.local/agents/my-agent/dev",
session_id="sess-123",
otel_endpoint="http://localhost:4317",
)
# All call() invocations now produce OTel spans
with GatewayClient(
url="http://localhost:9090",
spiffe_id="spiffe://poc.local/agents/my-agent/dev",
tracer=tracer,
) as client:
result = client.call("tavily_search", query="test")
OTel span creation is currently implemented only in the
call() method. The call_rpc() and
call_model_chat() methods do not create spans.
Logging
The SDK uses Python's standard logging module with the
logger name "mcp_gateway_sdk".
Log levels
| Level | Event | Description |
|---|---|---|
| `WARNING` | 503 retry | Logged on each retry attempt. Includes RPC name, attempt number, backoff delay, and error code. |
| `ERROR` | 503 exhausted | Logged when all retries are exhausted. Includes RPC name, total attempts, and error code. |
Enabling SDK logs
import logging
# Show all SDK log messages
logging.getLogger("mcp_gateway_sdk").setLevel(logging.DEBUG)
# Or configure via basicConfig
logging.basicConfig(level=logging.WARNING)
# Example output on 503 retry:
# WARNING:mcp_gateway_sdk:RPC tavily_search returned 503 (attempt 1/4).
# Retrying in 1.0s. Code: circuit_open