Use sampling with your AgentCore gateway
Sampling is an MCP feature that allows an MCP server to request an LLM completion from the client during a tool call. This enables servers to leverage AI capabilities without needing direct access to a language model — the client handles the model invocation and returns the result. AgentCore Gateway forwards sampling requests from MCP server targets to your clients, replacing the request id with a gateway-generated identifier.
Prerequisites
To use sampling with your gateway:
- Sessions enabled: Sampling requires session support. See Use MCP sessions with your gateway.
- Response streaming enabled: Sampling requests are sent as SSE chunks during an open connection. Set `streamingConfiguration.enableResponseStreaming` to `true` in your gateway’s `protocolConfiguration.mcp`.
- MCP server target type: Sampling requests originate from MCP server targets.
- Client declares sampling capability: The client must declare support for sampling during the `initialize` request. The gateway only forwards sampling requests to clients that declared this capability.
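The prerequisites can be expressed as a small pre-flight check. This is a minimal sketch over plain dictionaries. The nesting of `streamingConfiguration.enableResponseStreaming` under `protocolConfiguration.mcp` comes from the list above; the helper name `unmet_sampling_prerequisites`, the `"mcpServer"` target type string, and the dictionary shapes are illustrative assumptions, not the gateway's API. Session enablement is not checked here because its configuration is described separately.

```python
def unmet_sampling_prerequisites(protocol_configuration, client_capabilities, target_type):
    """Return a list of unmet prerequisites; an empty list means sampling can be used.

    All three arguments are hypothetical plain-Python views of the gateway
    configuration, the client's declared capabilities, and the target type.
    """
    problems = []

    # Response streaming must be enabled under protocolConfiguration.mcp.
    mcp = protocol_configuration.get("mcp", {})
    streaming = mcp.get("streamingConfiguration", {})
    if streaming.get("enableResponseStreaming") is not True:
        problems.append("response streaming is not enabled")

    # The client must have declared the sampling capability during initialize.
    if "sampling" not in client_capabilities:
        problems.append("client did not declare sampling capability")

    # Sampling requests only originate from MCP server targets.
    # "mcpServer" is an assumed label for that target type.
    if target_type != "mcpServer":
        problems.append("target is not an MCP server")

    return problems


protocol_configuration = {
    "mcp": {"streamingConfiguration": {"enableResponseStreaming": True}}
}
client_capabilities = {"sampling": {}}

print(unmet_sampling_prerequisites(protocol_configuration, client_capabilities, "mcpServer"))
```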
How sampling works
When an MCP server target needs an LLM completion during tool execution, it sends a sampling/createMessage request. The gateway forwards this request to the client as an SSE event, replacing the request id. The client invokes its language model and sends the result back to the gateway, which forwards it to the target.
The sampling request includes:
- `messages`: The conversation messages to send to the model.
- `modelPreferences`: Optional hints about desired model capabilities (intelligence, speed, cost).
- `systemPrompt`: Optional system prompt for the model.
- `maxTokens`: Maximum number of tokens to generate.
The client responds with:
- `model`: The model that was used.
- `role`: Always `assistant`.
- `content`: The generated content (text or image).
Note
The client has full control over which model to use and how to handle the request. The server’s modelPreferences are hints, not requirements. The client may also modify or reject the request based on its own policies.
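To make the request and response fields concrete, here is a minimal client-side sketch that turns a `sampling/createMessage` request into a response. The JSON-RPC envelope and field names follow the lists above. The handler name `handle_sampling_request`, the `generate` callable, the `stub_model` function, the `allowed_max_tokens` policy, and the example id `"gw-1"` are all hypothetical. The sketch treats `modelPreferences` as hints it is free to ignore and rejects requests that exceed a local token limit.

```python
def handle_sampling_request(request, generate, allowed_max_tokens=1000):
    """Build a JSON-RPC response for a sampling/createMessage request dict.

    `generate` is a hypothetical callable:
    (messages, system_prompt, max_tokens) -> (model_name, text).
    """
    params = request["params"]

    # Client policy: the client may reject a request it does not want to serve.
    if params["maxTokens"] > allowed_max_tokens:
        raise PermissionError("maxTokens exceeds client policy")

    # modelPreferences are hints, not requirements; this sketch ignores them
    # and lets `generate` pick whichever model it wants.
    model_name, text = generate(
        params["messages"], params.get("systemPrompt"), params["maxTokens"]
    )

    return {
        "jsonrpc": "2.0",
        "id": request["id"],  # reuse the id from the gateway's request
        "result": {
            "model": model_name,
            "role": "assistant",
            "content": {"type": "text", "text": text},
        },
    }


def stub_model(messages, system_prompt, max_tokens):
    """Stand-in for a real model call (assumption for illustration)."""
    return "stub-model", "stub summary"


request = {
    "jsonrpc": "2.0",
    "id": "gw-1",
    "method": "sampling/createMessage",
    "params": {
        "messages": [
            {"role": "user", "content": {"type": "text", "text": "Summarize this document"}}
        ],
        "modelPreferences": {"intelligencePriority": 0.8, "speedPriority": 0.5, "costPriority": 0.3},
        "systemPrompt": "You are a concise assistant.",
        "maxTokens": 500,
    },
}

response = handle_sampling_request(request, stub_model)
```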
Sampling flow
- Client sends a `tools/call` request with the `Mcp-Session-Id` header.
- Gateway forwards the tool call to the MCP server target.
- The target opens an SSE stream and sends a `sampling/createMessage` request.
- Gateway forwards the sampling request to the client as an SSE event, replacing the request `id`.
- The client invokes its language model with the provided messages.
- The client sends a new request with the sampling result using the same `Mcp-Session-Id` and the `id` from the gateway’s request.
- Gateway forwards the result to the MCP server target.
- The target continues processing and returns the final tool result.
- Gateway forwards the final result to the client and closes the stream.
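The id replacement in this flow can be modeled with a small lookup table. The sketch below is a toy model for reasoning about the flow, not the gateway's implementation: the class name `SamplingRouter`, the `gw-N` id format, and the use of `LookupError` are assumptions made for illustration.

```python
import itertools


class SamplingRouter:
    """Toy model of per-session request id replacement (illustrative only)."""

    def __init__(self):
        self._counter = itertools.count(1)
        # (session_id, gateway_id) -> original request id chosen by the target
        self._pending = {}

    def forward_to_client(self, session_id, target_request):
        """Replace the target's request id with a gateway-generated one."""
        gateway_id = f"gw-{next(self._counter)}"
        self._pending[(session_id, gateway_id)] = target_request["id"]
        return {**target_request, "id": gateway_id}

    def forward_to_target(self, session_id, client_response):
        """Restore the target's original id on the client's sampling result."""
        key = (session_id, client_response["id"])
        if key not in self._pending:
            # Mirrors the "no sampling request is pending" scenario.
            raise LookupError("no matching sampling request found for this session")
        original_id = self._pending.pop(key)
        return {**client_response, "id": original_id}


router = SamplingRouter()
to_client = router.forward_to_client(
    "session-1",
    {"jsonrpc": "2.0", "id": 7, "method": "sampling/createMessage", "params": {}},
)
to_target = router.forward_to_target(
    "session-1",
    {"jsonrpc": "2.0", "id": to_client["id"], "result": {}},
)
```

Keying the table by both session and gateway id is why the client must reuse the same `Mcp-Session-Id` and the `id` from the gateway's request when it returns the result.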
Guidance for MCP server target developers
Important
MCP server targets that send sampling requests should wrap sampling calls in try-catch blocks and handle the case where the client does not support sampling. If the gateway’s client did not declare sampling capability, the gateway does not declare it to the target. If the target sends a sampling request anyway, the gateway returns a -32601 (Method not found) error to the target.
Servers should implement a fallback path (such as using a built-in model or skipping the AI-assisted step) when sampling is not available.
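One way to structure such a fallback is to treat only the Method not found error as "sampling unavailable" and let every other failure propagate. The following is a minimal sketch under stated assumptions: `SamplingError` is a hypothetical exception that carries a JSON-RPC error code, and `sample_or_fallback` is an illustrative helper. With a real MCP SDK you would need to map its own exception type to the error code.

```python
import asyncio

METHOD_NOT_FOUND = -32601  # JSON-RPC "Method not found"


class SamplingError(Exception):
    """Hypothetical exception carrying a JSON-RPC error code."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


async def sample_or_fallback(sample, fallback):
    """Await `sample()`; if sampling is unavailable, return `fallback()` instead."""
    try:
        return await sample()
    except SamplingError as err:
        if err.code == METHOD_NOT_FOUND:
            # The client did not declare sampling, so use the non-sampling path.
            return fallback()
        raise  # Other errors are real failures and should surface.


async def unsupported_sampling():
    """Simulates a client that did not declare the sampling capability."""
    raise SamplingError(METHOD_NOT_FOUND, "Method not found: sampling/createMessage")


result = asyncio.run(sample_or_fallback(unsupported_sampling, lambda: "fallback summary"))
```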
Error handling
| Scenario | Error | Description |
|---|---|---|
| Client sends a sampling response when no sampling request is pending | JSON-RPC | No matching sampling request found for this session. |
| Client sends sampling response with an | JSON-RPC | The |
| MCP server sends sampling request but gateway did not declare support | JSON-RPC -32601 | Returned to the MCP server target. See Troubleshooting. |
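For the last scenario, the error returned to the MCP server target is a standard JSON-RPC 2.0 error response. The sketch below shows its general shape. The helper name `method_not_found_error` is hypothetical, and the exact message text the gateway sends is an assumption modeled on the message quoted in Troubleshooting.

```python
def method_not_found_error(request_id, method):
    """Build a JSON-RPC 2.0 error response with code -32601 (Method not found)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,  # id of the request that could not be handled
        "error": {
            "code": -32601,
            "message": f"Method not found: {method}",  # assumed wording
        },
    }


error_response = method_not_found_error(7, "sampling/createMessage")
```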
Troubleshooting
Error: "Error calling tool 'sample_tool': Method not found: sampling/createMessage"
This error occurs when an MCP server target sends a sampling request but the gateway’s client did not declare sampling capability during initialize. The gateway returns a -32601 (Method not found) error to the target, and the target may return this as a tool execution error to the client.
To resolve:
- If you are the MCP server developer: Add error handling around your sampling calls. Implement a fallback path when sampling is not supported:

  Important
  You must include `related_request_id=ctx.request_context.request_id` in your `create_message` call. This is required for the gateway to correctly associate the sampling request with the originating tool call. Without it, sampling will not work.

  ```python
  try:
      result = await ctx.session.create_message(
          messages=[{"role": "user", "content": {"type": "text", "text": "Summarize this document"}}],
          max_tokens=500,
          related_request_id=ctx.request_context.request_id,
      )
  except Exception as e:
      # Fallback when client doesn't support sampling
      logger.warning(f"Sampling not supported: {e}")
      result = fallback_summarization(document)
  ```

- If you are the gateway client developer: Ensure your client declares sampling capability during `initialize`:

  ```json
  { "capabilities": { "sampling": {} } }
  ```
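For context, a complete `initialize` request carrying this capability might look like the sketch below. The envelope follows the MCP `initialize` request shape. The helper name `build_initialize_request`, the `clientInfo` values, and the `protocolVersion` string are placeholders; use the protocol version your gateway and client actually negotiate.

```python
import json


def build_initialize_request(request_id, client_name, client_version,
                             protocol_version="2025-03-26"):
    """Build an MCP initialize request that declares the sampling capability.

    The default protocol version is an assumption for illustration.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": protocol_version,
            # An empty object is enough to declare sampling support.
            "capabilities": {"sampling": {}},
            "clientInfo": {"name": client_name, "version": client_version},
        },
    }


initialize_request = build_initialize_request(1, "example-client", "0.1.0")
print(json.dumps(initialize_request, indent=2))
```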
Code samples
Note
The LangGraph MCP Client (langchain-mcp-adapters) and Strands MCP Client do not currently support sampling. Use the MCP Client approach shown below to handle sampling requests from your gateway.