This section provides inference parameters and code examples for using Anthropic Claude models with the Text Completions API.
Anthropic Claude Text Completions API overview
Use the Text Completions API for single-turn text generation from a user-supplied prompt. For example, you can use the Text Completions API to generate text for a blog post or to summarize text input from a user.
For information about creating prompts for Anthropic Claude models, see Introduction to prompt design in the Anthropic Claude documentation.
Supported models
You can use the Text Completions API with the following Anthropic Claude models:

- Anthropic Claude Instant v1.2
- Anthropic Claude v2
- Anthropic Claude v2.1
Request and Response
The request body is passed in the body field of a request to InvokeModel or InvokeModelWithResponseStream. For more information, see https://docs.anthropic.com/claude/reference/complete_post in the Anthropic Claude documentation.

Anthropic Claude has the following inference parameters for a Text Completions inference call.
{
    "prompt": "\n\nHuman: <prompt>\n\nAssistant:",
    "temperature": float,
    "top_p": float,
    "top_k": int,
    "max_tokens_to_sample": int,
    "stop_sequences": [string]
}
The following are required parameters.
- prompt – (Required) The prompt that you want Claude to complete. For proper response generation, you need to format your prompt using alternating \n\nHuman: and \n\nAssistant: conversational turns (see the sketch after this list). For example:

  "\n\nHuman: {userQuestion}\n\nAssistant:"

  For more information, see Prompt validation in the Anthropic Claude documentation.

- max_tokens_to_sample – (Required) The maximum number of tokens to generate before stopping. We recommend a limit of 4,000 tokens for optimal performance. Note that Anthropic Claude models might stop generating tokens before reaching the value of max_tokens_to_sample. Different Anthropic Claude models have different maximum values for this parameter. For more information, see Model comparison in the Anthropic Claude documentation.

  Default  Minimum  Maximum
  200      0        4096
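The \n\nHuman: and \n\nAssistant: turn format also extends to multi-turn conversations. The following is a minimal sketch of assembling a valid prompt string in Python; the build_prompt helper is illustrative, not part of any API.

import json

def build_prompt(turns, user_question):
    # turns: list of (human_text, assistant_text) pairs from earlier exchanges.
    # The prompt must alternate \n\nHuman: and \n\nAssistant: markers and end
    # with an open \n\nAssistant: turn for the model to complete.
    prompt = ""
    for human, assistant in turns:
        prompt += f"\n\nHuman: {human}\n\nAssistant: {assistant}"
    prompt += f"\n\nHuman: {user_question}\n\nAssistant:"
    return prompt

body = json.dumps({
    "prompt": build_prompt(
        [("What is a black hole?", "A black hole is a region of spacetime...")],
        "How do black holes form?"),
    "max_tokens_to_sample": 300
})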
The following are optional parameters.
- stop_sequences – (Optional) Sequences that cause the model to stop generating. Anthropic Claude models stop on "\n\nHuman:", and might include additional built-in stop sequences in the future. Use the stop_sequences inference parameter to include additional strings that signal the model to stop generating text (see the request sketch after this list).

- temperature – (Optional) The amount of randomness injected into the response. Use a value closer to 0 for analytical or multiple-choice tasks, and a value closer to 1 for creative and generative tasks.

  Default  Minimum  Maximum
  1        0        1

- top_p – (Optional) Use nucleus sampling. In nucleus sampling, Anthropic Claude computes the cumulative distribution over all the options for each subsequent token in decreasing probability order and cuts it off once it reaches the probability specified by top_p. You should alter either temperature or top_p, but not both.

  Default  Minimum  Maximum
  1        0        1

- top_k – (Optional) Only sample from the top K options for each subsequent token. Use top_k to remove long-tail, low-probability responses.

  Default  Minimum  Maximum
  250      0        500
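The following is a minimal sketch of how the optional parameters fit into a request body; the values are illustrative, not recommendations. It sets top_k rather than top_p, because temperature and top_p should not be altered together.

import json

# Illustrative request body combining optional inference parameters.
body = json.dumps({
    "prompt": "\n\nHuman: Write a haiku about the ocean.\n\nAssistant:",
    "max_tokens_to_sample": 200,
    "temperature": 0.5,             # lean analytical rather than creative
    "top_k": 250,                   # default value, shown for illustration
    "stop_sequences": ["</haiku>"]  # illustrative custom stop string
})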
Code examples
These examples show how to call the Anthropic Claude v2 model with on-demand throughput. To use Anthropic Claude v2.1, change the value of modelId to anthropic.claude-v2:1.
import boto3
import json

# Create a Bedrock Runtime client.
brt = boto3.client(service_name='bedrock-runtime')

# Build the Text Completions request body.
body = json.dumps({
    "prompt": "\n\nHuman: explain black holes to 8th graders\n\nAssistant:",
    "max_tokens_to_sample": 300,
    "temperature": 0.1,
    "top_p": 0.9,
})

modelId = 'anthropic.claude-v2'
accept = 'application/json'
contentType = 'application/json'

response = brt.invoke_model(body=body, modelId=modelId, accept=accept, contentType=contentType)

response_body = json.loads(response.get('body').read())

# Print the generated text.
print(response_body.get('completion'))
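The response body is a JSON object; the generated text is returned in its completion field, which the final line prints.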
The following example shows how to generate streaming text with Python using the prompt write an essay for living on mars in 1000 words and the Anthropic Claude v2 model:
import boto3
import json

# Create a Bedrock Runtime client.
brt = boto3.client(service_name='bedrock-runtime')

# Build the Text Completions request body.
body = json.dumps({
    'prompt': '\n\nHuman: write an essay for living on mars in 1000 words\n\nAssistant:',
    'max_tokens_to_sample': 4000
})

# Invoke the model with a streaming response.
response = brt.invoke_model_with_response_stream(
    modelId='anthropic.claude-v2',
    body=body
)

# Read and print each chunk as it arrives.
stream = response.get('body')
if stream:
    for event in stream:
        chunk = event.get('chunk')
        if chunk:
            print(json.loads(chunk.get('bytes').decode()))
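Each decoded chunk is a JSON object. As a minimal variation, assuming each chunk carries the generated text in the same completion field as the non-streaming response, you can print just the text as it arrives:

stream = response.get('body')
if stream:
    for event in stream:
        chunk = event.get('chunk')
        if chunk:
            data = json.loads(chunk.get('bytes').decode())
            # Print only the text delta for this chunk (assumed field name).
            print(data.get('completion', ''), end='')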