Mistral AI chat completion

The Mistral AI chat completion API lets you create conversational applications.

Tip

You can use the Mistral AI chat completion API with the base inference operations (InvokeModel or InvokeModelWithResponseStream). However, we recommend that you use the Converse API to implement messages in your application. The Converse API provides a unified set of parameters that work across all models that support messages. For more information, see Carry out a conversation with the Converse API operations.
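
For reference, the following is a minimal sketch of an InvokeModel call with the AWS SDK for Python (Boto3). The Region and model ID shown are assumptions; get the current model ID as described under Supported models.

import json
import boto3

# Create a Bedrock Runtime client. The Region here is an assumption; use a
# Region where the model is available to your account.
client = boto3.client("bedrock-runtime", region_name="us-west-2")

# A minimal chat completion request body in the format described below.
body = {
    "messages": [
        {"role": "user", "content": "What is the most popular song on WZPZ?"}
    ],
    "max_tokens": 512,
    "temperature": 0.7
}

# The model ID is an assumption. Get the current ID from Supported
# foundation models in Amazon Bedrock.
response = client.invoke_model(
    modelId="mistral.mistral-large-2402-v1:0",
    body=json.dumps(body)
)

result = json.loads(response["body"].read())
print(result["choices"][0]["message"]["content"])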

Mistral AI models are available under the Apache 2.0 license. For more information about using Mistral AI models, see the Mistral AI documentation.

Supported models

You can use the following Mistral AI models.

  • Mistral Large

You need the model ID for the model that you want to use. To get the model ID, see Supported foundation models in Amazon Bedrock.

Request and Response

Request

The Mistral AI models have the following inference parameters.

{ "messages": [ { "role": "system"|"user"|"assistant", "content": str }, { "role": "assistant", "content": "", "tool_calls": [ { "id": str, "function": { "name": str, "arguments": str } } ] }, { "role": "tool", "tool_call_id": str, "content": str } ], "tools": [ { "type": "function", "function": { "name": str, "description": str, "parameters": dict } } ], "tool_choice": "auto"|"any"|"none", "max_tokens": int, "top_p": float, "temperature": float }

The following are required parameters.

  • messages – (Required) The messages that you want to pass to the model.

    • role – The role for the message. Valid values are:

      • system – Sets the behavior and context for the model in the conversation.

      • user – The user message to send to the model.

      • assistant – The response from the model.

    • content – The content for the message.

    [ { "role": "user", "content": "What is the most popular song on WZPZ?" } ]

    To pass a tool result, use JSON with the following fields.

    • role – The role for the message. The value must be tool.

    • tool_call_id – The ID of the tool request. You get the ID from the tool_calls field in the response from the previous request.

    • content – The result from the tool.

    The following example is the result from a tool that gets the most popular song on a radio station.

    { "role": "tool", "tool_call_id": "v6RMMiRlT7ygYkT4uULjtg", "content": "{\"song\": \"Elemental Hotel\", \"artist\": \"8 Storey Hike\"}" }

The following are optional parameters.

  • tools – Definitions of tools that the model may use.

    If you include tools in your request, the model may return a tool_calls field in the message that represents the model's use of those tools. You can then run those tools using the tool input generated by the model and optionally return the results to the model in a message with the tool role, as shown earlier.

    The following example is for a tool that gets the most popular songs on a radio station.

    [ { "type": "function", "function": { "name": "top_song", "description": "Get the most popular song played on a radio station.", "parameters": { "type": "object", "properties": { "sign": { "type": "string", "description": "The call sign for the radio station for which you want the most popular song. Example calls signs are WZPZ and WKRP." } }, "required": [ "sign" ] } } } ]
  • tool_choice – Specifies how functions are called. If set to none, the model won't call a function and generates a message instead. If set to auto, the model can choose either to generate a message or to call a function. If set to any, the model is forced to call a function.

  • max_tokens – Specifies the maximum number of tokens to use in the generated response. The model truncates the response when the generated text exceeds max_tokens.

    Default: Mistral Large – 8,192
    Minimum: 1
    Maximum: Mistral Large – 8,192

  • temperature – Controls the randomness of predictions made by the model. For more information, see Influence response generation with inference parameters.

    Default: Mistral Large – 0.7
    Minimum: 0
    Maximum: 1

  • top_p – Controls the diversity of text that the model generates by setting the percentage of most-likely candidates that the model considers for the next token. For more information, see Influence response generation with inference parameters.

    Default: Mistral Large – 1
    Minimum: 0
    Maximum: 1
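
Combining the parameters above, a complete request body that defines the top_song tool and sets the optional inference parameters might look like the following sketch (the values are illustrative):

{
    "messages": [
        {
            "role": "user",
            "content": "What is the most popular song on WZPZ?"
        }
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "top_song",
                "description": "Get the most popular song played on a radio station.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "sign": {
                            "type": "string",
                            "description": "The call sign for the radio station for which you want the most popular song. Example call signs are WZPZ and WKRP."
                        }
                    },
                    "required": ["sign"]
                }
            }
        }
    ],
    "tool_choice": "auto",
    "max_tokens": 512,
    "temperature": 0.7,
    "top_p": 0.9
}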

Response

The response body from a call to InvokeModel is the following:

{ "choices": [ { "index": 0, "message": { "role": "assistant", "content": str, "tool_calls": [...] }, "stop_reason": "stop"|"length"|"tool_calls" } ] }

The response body has the following fields:

  • choices – The output from the model.

    • index – The index for the message.

    • message – The message from the model.

      • role – The role for the message.

      • content – The content for the message.

      • tool_calls – If the value of stop_reason is tool_calls, this field contains a list of tool requests that the model wants you to run.

        • id – The ID for the tool request.

        • function – The function that the model is requesting.

          • name – The name of the function.

          • arguments – The arguments to pass to the tool.

        The following is an example request for a tool that gets the top song on a radio station.

        [ { "id": "v6RMMiRlT7ygYkT4uULjtg", "function": { "name": "top_song", "arguments": "{\"sign\": \"WZPZ\"}" } } ]
    • stop_reason – The reason why the response stopped generating text. Possible values are:

      • stop – The model has finished generating text for the input prompt. The model stops because it has no more content to generate or because it generates one of the stop sequences that you define in the stop request parameter.

      • length – The number of tokens in the generated text exceeds the value of max_tokens. The response is truncated to max_tokens tokens.

      • tool_calls – The model is requesting that you run a tool.
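
Putting the response fields to work, the following sketch branches on stop_reason and, when the model requests a tool, appends the tool result message described in the Request section so that you can re-invoke the model. run_top_song is a hypothetical local implementation of the top_song tool, not part of the API.

import json

def handle_response(result, messages):
    # result is the parsed JSON response body from InvokeModel;
    # messages is the conversation history sent in the request.
    choice = result["choices"][0]
    message = choice["message"]

    if choice["stop_reason"] == "tool_calls":
        # Keep the assistant's tool request in the conversation history.
        messages.append(message)
        for call in message["tool_calls"]:
            args = json.loads(call["function"]["arguments"])
            # run_top_song is a hypothetical local implementation of the
            # top_song tool defined in the request.
            tool_output = run_top_song(args["sign"])
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(tool_output)
            })
        # Re-invoke the model with the updated messages to get a final answer.
    elif choice["stop_reason"] == "length":
        # The response was truncated at max_tokens.
        print(message["content"])
    else:  # "stop"
        print(message["content"])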