InferenceConfiguration
Specifications about the inference parameters that were provided alongside the prompt. These are specified in the PromptOverrideConfiguration object that was set when the agent was created or updated. For more information, see Inference parameters for foundation models.
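As an illustration only, the following sketch shows what such an object might look like as a plain Python dictionary, together with a client-side check against the ranges documented under Contents. The helper `validate_inference_configuration` and the `LIMITS` table are hypothetical names introduced for this example; they are not part of any AWS SDK, and the service performs its own validation.

```python
# Hypothetical client-side check; not part of any AWS SDK.
# The ranges mirror the "Valid Range" values documented for each field.
LIMITS = {
    "maximumLength": (int, 0, 4096),
    "temperature": (float, 0, 1),
    "topK": (int, 0, 500),
    "topP": (float, 0, 1),
}
MAX_STOP_SEQUENCES = 4


def validate_inference_configuration(config):
    """Return a list of problems; an empty list means every value is in range."""
    errors = []
    for name, value in config.items():
        if name == "stopSequences":
            if not isinstance(value, list) or not all(isinstance(s, str) for s in value):
                errors.append("stopSequences must be a list of strings")
            elif len(value) > MAX_STOP_SEQUENCES:
                errors.append(f"stopSequences allows at most {MAX_STOP_SEQUENCES} items")
        elif name in LIMITS:
            kind, low, high = LIMITS[name]
            # Integer fields need an int; float fields accept int or float.
            # bool is rejected explicitly because it is a subclass of int.
            allowed = (int,) if kind is int else (int, float)
            if isinstance(value, bool) or not isinstance(value, allowed):
                errors.append(f"{name} must be of type {kind.__name__}")
            elif not low <= value <= high:
                errors.append(f"{name} must be between {low} and {high}")
        else:
            errors.append(f"unknown field: {name}")
    return errors


# Example values chosen for illustration; all fields are optional.
example_config = {
    "maximumLength": 2048,
    "stopSequences": ["Human:"],
    "temperature": 0.0,
    "topK": 250,
    "topP": 0.9,
}

print(validate_inference_configuration(example_config))
print(validate_inference_configuration({"topK": 501, "temperature": 1.5}))
```

Because every field is optional, the check only inspects the keys that are present and reports unknown keys rather than raising an exception.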
Contents
- maximumLength
-
The maximum number of tokens allowed in the generated response.
Type: Integer
Valid Range: Minimum value of 0. Maximum value of 4096.
Required: No
- stopSequences
-
A list of stop sequences. A stop sequence is a sequence of characters that causes the model to stop generating the response.
Type: Array of strings
Array Members: Minimum number of 0 items. Maximum number of 4 items.
Required: No
- temperature
-
The likelihood of the model selecting higher-probability options while generating a response. A lower value makes the model more likely to choose higher-probability options, while a higher value makes the model more likely to choose lower-probability options.
Type: Float
Valid Range: Minimum value of 0. Maximum value of 1.
Required: No
- topK
-
While generating a response, the model determines the probability of the following token at each point of generation. The value that you set for topK is the number of most-likely candidates from which the model chooses the next token in the sequence. For example, if you set topK to 50, the model selects the next token from among the top 50 most likely choices.
Type: Integer
Valid Range: Minimum value of 0. Maximum value of 500.
Required: No
- topP
-
While generating a response, the model determines the probability of the following token at each point of generation. The value that you set for topP determines the number of most-likely candidates from which the model chooses the next token in the sequence. For example, if you set topP to 0.8, the model only selects the next token from the top 80% of the probability distribution of next tokens.
Type: Float
Valid Range: Minimum value of 0. Maximum value of 1.
Required: No
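To make the difference between topK and topP concrete, the following minimal sketch filters a toy next-token distribution in both ways. It assumes the common top-k and nucleus (top-p) definitions; the function names and the probabilities are made up for illustration, and the way a particular foundation model applies these parameters internally may differ.

```python
# Illustrative sketch of candidate filtering; not the service's implementation.

def top_k_filter(probs, k):
    """Keep the k most likely tokens from a {token: probability} mapping."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:k])


def top_p_filter(probs, p):
    """Keep the most likely tokens until their cumulative probability reaches p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = {}
    cumulative = 0.0
    for token, prob in ranked:
        kept[token] = prob
        cumulative += prob
        if cumulative >= p:
            break
    return kept


# Toy next-token distribution (hypothetical values that sum to 1).
next_token_probs = {"the": 0.5, "a": 0.25, "an": 0.15, "this": 0.1}

# topK = 2: only the two most likely tokens remain candidates.
print(top_k_filter(next_token_probs, 2))

# topP = 0.8: tokens are added in order of likelihood until at least 80% of
# the probability mass is covered.
print(top_p_filter(next_token_probs, 0.8))
```

In this toy distribution, topK fixes the number of candidates regardless of how the probability is spread, while topP lets the number of candidates vary with the shape of the distribution.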
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: