TextInferenceConfig
The configuration details for text generation using a language model via the
RetrieveAndGenerate
function.
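As a sketch, the four optional fields described below can be assembled into the textInferenceConfig map that RetrieveAndGenerate accepts (in Boto3, nested under generationConfiguration.inferenceConfig in the knowledge base configuration). The helper function and its range checks here are illustrative, not part of the API, and they enforce the arbitrary documented limits rather than any specific model's limits.

```python
def make_text_inference_config(max_tokens=None, stop_sequences=None,
                               temperature=None, top_p=None):
    """Build a textInferenceConfig map; every field is optional."""
    config = {}
    if max_tokens is not None:
        # Documented (arbitrary) range; real models impose their own limits.
        if not (0 < max_tokens <= 65536):
            raise ValueError("maxTokens out of documented range")
        config["maxTokens"] = max_tokens
    if stop_sequences is not None:
        if len(stop_sequences) > 4:
            raise ValueError("at most 4 stop sequences are allowed")
        config["stopSequences"] = list(stop_sequences)
    if temperature is not None:
        if not (0.0 <= temperature <= 1.0):
            raise ValueError("temperature must be in [0, 1]")
        config["temperature"] = temperature
    if top_p is not None:
        if not (0.0 <= top_p <= 1.0):
            raise ValueError("topP must be in [0, 1]")
        config["topP"] = top_p
    return config

cfg = make_text_inference_config(max_tokens=512,
                                 stop_sequences=["\n\nHuman:"],
                                 temperature=0.2, top_p=0.9)
print(cfg)
```

Omitting a field simply leaves it out of the map, so the model's own defaults apply.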
Contents
- maxTokens
The maximum number of tokens to generate in the output text. Avoid using the stated minimum of 0 or maximum of 65536; these limit values are arbitrary placeholders, so consult the limits defined by your specific model for actual values.
Type: Integer
Valid Range: Minimum value of 0. Maximum value of 65536.
Required: No
- stopSequences
A list of character sequences that, if generated, cause the model to stop generating further tokens. Avoid using the stated minimum length of 1 or maximum length of 1000; these limit values are arbitrary placeholders, so consult the limits defined by your specific model for actual values.
Type: Array of strings
Array Members: Minimum number of 0 items. Maximum number of 4 items.
Length Constraints: Minimum length of 1. Maximum length of 1000.
Required: No
- temperature
Controls the randomness of text generated by the language model, influencing how much the model sticks to the most predictable next words versus exploring more surprising options. A lower temperature (e.g., 0.2 or 0.3) makes outputs more deterministic and predictable, while a higher temperature (e.g., 0.8 or 0.9) makes them more creative and unpredictable.
Type: Float
Valid Range: Minimum value of 0. Maximum value of 1.
Required: No
- topP
A probability threshold that controls which tokens the model considers for the next position. When generating the next token, the model considers only the smallest set of most-likely tokens whose cumulative probability meets or exceeds topP.
Type: Float
Valid Range: Minimum value of 0. Maximum value of 1.
Required: No
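The cumulative-probability behavior behind topP (often called nucleus sampling) can be sketched as follows; the distribution is hypothetical and a real decoder would sample from the surviving tokens rather than just listing them.

```python
def top_p_filter(probs, top_p):
    # Keep the smallest set of highest-probability token indices whose
    # cumulative probability reaches top_p; everything else is dropped
    # from consideration for the next token.
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for idx, p in ranked:
        kept.append(idx)
        cumulative += p
        if cumulative >= top_p:
            break
    return sorted(kept)

probs = [0.5, 0.3, 0.15, 0.05]    # hypothetical next-token distribution
print(top_p_filter(probs, 0.75))  # only the two most likely tokens survive
```

With topP = 0.75 the two most likely tokens (0.5 + 0.3 ≥ 0.75) form the candidate set; raising topP toward 1 admits progressively less likely tokens.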
See Also
For more information about using this API in one of the language-specific AWS SDKs, see the following: