AsyncInferenceClientConfig
Configures the behavior of the client used by SageMaker to interact with the model container during asynchronous inference.
Contents
- MaxConcurrentInvocationsPerInstance
    The maximum number of concurrent requests sent by the SageMaker client to the model container. If no value is provided, SageMaker chooses an optimal value.
    Type: Integer
    Valid Range: Minimum value of 1. Maximum value of 1000.
    Required: No
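The constraints above can be checked on the client side before a request is sent. The following is a minimal sketch, not part of the SageMaker API: the helper name `build_client_config` and the two constants are hypothetical, and the only facts taken from this page are the field name, its integer type, its 1 to 1000 range, and that it is optional.

```python
from typing import Optional

# Bounds copied from the "Valid Range" entry on this page.
MIN_INVOCATIONS = 1
MAX_INVOCATIONS = 1000


def build_client_config(max_concurrent: Optional[int] = None) -> dict:
    """Hypothetical helper: return a dict shaped like AsyncInferenceClientConfig."""
    if max_concurrent is None:
        # The field is not required; omitting it leaves the choice to SageMaker.
        return {}
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_concurrent, bool) or not isinstance(max_concurrent, int):
        raise TypeError("MaxConcurrentInvocationsPerInstance must be an integer")
    if not MIN_INVOCATIONS <= max_concurrent <= MAX_INVOCATIONS:
        raise ValueError(
            f"MaxConcurrentInvocationsPerInstance must be between "
            f"{MIN_INVOCATIONS} and {MAX_INVOCATIONS}, got {max_concurrent}"
        )
    return {"MaxConcurrentInvocationsPerInstance": max_concurrent}


if __name__ == "__main__":
    print(build_client_config(4))  # {'MaxConcurrentInvocationsPerInstance': 4}
    print(build_client_config())   # {}
```

In an SDK call, a dict like this would typically be supplied as the client configuration portion of the asynchronous inference settings when creating an endpoint configuration; consult the SDK reference for the exact parameter nesting.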