Submit prompts and generate responses with model inference

Inference refers to the process of generating an output from an input provided to a model. Foundation models use probability to construct the words in a sequence: given an input, the model predicts a probable sequence of tokens that follows and returns that sequence as the output. Amazon Bedrock lets you run inference with the foundation model of your choice. When you run inference, you provide a prompt and, optionally, inference parameters that influence the response the model generates.

Amazon Bedrock offers a suite of foundation models that you can use to generate outputs of the following modalities. To see modality support by foundation model, refer to Supported foundation models in Amazon Bedrock.

Output modality | Description | Example use cases
Text | Provide text input and generate various types of text | Chat, question answering, brainstorming, summarization, code generation, table creation, data formatting, rewriting
Image | Provide text or input images and generate or modify images | Image generation, image editing, image variation
Embeddings | Provide text, images, or both text and images and generate a vector of numeric values that represents the input. The output vector can be compared to other embedding vectors to determine semantic similarity (for text) or visual similarity (for images). | Text and image search, query, categorization, recommendations, personalization, knowledge base creation
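
The Embeddings row above lends itself to a short example. The following is a minimal sketch, assuming boto3 with configured credentials and access to the Amazon Titan Text Embeddings model (amazon.titan-embed-text-v1) in us-east-1; it uses the InvokeModel API described later in this section, and the prompts, Region, and similarity helper are illustrative.

```python
import json
import math

import boto3

# Bedrock runtime client; the Region and model access are assumptions for this sketch.
client = boto3.client("bedrock-runtime", region_name="us-east-1")


def embed(text: str) -> list[float]:
    """Return the embedding vector for a piece of text using Titan Text Embeddings."""
    response = client.invoke_model(
        modelId="amazon.titan-embed-text-v1",
        contentType="application/json",
        accept="application/json",
        body=json.dumps({"inputText": text}),
    )
    return json.loads(response["body"].read())["embedding"]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Compare two embedding vectors; values closer to 1 indicate higher similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))


v1 = embed("How do I reset my password?")
v2 = embed("I forgot my login credentials.")
print(f"Similarity: {cosine_similarity(v1, v2):.3f}")
```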

When you run inference, you specify the throughput to use, either by selecting it in the console or by supplying it in the modelId field of an API request. Throughput defines the number and rate of input and output tokens that you can process. For more information, see Increase throughput for resiliency and processing power.

You can run model inference in the following ways.

  • Use any of the Playgrounds to run inference in a user-friendly graphical interface.

  • Use the Converse API (Converse and ConverseStream) to implement conversational applications (a sketch follows this list).

  • Send an InvokeModel or InvokeModelWithResponseStream request (a sketch follows this list).

  • Prepare a dataset of prompts with your desired configurations and run batch inference with a CreateModelInvocationJob request (a sketch follows this list).

  • Other Amazon Bedrock features, such as agents, use model inference as a step in a larger orchestration. Refer to those features' sections for more details.
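
For the Converse API bullet above, the following is a minimal sketch of a two-turn conversation, assuming boto3 with configured credentials; the model ID, Region, prompts, and inference parameters are illustrative, and any model you have access to that supports Converse can be substituted.

```python
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")
model_id = "anthropic.claude-3-haiku-20240307-v1:0"  # illustrative; use any Converse-capable model

# First turn: a single user message.
messages = [{"role": "user", "content": [{"text": "Name three uses of text embeddings."}]}]
response = client.converse(
    modelId=model_id,
    messages=messages,
    inferenceConfig={"maxTokens": 512, "temperature": 0.5},
)
assistant_message = response["output"]["message"]
print(assistant_message["content"][0]["text"])

# Second turn: append the assistant's reply and the next user message,
# then call converse() again with the full history.
messages.append(assistant_message)
messages.append({"role": "user", "content": [{"text": "Which of those is most common?"}]})
response = client.converse(modelId=model_id, messages=messages)
print(response["output"]["message"]["content"][0]["text"])
```

ConverseStream accepts the same request shape and returns the response incrementally as an event stream, which suits chat interfaces that display text as it is generated.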
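
For the InvokeModel bullet, the following is a minimal sketch of a single request. Unlike Converse, the request and response bodies follow each model provider's own schema; the fields shown are for the Amazon Titan Text models, and the model ID, Region, and prompt are illustrative.

```python
import json

import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Provider-specific request body (Amazon Titan Text schema).
body = {
    "inputText": "Summarize the benefits of vector embeddings in two sentences.",
    "textGenerationConfig": {"maxTokenCount": 256, "temperature": 0.5},
}

response = client.invoke_model(
    modelId="amazon.titan-text-express-v1",
    contentType="application/json",
    accept="application/json",
    body=json.dumps(body),
)

result = json.loads(response["body"].read())
print(result["results"][0]["outputText"])
```

InvokeModelWithResponseStream takes the same request and returns the output in chunks as it is generated.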
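
For the batch inference bullet, the following is a minimal sketch of starting a job with CreateModelInvocationJob. The bucket names, IAM role ARN, job name, and model ID are placeholders; the input object is a JSONL file in which each line contains a recordId and a model-specific modelInput payload, and the role must allow Amazon Bedrock to read and write the S3 locations.

```python
import boto3

# Batch inference uses the Amazon Bedrock control-plane client ("bedrock"),
# not the runtime client. All identifiers below are placeholders.
bedrock = boto3.client("bedrock", region_name="us-east-1")

response = bedrock.create_model_invocation_job(
    jobName="example-batch-job",
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    roleArn="arn:aws:iam::111122223333:role/ExampleBedrockBatchRole",
    inputDataConfig={
        "s3InputDataConfig": {"s3Uri": "s3://amzn-s3-demo-bucket/input/prompts.jsonl"}
    },
    outputDataConfig={
        "s3OutputDataConfig": {"s3Uri": "s3://amzn-s3-demo-bucket/output/"}
    },
)

# Track progress with GetModelInvocationJob using the returned job ARN.
print(response["jobArn"])
```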

You can run inference with base models, custom models, or provisioned models. To run inference on a custom model, first purchase Provisioned Throughput for it (for more information, see Increase model invocation capacity with Provisioned Throughput in Amazon Bedrock).
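
As a sketch of invoking a provisioned or custom model, you can pass the Provisioned Throughput ARN in the modelId field instead of a base model ID; the rest of the request is unchanged. The ARN below is a placeholder, and the example assumes the underlying model supports the Converse API.

```python
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

# Placeholder ARN; copy the real value from your Provisioned Throughput details.
provisioned_model_arn = (
    "arn:aws:bedrock:us-east-1:111122223333:provisioned-model/abc123example"
)

response = client.converse(
    modelId=provisioned_model_arn,
    messages=[{"role": "user", "content": [{"text": "Hello!"}]}],
)
print(response["output"]["message"]["content"][0]["text"])
```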

Use these methods to test foundation model responses with different prompts and inference parameters. Once you have sufficiently explored these methods, you can set up your application to run model inference by calling these APIs.

Select a topic to learn more about running model inference through that method. To learn more about using agents, see Automate tasks in your application using conversational agents.