TwelveLabs models

This section describes the request parameters and response fields for TwelveLabs models, and it includes code examples that show how to call them. Use this information to make inference calls to TwelveLabs models. The TwelveLabs Pegasus 1.2 model supports the InvokeModel and InvokeModelWithResponseStream (streaming) operations, while the TwelveLabs Marengo Embed 2.7 and TwelveLabs Marengo Embed 3.0 models support the StartAsyncInvoke operation. To use a model in an inference operation, you need its model ID. To get the model ID, see Supported foundation models in Amazon Bedrock.
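You can also look up the TwelveLabs model IDs programmatically with the ListFoundationModels control-plane operation. The following is a minimal sketch using the AWS SDK for Python (Boto3); the Region is illustrative, so substitute one where the TwelveLabs models are available.

import boto3

# Control-plane client ("bedrock", not "bedrock-runtime"); the Region is illustrative.
bedrock = boto3.client("bedrock", region_name="us-east-1")

# Filter the model catalog to TwelveLabs and print each model ID.
response = bedrock.list_foundation_models(byProvider="TwelveLabs")
for model in response["modelSummaries"]:
    print(model["modelId"], "-", model["modelName"])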

TwelveLabs is a leading provider of multimodal AI models specializing in video understanding and analysis. Its models enable sophisticated video search, analysis, and content generation through state-of-the-art computer vision and natural language processing. Amazon Bedrock offers three TwelveLabs models: TwelveLabs Pegasus 1.2, which provides comprehensive video understanding and analysis; TwelveLabs Marengo Embed 2.7, which generates high-quality embeddings for video, text, audio, and image content; and TwelveLabs Marengo Embed 3.0, the latest embedding model, which adds enhanced performance and capabilities. With these models, developers can build applications that intelligently process, analyze, and derive insights from video data at scale.

TwelveLabs Pegasus 1.2

A multimodal model that provides comprehensive video understanding and analysis capabilities, including content recognition, scene detection, and contextual understanding. The model can analyze video content and generate textual descriptions, insights, and answers to questions about the video.
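As an illustration of a synchronous call, the following Boto3 sketch invokes Pegasus 1.2 with InvokeModel to ask a question about a video stored in Amazon S3. The model ID, the inputPrompt and mediaSource request fields, and the S3 URI are assumptions for illustration; consult the Pegasus 1.2 request parameters in this section for the authoritative schema.

import json
import boto3

# Runtime client for inference calls; the Region is illustrative.
runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Illustrative request body: a prompt plus a video in Amazon S3.
# Field names are assumptions; verify them against the documented
# Pegasus 1.2 request parameters.
body = {
    "inputPrompt": "Summarize the key events in this video.",
    "mediaSource": {
        "s3Location": {
            "uri": "s3://amzn-s3-demo-bucket/videos/sample.mp4"  # placeholder URI
        }
    },
}

response = runtime.invoke_model(
    modelId="twelvelabs.pegasus-1-2-v1:0",  # assumed ID; verify in the model catalog
    body=json.dumps(body),
)

print(json.loads(response["body"].read()))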

TwelveLabs Marengo Embed 2.7

A multimodal embedding model that generates high-quality vector representations of video, text, audio, and image content for similarity search, clustering, and other machine learning tasks. The model supports multiple input modalities and provides specialized embeddings optimized for different use cases.
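Because the Marengo models run through asynchronous inference, you submit a job with StartAsyncInvoke and collect the resulting embeddings from an S3 location that you specify. A minimal Boto3 sketch follows; the model ID, the modelInput fields, and the output S3 URI are assumptions for illustration.

import boto3

runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

# Submit an asynchronous embedding job. The modelInput fields shown
# (inputType, inputText) are assumptions; check the Marengo request
# parameters for the exact schema.
response = runtime.start_async_invoke(
    modelId="twelvelabs.marengo-embed-2-7-v1:0",  # assumed ID; verify in the catalog
    modelInput={
        "inputType": "text",
        "inputText": "A person riding a bicycle along the beach at sunset",
    },
    outputDataConfig={
        "s3OutputDataConfig": {
            "s3Uri": "s3://amzn-s3-demo-bucket/embeddings/"  # placeholder bucket
        }
    },
)

print("Invocation ARN:", response["invocationArn"])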

TwelveLabs Marengo Embed 3.0

An enhanced multimodal embedding model that extends the capabilities of Marengo Embed 2.7 with support for an interleaved text-and-image input modality. The model generates high-quality vector representations of video, text, audio, image, and interleaved text-image content for similarity search, clustering, and other machine learning tasks.
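For either Marengo version, you can poll the asynchronous job with GetAsyncInvoke until it finishes and the embeddings are written to the configured S3 location. A sketch, assuming the invocation ARN returned by the StartAsyncInvoke example above:

import time
import boto3

runtime = boto3.client("bedrock-runtime", region_name="us-east-1")

def wait_for_async_invoke(invocation_arn, poll_seconds=10):
    """Poll an asynchronous invocation until it leaves the InProgress state."""
    while True:
        job = runtime.get_async_invoke(invocationArn=invocation_arn)
        if job["status"] != "InProgress":  # Completed or Failed
            return job
        time.sleep(poll_seconds)

# Placeholder ARN; use the value returned by start_async_invoke.
job = wait_for_async_invoke("arn:aws:bedrock:us-east-1:111122223333:async-invoke/example")
print("Final status:", job["status"])
print("Output location:", job["outputDataConfig"]["s3OutputDataConfig"]["s3Uri"])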