Amazon Titan Image Generator G1 models

Amazon Titan Image Generator G1 is an image generation model. It comes in two versions: v1 and v2.

Amazon Titan Image Generator v1 enables users to generate and edit images in versatile ways. Users can create images that match their text-based descriptions by simply inputting natural language prompts. Furthermore, they can upload and edit existing images, including applying text-based prompts without the need for a mask, or editing specific parts of an image using an image mask. The model also supports outpainting, which extends the boundaries of an image, and inpainting, which fills in missing areas. It offers the ability to generate variations of an image based on an optional text prompt, as well as instant customization options that allow users to transfer styles using reference images or combine styles from multiple references, all without requiring any fine-tuning.

Titan Image Generator v2 supports all the existing features of Titan Image Generator v1 and adds several new capabilities. It allows users to leverage reference images to guide image generation, where the output image aligns with the layout and composition of the reference image while still following the textual prompt. It also includes an automatic background removal feature, which can remove backgrounds from images containing multiple objects without any user input. The model provides precise control over the color palette of generated images, allowing users to preserve a brand's visual identity without the requirement for additional fine-tuning. Additionally, the subject consistency feature enables users to fine-tune the model with reference images to preserve the chosen subject (e.g., pet, shoe or handbag) in generated images. This comprehensive suite of features empowers users to unleash their creative potential and bring their imaginative visions to life.

For more information on Amazon Titan Image Generator G1 models prompt engineering guidelines, see Amazon Titan Image Generator Prompt Engineering Best Practices.

To continue supporting best practices in the responsible use of AI, Titan Foundation Models (FMs) are built to detect and remove harmful content in the data, reject inappropriate content in the user input, and filter the models’ outputs that contain inappropriate content (such as hate speech, profanity, and violence). The Titan Image Generator FM adds an invisible watermark and C2PA metadata to all generated images.

You can use the watermark detection feature in the Amazon Bedrock console or call the Amazon Bedrock watermark detection API (preview) to check whether an image contains a watermark from Titan Image Generator. You can also use sites like Content Credentials Verify to check if an image was generated by Titan Image Generator.

Amazon Titan Image Generator v1 overview

  • Model ID – amazon.titan-image-generator-v1

  • Max input characters – 512 char

  • Max input image size – 5 MB (only some specific resolutions are supported)

  • Max image size using in/outpainting – 1,408 x 1,408 px

  • Max image size using image variation – 4,096 x 4,096 px

  • Languages – English

  • Output type – image

  • Supported image types – JPEG, JPG, PNG

  • Inference types – On-Demand, Provisioned Throughput

  • Supported use cases – image generation, image editing, image variations

Amazon Titan Image Generator v2 overview

  • Model ID – amazon.titan-image-generator-v2:0

  • Max input characters – 512 char

  • Max input image size – 5 MB (only some specific resolutions are supported)

  • Max image size using in/outpainting, background removal, image conditioning, color palette – 1,408 x 1,408 px

  • Max image size using image variation – 4,096 x 4,096 px

  • Languages – English

  • Output type – image

  • Supported image types – JPEG, JPG, PNG

  • Inference types – On-Demand, Provisioned Throughput

  • Supported use cases – image generation, image editing, image variations, background removal, color guided content
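The limits listed in the overviews above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the Amazon Bedrock SDK: the function name `check_inputs` and its arguments are hypothetical, and 5 MB is assumed to mean 5 × 1024 × 1024 bytes. It does not check the "specific resolutions" restriction, because the supported resolutions are not listed here.

```python
import os

# Limits taken from the v1/v2 overview lists.
MAX_PROMPT_CHARS = 512
MAX_INPUT_IMAGE_BYTES = 5 * 1024 * 1024  # assumption: 5 MB interpreted as 5 MiB
MAX_EDIT_SIDE_PX = 1408        # in/outpainting and the other v2 editing features
MAX_VARIATION_SIDE_PX = 4096   # image variation
SUPPORTED_EXTENSIONS = {".jpeg", ".jpg", ".png"}


def check_inputs(prompt, image_name, image_size_bytes, width, height, task="edit"):
    """Return a list of problems; an empty list means the inputs pass these checks."""
    problems = []
    if len(prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt has {len(prompt)} characters (max {MAX_PROMPT_CHARS})")

    extension = os.path.splitext(image_name)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported image type: {extension or 'none'}")

    if image_size_bytes > MAX_INPUT_IMAGE_BYTES:
        problems.append("input image is larger than 5 MB")

    # Image variation allows larger inputs than the editing tasks.
    side_limit = MAX_VARIATION_SIDE_PX if task == "variation" else MAX_EDIT_SIDE_PX
    if width > side_limit or height > side_limit:
        problems.append(f"image is {width} x {height} px (max {side_limit} x {side_limit})")
    return problems
```

For example, `check_inputs("a red shoe", "shoe.png", 1000, 1024, 1024)` returns an empty list, while an oversized GIF with a 513-character prompt reports one problem per violated limit.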

Features

  • Text-to-image (T2I) generation – Input a text prompt and generate a new image as output. The generated image captures the concepts described by the text prompt.

  • Fine-tuning of a T2I model – Import several images to capture your own style and personalization, and then fine-tune the core T2I model. The fine-tuned model generates images that follow the style and personalization of a specific user.

  • Image editing options – Include inpainting, outpainting, generating variations, and automatic editing without an image mask.

  • Inpainting – Uses an image and a segmentation mask as input (either from the user or estimated by the model) and reconstructs the region within the mask. Use inpainting to remove masked elements and replace them with background pixels.

  • Outpainting – Uses an image and a segmentation mask as input (either from the user or estimated by the model) and generates new pixels that seamlessly extend the region. Use precise outpainting to preserve the pixels of the masked image when extending the image to the boundaries. Use default outpainting to extend the pixels of the masked image to the image boundaries based on segmentation settings.

  • Image variation – Uses 1 to 5 images and an optional prompt as input. It generates a new image that preserves the content of the input image(s), but varies its style and background.

  • Image conditioning – (V2 only) Uses an input reference image to guide image generation. The model generates an output image that aligns with the layout and the composition of the reference image, while still following the textual prompt.

  • Subject consistency – (V2 only) Subject consistency allows users to fine-tune the model with reference images to preserve the chosen subject (for example, pet, shoe, or handbag) in generated images.

  • Color guided content – (V2 only) You can provide a list of hex color codes along with a prompt. A range of 1 to 10 hex codes can be provided. The image returned by Titan Image Generator G1 V2 will incorporate the color palette provided by the user.

  • Background removal – (V2 only) Automatically identifies multiple objects in the input image and removes the background. The output image has a transparent background.

  • Content provenance – Use sites like Content Credentials Verify to check if an image was generated by Titan Image Generator. This should indicate the image was generated unless the metadata has been removed.
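As an illustration of the text-to-image and color guided content features above, the following sketch builds request bodies and sends them through the `invoke_model` operation of the boto3 `bedrock-runtime` client. The field names (`taskType`, `textToImageParams`, `colorGuidedGenerationParams`, `imageGenerationConfig`) and the `images` response key are assumptions based on the inference parameters reference and are not stated on this page, so verify them there before relying on this. The helper names are hypothetical.

```python
import json
import re

MODEL_ID_V2 = "amazon.titan-image-generator-v2:0"  # from the v2 overview
HEX_COLOR = re.compile(r"^#[0-9a-fA-F]{6}$")


def text_to_image_body(prompt, width=1024, height=1024, number_of_images=1):
    """Build a text-to-image request body (field names are assumptions)."""
    if not prompt or len(prompt) > 512:
        raise ValueError("prompt must contain 1 to 512 characters")
    return {
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {"text": prompt},
        "imageGenerationConfig": {
            "numberOfImages": number_of_images,
            "width": width,
            "height": height,
        },
    }


def color_guided_body(prompt, hex_colors):
    """Build a color guided request body (V2 only); 1 to 10 hex codes are allowed."""
    if not 1 <= len(hex_colors) <= 10:
        raise ValueError("provide between 1 and 10 hex color codes")
    for color in hex_colors:
        if not HEX_COLOR.match(color):
            raise ValueError(f"not a hex color code: {color}")
    return {
        "taskType": "COLOR_GUIDED_GENERATION",
        "colorGuidedGenerationParams": {"text": prompt, "colors": list(hex_colors)},
        "imageGenerationConfig": {"numberOfImages": 1},
    }


def invoke(body, model_id=MODEL_ID_V2, region="us-east-1"):
    """Send the request; needs boto3, AWS credentials, and model access."""
    import boto3  # imported lazily so the builders work without the SDK

    client = boto3.client("bedrock-runtime", region_name=region)
    response = client.invoke_model(modelId=model_id, body=json.dumps(body))
    payload = json.loads(response["body"].read())
    return payload["images"]  # assumed: list of base64-encoded images
```

A call such as `invoke(color_guided_body("a modern sofa", ["#FF9900", "#232F3E"]))` would then return the generated images as base64 strings, which you can decode with `base64.b64decode`.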

Note

If you are using a fine-tuned model, you cannot use the inpainting, outpainting, or color palette features of the API or the model.

Parameters

For information on Amazon Titan Image Generator G1 models inference parameters, see Amazon Titan Image Generator G1 models inference parameters.

Fine-tuning

For more information on fine-tuning the Amazon Titan Image Generator G1 models, see the following pages.

Amazon Titan Image Generator G1 models fine-tuning and pricing

The model uses the following example formula to calculate the total price per job:

Total Price = Steps * Batch size * Price per image seen

Minimum values (auto):

  • Minimum steps (auto) - 500

  • Minimum batch size - 8

  • Default learning rate - 0.00001

  • Price per image seen - 0.005
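As a worked example of the formula above, the following sketch computes the price of a job that uses the minimum values from the list. The function name is hypothetical, and the result is in the same currency unit as the price per image seen, which this page does not state.

```python
def fine_tuning_job_price(steps, batch_size, price_per_image_seen):
    """Total Price = Steps * Batch size * Price per image seen."""
    return steps * batch_size * price_per_image_seen


# Minimum steps (auto) = 500, minimum batch size = 8, price per image seen = 0.005
minimum_job_price = fine_tuning_job_price(500, 8, 0.005)
print(f"{minimum_job_price:.2f}")  # 500 * 8 = 4,000 images seen, times 0.005
```

With these values the model sees 4,000 images in total, so the job price comes to 20. Doubling either the step count or the batch size doubles the price.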

Fine-tuning hyperparameter settings

Steps – The number of times the model is exposed to each batch. There is no default step count set. You must select a number between 10 and 40,000, or the String value "Auto".

Step settings - Auto – Amazon Bedrock determines a reasonable value based on training information. Select this option to prioritize model performance over training cost. The number of steps is determined automatically. This number will typically be between 1,000 and 8,000 based on your dataset. Job costs are impacted by the number of steps used to expose the model to the data. Refer to the pricing examples section of pricing details to understand how job cost is calculated. (See example table above to see how step count is related to number of images when Auto is selected.)

Step settings - Custom – You can enter the number of steps you want Bedrock to expose your custom model to the training data. This value can be between 10 and 40,000. You can reduce the cost per image produced by the model by using a lower step count value.

Batch size – The number of samples processed before model parameters are updated. This value is between 8 and 192 and is a multiple of 8.

Learning rate – The rate at which model parameters are updated after each batch of training data. This is a float value between 0 and 1. The learning rate is set to 0.00001 by default.
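The ranges described above can be checked before you submit a customization job. This is a minimal sketch with a hypothetical function name; it treats the learning rate bounds as inclusive, which is an assumption because the text only says "between 0 and 1".

```python
DEFAULT_LEARNING_RATE = 0.00001


def validate_hyperparameters(steps, batch_size, learning_rate=DEFAULT_LEARNING_RATE):
    """Return a list of problems; an empty list means the values are in range."""
    problems = []

    # Steps: the string "Auto", or a number from 10 to 40,000.
    if isinstance(steps, str):
        if steps.lower() != "auto":
            problems.append('steps must be a number or the string "Auto"')
    elif not 10 <= steps <= 40000:
        problems.append("steps must be between 10 and 40,000")

    # Batch size: 8 to 192, in multiples of 8.
    if not (8 <= batch_size <= 192 and batch_size % 8 == 0):
        problems.append("batch size must be a multiple of 8 between 8 and 192")

    # Learning rate: a float between 0 and 1 (inclusive bounds assumed).
    if not 0 <= learning_rate <= 1:
        problems.append("learning rate must be between 0 and 1")
    return problems
```

For example, `validate_hyperparameters("Auto", 8)` passes, while `validate_hyperparameters(5, 12, 2.0)` reports all three values as out of range.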

For more information on the fine-tuning procedure, see Submit a model customization job.

Output

Amazon Titan Image Generator G1 models use the output image size and quality to determine how an image is priced. Amazon Titan Image Generator G1 models have two pricing segments based on size: one for 512*512 images and another for 1024*1024 images. Pricing is based on image size height*width, less than or equal to 512*512 or greater than 512*512.
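The size rule above can be sketched as a comparison of pixel counts. This assumes that "height*width" means the product of the two dimensions; the function name is hypothetical, and the sketch ignores the quality setting, which also affects the price.

```python
def pricing_segment(width, height):
    """Classify an output image into one of the two size-based pricing segments."""
    if width * height <= 512 * 512:
        return "512*512 or smaller"
    return "larger than 512*512"


print(pricing_segment(512, 512))
print(pricing_segment(1024, 1024))
```

Under this reading, a 768 x 320 image (245,760 pixels) falls in the smaller segment because its pixel count is below 512*512 (262,144 pixels).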

For more information on Amazon Bedrock pricing, see Amazon Bedrock Pricing.

Watermark detection

Note

Watermark detection for the Amazon Bedrock console and API is available in public preview release and will only detect a watermark generated from Titan Image Generator G1. This feature is currently only available in the us-west-2 and us-east-1 regions. Watermark detection identifies the watermark generated by Titan Image Generator G1 with high accuracy. Images that are modified from the original image may produce less accurate detection results.

This model adds an invisible watermark to all generated images to reduce the spread of misinformation, assist with copyright protection, and track content usage. A watermark detection feature checks for the existence of this watermark to help you confirm whether an image was generated by the Titan Image Generator G1 model.

Note

Watermark Detection API is in preview and is subject to change. We recommend that you create a virtual environment to use the SDK. Because watermark detection APIs aren't available in the latest SDKs, we recommend that you uninstall the latest version of the SDK from the virtual environment before installing the version with the watermark detection APIs.

You can upload your image to detect if a watermark from Titan Image Generator G1 is present on the image. Use the console to detect a watermark from this model by following the steps below.

To detect a watermark with Titan Image Generator G1:
  1. Open the Amazon Bedrock console.

  2. Select Overview from the navigation pane in Amazon Bedrock. Choose the Build and Test tab.

  3. In the Safeguards section, go to Watermark detection and choose View watermark detection.

  4. Select Upload image and locate a file that is in JPG or PNG format. The maximum file size allowed is 5 MB.

  5. Once uploaded, a thumbnail of the image is shown with the name, file size, and the last date modified. Select X to delete or replace the image from the Upload section.

  6. Select Analyze to begin watermark detection analysis.

  7. The image is previewed under Results, and indicates if a watermark is detected with Watermark detected below the image and a banner across the image. If no watermark is detected, the text below the image will say Watermark NOT detected.

  8. To load the next image, select X in the thumbnail of the image in the Upload section and choose a new image to analyze.

Prompt Engineering Guidelines

Mask prompt – This algorithm classifies pixels into concepts. The user can give a text prompt that will be used to classify the areas of the image to mask, based on the interpretation of the mask prompt. The prompt option can interpret more complex prompts, and encode the mask into the segmentation algorithm.

Image mask – You can also use an image mask to set the mask values. The image mask can be combined with prompt input for the mask to improve accuracy. The image mask file must conform to the following parameters:

  • Mask image values must be 0 (black) or 255 (white). The image mask area with the value of 0 will be regenerated with the image from the user prompt and/or input image.

  • The maskImage field must be a base64 encoded image string.

  • Mask image must have the same dimensions as the input image (same height and width).

  • Only PNG or JPG files can be used for the input image and the mask image.

  • Mask image must only use black and white pixel values.

  • Mask image can only use the RGB channels (alpha channel not supported).
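The mask requirements above can be sketched with the standard library alone. The helper names are hypothetical, and the mask is represented as rows of grayscale values that you have already decoded from the PNG or JPG file (for example, with an imaging library). Only the base64 step operates on the file bytes themselves.

```python
import base64


def mask_problems(mask_rows, image_width, image_height):
    """Check decoded mask pixels against the input image dimensions and allowed values."""
    problems = []
    same_height = len(mask_rows) == image_height
    same_width = all(len(row) == image_width for row in mask_rows)
    if not (same_height and same_width):
        problems.append("mask must have the same height and width as the input image")
    if any(value not in (0, 255) for row in mask_rows for value in row):
        problems.append("mask pixels must be 0 (black) or 255 (white)")
    return problems


def regenerated_pixel_count(mask_rows):
    """Count the pixels with value 0, which are the ones the model regenerates."""
    return sum(value == 0 for row in mask_rows for value in row)


def encode_mask_file(file_bytes):
    """Return the base64 string expected in the maskImage field."""
    return base64.b64encode(file_bytes).decode("ascii")


# A 2 x 2 mask in which only the top-left pixel is regenerated.
mask = [[0, 255], [255, 255]]
print(mask_problems(mask, 2, 2), regenerated_pixel_count(mask))
```

In practice you would read the mask file in binary mode and pass its bytes to `encode_mask_file` to obtain the `maskImage` value.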

For more information on Amazon Titan Image Generator prompt engineering, see Amazon Titan Image Generator G1 models Prompt Engineering Best Practices.

For general prompt engineering guidelines, see Prompt Engineering Guidelines.