
Use Custom model import to import a customized open-source model into Amazon Bedrock

You can create a custom model in Amazon Bedrock by using the Amazon Bedrock Custom Model Import feature to import Foundation Models that you have customized in other environments, such as Amazon SageMaker AI. For example, you might have a model that you have created in Amazon SageMaker AI that has proprietary model weights. You can now import that model into Amazon Bedrock and then leverage Amazon Bedrock features to make inference calls to the model.

You can use an imported model with on-demand throughput. Use the InvokeModel or InvokeModelWithResponseStream operations to make inference calls to the model. For more information, see Submit a single prompt with InvokeModel.
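
As an illustration of the InvokeModel call described above, the following is a minimal sketch rather than a definitive implementation. It assumes a boto3 bedrock-runtime client is passed in. The body fields (prompt and max_tokens) and the helper names are assumptions, because the request schema that an imported model accepts depends on its architecture.

```python
import json


def build_invoke_kwargs(model_arn: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build keyword arguments for the bedrock-runtime InvokeModel operation.

    The body fields ("prompt", "max_tokens") are assumptions: the accepted
    request schema depends on the architecture of the imported model.
    """
    body = {"prompt": prompt, "max_tokens": max_tokens}
    return {
        "modelId": model_arn,  # ARN of the imported model
        "body": json.dumps(body),
        "contentType": "application/json",
        "accept": "application/json",
    }


def invoke(client, model_arn: str, prompt: str) -> dict:
    """Call InvokeModel on the given client and decode the JSON response."""
    response = client.invoke_model(**build_invoke_kwargs(model_arn, prompt))
    # The response "body" is a stream-like object; read() returns raw bytes.
    return json.loads(response["body"].read())
```

With boto3 installed and credentials configured, a client created with boto3.client("bedrock-runtime") can be passed as the first argument to invoke, together with the ARN of your imported model.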

Amazon Bedrock Custom Model Import is supported in the following Regions. For more information about the Regions that Amazon Bedrock supports, see Amazon Bedrock endpoints and quotas.

US East (N. Virginia)
US East (Ohio)
US West (Oregon)
Europe (Frankfurt)

Note

Make sure that your import and use of the models in Amazon Bedrock complies with the terms or licenses applicable to the models.

You can't use Custom Model Import with the following Amazon Bedrock features.

Batch inference
AWS CloudFormation

With Custom Model Import you can create a custom model that supports the following patterns.

Fine-tuned or Continued Pre-training model — You can customize the model weights using proprietary data, but retain the configuration of the base model.
Adaptation — You can customize the model to your domain for use cases where the model doesn't generalize well. Domain adaptation modifies a model to generalize to a target domain and to handle discrepancies across domains. For example, a company in the financial industry might want a model that generalizes well on pricing. Another example is language adaptation: you could customize a model to generate responses in Portuguese or Tamil. Most often, this involves changes to the vocabulary of the model that you are using.
Pretrained from scratch — In addition to customizing the weights and vocabulary of the model, you can also change model configuration parameters such as the number of attention heads, hidden layers, or context length.

For information regarding pricing for custom model import, select the Custom Model Import tab in the Model pricing details section of Amazon Bedrock pricing.

Supported architectures

The model you import must be in one of the following architectures.

Mistral — A decoder-only Transformer based architecture with Sliding Window Attention (SWA) and options for Grouped Query Attention (GQA). For more information, see Mistral in the Hugging Face documentation.
Mixtral — A decoder-only transformer model with sparse Mixture of Experts (MoE) models. For more information, see Mixtral in the Hugging Face documentation.
Flan — An enhanced version of the T5 architecture, an encoder-decoder based transformer model. For more information, see Flan T5 in the Hugging Face documentation.
Llama 2, Llama 3, Llama 3.1, Llama 3.2, Llama 3.3, and Mllama — An improved version of Llama with Grouped Query Attention (GQA). For more information, see Llama 2, Llama 3, Llama 3.1, Llama 3.2, Llama 3.3, and Mllama in the Hugging Face documentation.
GPTBigCode — An optimized version of GPT-2 with Multi-Query Attention. For more information, see GPTBigCode in the Hugging Face documentation.
Qwen2, Qwen2.5, Qwen2-VL, Qwen2.5-VL, and Qwen3 — An LLM family with comprehensive multimodal perception and high-speed vision encoding. Any model using the Qwen2, Qwen2-VL, and Qwen2.5-VL architectures can be imported. For the Qwen3 architecture, only Qwen3ForCausalLM and Qwen3MoeForCausalLM are supported. In addition, Converse is not supported for Qwen3 models. For more information, see Qwen2, Qwen2.5, Qwen2-VL, Qwen2.5-VL, and Qwen3 in the Hugging Face documentation.

Note

The size of the imported model weights must be less than 100GB for multimodal models and 200GB for text models.
The maximum positional embeddings or the maximum context length supported by the model should be less than 128K.
Amazon Bedrock supports version 4.51.3 of the Hugging Face Transformers library. Make sure that you use version 4.51.3 when you fine-tune your model.
Custom Model Import does not support embedding models.
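
The size and context-length limits in this note can be checked locally before you upload a model. The following is a rough sketch under stated assumptions, not an official validator: it assumes a local Hugging Face model directory that contains config.json and .safetensors weight files, reads "128K" as 128 * 1024 tokens, and reads "GB" as 10**9 bytes. The function name check_limits is hypothetical. The max_position_embeddings key is assumed to be at the top level of config.json, which might not hold for every multimodal model.

```python
import json
from pathlib import Path

# Assumptions: "128K" is read as 128 * 1024 tokens and "GB" as 10**9 bytes.
MAX_CONTEXT = 128 * 1024
MAX_BYTES = {"text": 200 * 10**9, "multimodal": 100 * 10**9}


def check_limits(model_dir: str, kind: str = "text") -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    root = Path(model_dir)
    problems = []

    # Sum the size of every weight file in the directory.
    total = sum(p.stat().st_size for p in root.glob("*.safetensors"))
    if total >= MAX_BYTES[kind]:
        problems.append(f"weights are {total} bytes; limit is {MAX_BYTES[kind]}")

    config = json.loads((root / "config.json").read_text())
    max_pos = config.get("max_position_embeddings")
    # The note says "less than 128K". Whether exactly 128K is accepted is not
    # stated, so this sketch only flags values above that number.
    if max_pos is not None and max_pos > MAX_CONTEXT:
        problems.append(f"max_position_embeddings is {max_pos}; limit is {MAX_CONTEXT}")

    return problems
```

Pass kind="multimodal" to apply the 100 GB limit instead of the 200 GB limit.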

Import a model source from Amazon S3

You import a model into Amazon Bedrock by creating a model import job in the Amazon Bedrock console or API. In the job, you specify the Amazon S3 URI for the source of the model files. During the import, the job automatically detects your model's architecture.
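
When you use the API, creating the job might look like the following sketch. It assumes a boto3 bedrock client is passed in and uses the CreateModelImportJob and GetModelImportJob operations. The helper names, the polling interval, and the idea of waiting in a loop are illustrative assumptions, not a prescribed workflow.

```python
import time


def build_import_job_kwargs(job_name: str, model_name: str, role_arn: str, s3_uri: str) -> dict:
    """Build keyword arguments for the CreateModelImportJob operation."""
    if not s3_uri.startswith("s3://"):
        raise ValueError(f"expected an Amazon S3 URI, got {s3_uri!r}")
    return {
        "jobName": job_name,
        "importedModelName": model_name,
        "roleArn": role_arn,  # IAM role that Amazon Bedrock assumes to read the files
        "modelDataSource": {"s3DataSource": {"s3Uri": s3_uri}},
    }


def start_import(client, job_name: str, model_name: str, role_arn: str, s3_uri: str) -> str:
    """Create the import job and return its ARN."""
    kwargs = build_import_job_kwargs(job_name, model_name, role_arn, s3_uri)
    return client.create_model_import_job(**kwargs)["jobArn"]


def wait_for_import(client, job_arn: str, poll_seconds: float = 60.0, sleep=time.sleep) -> str:
    """Poll the job until it leaves the InProgress state; return the final status."""
    while True:
        status = client.get_model_import_job(jobIdentifier=job_arn)["status"]
        if status != "InProgress":
            return status
        sleep(poll_seconds)
```

With boto3 installed, a client created with boto3.client("bedrock") can be passed to both functions. The sleep parameter exists only so that the polling loop can be exercised without real delays.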

You need to supply the model files in the Hugging Face weights format. You can create the files by using the Hugging Face Transformers library. To create model files for a Llama model, see convert_llama_weights_to_hf.py. To create the files for a Mistral AI model, see convert_mistral_weights_to_hf.py.

To import the model from Amazon S3, you need at least the following files, which the Hugging Face Transformers library creates.

.safetensors — The model weights in Safetensors format. Safetensors is a format created by Hugging Face that stores a model's weights as tensors. You must store the tensors for your model in a file with the extension .safetensors. For more information, see Safetensors. For information about converting model weights to Safetensors format, see Convert weights to safetensors.
Note
- Currently, Amazon Bedrock supports only model weights with FP32, FP16, or BF16 precision. Amazon Bedrock rejects model weights that you supply with any other precision. Internally, Amazon Bedrock converts FP32 models to BF16 precision.
- Amazon Bedrock doesn't support the import of quantized models.
config.json — For examples, see LlamaConfig and MistralConfig.
Note
For Llama 3 models, Amazon Bedrock overrides the rope_scaling setting with the following values:
- original_max_position_embeddings=8192
- high_freq_factor=4
- low_freq_factor=1
- factor=8
tokenizer_config.json — For an example, see LlamaTokenizer.
tokenizer.json
tokenizer.model
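
The file list and precision notes above can be turned into a quick local check before you upload to Amazon S3. This is a minimal sketch that assumes every listed file is required and that config.json uses the torch_dtype and quantization_config keys written by the Hugging Face Transformers library. The function name check_model_dir is hypothetical. The check inspects only what config.json declares, not the tensors themselves.

```python
import json
from pathlib import Path

# Assumption: every file in the list above is treated as required.
REQUIRED_FILES = ("config.json", "tokenizer_config.json", "tokenizer.json", "tokenizer.model")
# config.json names for the FP32, FP16, and BF16 precisions.
SUPPORTED_DTYPES = {"float32", "float16", "bfloat16"}


def check_model_dir(model_dir: str) -> list[str]:
    """Return a list of problems found; an empty list means none were found."""
    root = Path(model_dir)
    problems = [f"missing {name}" for name in REQUIRED_FILES if not (root / name).is_file()]

    if not any(root.glob("*.safetensors")):
        problems.append("no .safetensors weight file found")

    config_path = root / "config.json"
    if config_path.is_file():
        config = json.loads(config_path.read_text())
        dtype = config.get("torch_dtype")
        if dtype is not None and dtype not in SUPPORTED_DTYPES:
            problems.append(f"unsupported precision declared in config.json: {dtype}")
        if "quantization_config" in config:
            problems.append("config.json declares quantization_config; quantized models are not supported")

    return problems
```

An empty result means only that the expected files are present and that the declared precision is one of the supported ones. The import job remains the final authority on whether the model is accepted.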

Supported tokenizers

Amazon Bedrock Custom Model Import supports the following tokenizers. You can use these tokenizers with any model.

T5Tokenizer
T5TokenizerFast
LlamaTokenizer
LlamaTokenizerFast
CodeLlamaTokenizer
CodeLlamaTokenizerFast
GPT2Tokenizer
GPT2TokenizerFast
GPTNeoXTokenizer
GPTNeoXTokenizerFast
PreTrainedTokenizer
PreTrainedTokenizerFast
Qwen2Tokenizer
Qwen2TokenizerFast
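
To check a model against this list, you can compare the tokenizer class recorded in tokenizer_config.json. The following sketch assumes the file contains the tokenizer_class key that the Hugging Face Transformers library writes when it saves a tokenizer. The function name tokenizer_is_supported is hypothetical.

```python
import json
from pathlib import Path

# Tokenizer class names copied from the list above.
SUPPORTED_TOKENIZERS = frozenset({
    "T5Tokenizer", "T5TokenizerFast",
    "LlamaTokenizer", "LlamaTokenizerFast",
    "CodeLlamaTokenizer", "CodeLlamaTokenizerFast",
    "GPT2Tokenizer", "GPT2TokenizerFast",
    "GPTNeoXTokenizer", "GPTNeoXTokenizerFast",
    "PreTrainedTokenizer", "PreTrainedTokenizerFast",
    "Qwen2Tokenizer", "Qwen2TokenizerFast",
})


def tokenizer_is_supported(model_dir: str) -> bool:
    """Return True if tokenizer_config.json names a tokenizer class from the list."""
    config = json.loads((Path(model_dir) / "tokenizer_config.json").read_text())
    # A missing tokenizer_class key yields None, which is not in the set.
    return config.get("tokenizer_class") in SUPPORTED_TOKENIZERS
```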
