Cost
With this AWS Solution, you pay only for the resources you use; there are no minimum fees or setup charges. You pay for the Deployment dashboard used to launch generative AI use cases and for any use cases that you deploy. The cost of a deployed use case depends on its configuration. Example configurations:
- A simple Deployment dashboard, which costs approximately $20 USD per month.
- A simple production-ready chatbot use case deployed with default settings running in US East (N. Virginia), powered by Amazon Bedrock without access to documents, which also costs around $20 USD per month.
- A scaled system in an Amazon VPC use case that supports 8,000 queries per day over tens of thousands of documents, which costs around $1,400 USD per month.
The cost of a use case will vary depending on its configuration, such as Text use cases with different model providers, with or without Retrieval Augmented Generation (RAG) enabled, and so on.
Workload description | Estimated cost (USD/month) |
---|---|
Sample costs for running the Deployment dashboard | $20/month |
Sample costs for a text-based proof of concept (includes Deployment dashboard and 1 Text use case, ~100 interactions per day) | $40/month |
Sample costs for a highly scalable generative AI query engine (includes Deployment dashboard, 1 Text use case, and an Amazon Kendra index for RAG up to 100K documents with ~8K queries per day, with VPC enabled) | $1,400/month |
Sample costs for an agent-based proof of concept (includes Deployment dashboard, 1 Agent use case with Amazon Bedrock Knowledge Bases and Amazon Bedrock Guardrails enabled, ~100 interactions per day) | $840/month |
Important
These examples are only intended to help you estimate the costs for your specific workloads. The use of different LLMs, configurations, or AWS services can change your costs (for example, serverless/on-demand billing vs. provisioned/time-billed). To manage costs, we recommend creating a budget through AWS Cost Explorer.
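As a rough sketch of what creating such a budget could look like in code, the Python below builds a request for the AWS Budgets `CreateBudget` API as exposed by boto3. Using AWS Budgets and boto3 is an assumption here (the text above only mentions AWS Cost Explorer), and the account ID, budget name, limit, and email address are hypothetical placeholders.

```python
def build_budget_request(account_id, monthly_limit_usd, email, threshold_pct=80.0):
    """Build keyword arguments for the AWS Budgets CreateBudget API."""
    return {
        "AccountId": account_id,
        "Budget": {
            "BudgetName": "genai-solution-monthly",  # hypothetical name
            "BudgetLimit": {"Amount": f"{monthly_limit_usd:.2f}", "Unit": "USD"},
            "TimeUnit": "MONTHLY",
            "BudgetType": "COST",
        },
        "NotificationsWithSubscribers": [
            {
                # Notify by email once actual spend passes the threshold percentage.
                "Notification": {
                    "NotificationType": "ACTUAL",
                    "ComparisonOperator": "GREATER_THAN",
                    "Threshold": threshold_pct,
                    "ThresholdType": "PERCENTAGE",
                },
                "Subscribers": [{"SubscriptionType": "EMAIL", "Address": email}],
            }
        ],
    }


def create_budget(request):
    """Send the request; needs boto3 and AWS credentials, so it is not called here."""
    import boto3  # imported lazily so the builder above runs without the AWS SDK

    boto3.client("budgets").create_budget(**request)


# Placeholder values: a $1,400 monthly limit matching the largest example above.
request = build_budget_request("123456789012", 1400, "ops@example.com")
print(request["Budget"]["BudgetLimit"])
```

Only `build_budget_request` runs in this sketch; calling `create_budget(request)` would create a real budget in the account whose credentials are active.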
Sample costs for running the Deployment dashboard
The following table provides the cost breakdown for a Deployment dashboard with default parameters and 100 active users in the US East (N. Virginia) Region for one month, which will cost about $20/month.
AWS service | Dimensions | Cost [USD] |
---|---|---|
API Gateway, DynamoDB, CloudFront, Amazon S3, Lambda, Systems Manager Parameter Store | 5,000 512 KB REST API calls per month without caching enabled | $1.97 |
Amazon Cognito | 100 active users per month with advanced security features enabled and no users signing in through SAML or OIDC federation | $5.55 |
AWS WAF | 10,000 web requests across 1 web ACL and 7 defined rules without any rule groups | $12.60 |
Total Deployment dashboard cost | | $20.12 |
Sample costs for a text-based proof of concept
A Deployment dashboard can have many use cases deployed at a given time. The following table shows the cost breakdown of a use case deployed without RAG for 1 business user performing 100 queries per day with the LLM. Queries are sent as a text message on the WebSocket and the response is streamed back as tokens with the assumption that streaming is enabled. With an Amazon Bedrock Titan Text Express model, the cost of running this use case is about $15/month.
AWS service | Dimensions | Cost [USD] |
---|---|---|
API Gateway (WebSocket), CloudFront, Lambda, Amazon S3, AWS Systems Manager Parameter Store | 100 chat interactions per day. Average message size 32 KB per message and 5 minutes per connection. | $0.61 |
CloudWatch | 1.5 GB CloudWatch logs with verbose mode on for experimentation | $7.23 |
Amazon DynamoDB | Conversation history table, 1 GB storage; LLM configuration table, 1 GB storage | $3.05 |
Subtotal of the use case costs (not including LLMs) | | $10.89 |
Amazon Bedrock (Titan Text Express) | Assumptions for 100 interactions per day | $1.50 |
Total application cost with Amazon Bedrock (Titan Text Express) | $10.89 (use case cost) + $1.50 (Amazon Bedrock cost) | $12.39 |
Note
The costs of inference calls made to services outside the AWS network are not included in these estimates. Refer to the pricing guide of your LLM provider if you’re not using an AWS model provider.
Pricing guides for AWS services can be found at:
Amazon Bedrock pricing
Sample costs for a highly scalable generative AI query engine
The following table provides the cost breakdown of a RAG-enabled use case with an Amazon Kendra index supporting 8,000 interactions per day. With Amazon Bedrock’s Titan Text Express model as the LLM, this use case costs about $1,200/month.
AWS service | Dimensions | Cost [USD] |
---|---|---|
API Gateway (WebSocket) | 8,000 chat interactions per day. Average message size 32 KB per message and 5 minutes per connection. | $38.89 |
CloudFront | 240,000 requests per month with 100 GB data transferred out to the internet and 1 GB data transferred out to the origin | $8.76 |
Amazon Bedrock (Titan Text Express) | Assumptions: input tokens = promptTemplate (400) + context (400) + chatHistory (1,080) + query input tokens (20) = 1,900; output tokens = 160 (average). With 8,000 transactions a day: daily input token cost = 1,900 × 8,000 = 15,200,000 tokens × $0.0002 per 1,000 tokens = $3.04; daily output token cost = 160 × 8,000 = 1,280,000 tokens × $0.0006 per 1,000 tokens = $0.77. Monthly cost = ($3.04 + $0.77) × 30 = $114.30 | $114.30 |
CloudWatch | 24 metrics using 5 GB data ingested for logs and 1 dashboard | $9.72 |
DynamoDB | DynamoDB table to keep track of conversation history with each record up to 1 KB data, 8,000 reads and writes per day | $11.70 |
Lambda | Container size 128 MB, 512 MB ephemeral storage, 2 Lambda functions used for authorization; container size 256 MB, 512 MB ephemeral storage, 5 requests per second with 20 seconds average compute time | $20.89 |
Total use case cost | | $204.26/month + knowledge base cost (see below) |
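The Amazon Bedrock line item in this table is plain arithmetic, which can be sketched in Python as follows. The function name and its parameters are illustrative and not part of the solution; the per-1,000-token prices are the ones quoted in the table and may have changed since.

```python
def monthly_token_cost(input_tokens, output_tokens, requests_per_day,
                       input_price_per_1k, output_price_per_1k, days=30):
    """Estimate monthly on-demand LLM cost from per-request token counts."""
    daily_input = input_tokens * requests_per_day * input_price_per_1k / 1000
    daily_output = output_tokens * requests_per_day * output_price_per_1k / 1000
    # The table rounds each daily figure to cents before multiplying by 30 days.
    return (round(daily_input, 2) + round(daily_output, 2)) * days


# Token budget from the table: prompt template + context + chat history + query.
input_tokens = 400 + 400 + 1080 + 20  # 1,900 input tokens per request
cost = monthly_token_cost(
    input_tokens=input_tokens,
    output_tokens=160,
    requests_per_day=8000,
    input_price_per_1k=0.0002,   # Titan Text Express input price quoted above
    output_price_per_1k=0.0006,  # Titan Text Express output price quoted above
)
print(f"Estimated monthly Amazon Bedrock cost: ${cost:.2f}")
```

Swapping in another model's per-1,000-token prices, or a different chat history length, shows how sensitive the estimate is to prompt size compared with query volume.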
Note
The costs of API calls made to any services outside of the AWS network are not included in these estimates. See the pricing guide of your LLM provider if not using Amazon Bedrock.
Costs for adding a knowledge base
Knowledge base costs will vary based on the type of knowledge base used, and (in the case of Bedrock) the backing vector store used by the knowledge base. Provisioning and managing the knowledge bases is outside of the scope of the solution.
Amazon Kendra
The solution can provision a Kendra index for you, or you can bring your own. The cost for running a configuration suited to the above highly scalable generative AI query engine is as follows:
AWS service | Dimensions | Cost [USD] |
---|---|---|
Amazon Kendra | 0-8,000 queries a day and up to 100,000 documents with Amazon Kendra Enterprise Edition with 0-50 data sources | $1,008.00 |
Note
You can share the Amazon Kendra index between use cases, but this can drive up the number of queries per index. If the query volume exceeds what Amazon Kendra Enterprise Edition includes, additional charges will apply.
Amazon Bedrock Knowledge Bases
The solution does not manage or provision any resources related to Amazon Bedrock Knowledge Bases. Amazon Bedrock does not charge for the knowledge base feature itself; however, you are charged for the embedding model that your use case invokes on each query. Additionally, the backing vector store for your knowledge base (for example, an index in Amazon OpenSearch Service) incurs its own cost.
For the above highly scalable generative AI query engine scenario, the costs incurred by this service for calling the Amazon Bedrock embeddings model are as follows:
AWS service | Dimensions | Cost [USD] |
---|---|---|
Amazon Bedrock (Amazon Titan Text Embeddings) | 8,000 queries a day with 1,900 input tokens per query = 15,200,000 tokens = $0.30 USD per day. Daily cost × 30 days = $9.00 USD monthly cost | $9.00 |
Amazon OpenSearch Service (Serverless) sample usage | Basic serverless configuration with 4 × OpenSearch Compute Unit (OCU) (billable minimum) = $23.04 USD per day. Daily cost × 30 days = $691.20 USD. Note: This provides a rough estimate, as some workloads will require more OCUs, while customers with existing provisioned OpenSearch resources will incur less cost here. | $691.20 |
Total additional cost | | $700.20 |
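A minimal Python sketch of the two line items above follows. The unit prices are not stated in the table; $0.00002 per 1,000 embedding tokens and $0.24 per OCU-hour are assumptions inferred from the daily figures ($0.30 and $23.04), so check current pricing before relying on them.

```python
HOURS_PER_DAY = 24
DAYS_PER_MONTH = 30  # the tables use a 30-day month


def embedding_daily_cost(queries_per_day, tokens_per_query, price_per_1k_tokens):
    """Daily embedding-model cost for embedding every query."""
    return queries_per_day * tokens_per_query * price_per_1k_tokens / 1000


def ocu_daily_cost(ocu_count, price_per_ocu_hour):
    """Daily OpenSearch Serverless cost for a fixed number of always-on OCUs."""
    return ocu_count * price_per_ocu_hour * HOURS_PER_DAY


# Assumed unit prices, inferred from the table rather than quoted in it.
embedding_daily = embedding_daily_cost(8000, 1900, price_per_1k_tokens=0.00002)
ocu_daily = ocu_daily_cost(4, price_per_ocu_hour=0.24)

# The table rounds the daily embedding cost to cents before scaling to a month.
embedding_monthly = round(embedding_daily, 2) * DAYS_PER_MONTH
ocu_monthly = ocu_daily * DAYS_PER_MONTH
total_monthly = embedding_monthly + ocu_monthly

print(f"Embeddings: ${embedding_monthly:.2f}")
print(f"OpenSearch Serverless: ${ocu_monthly:.2f}")
print(f"Total additional cost: ${total_monthly:.2f}")
```

The vector store dominates the total, so the OCU count is the parameter worth revisiting first for a given workload.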
Incremental cost of enabling Amazon VPC for a use case
The following table provides the cost breakdown of enabling Amazon VPC for a use case deployed in two AZs.
AWS service | Dimensions | Cost [USD] |
---|---|---|
Amazon NAT Gateway | Assumption: 2 AZ deployment, with a NAT Gateway in each AZ. 730 hours, with 100 GB of data processed through NAT Gateway per month | $74.70 |
AWS PrivateLink (VPC Endpoints) | Assumptions: 2 AZ deployment, with 1 private subnet in each AZ and 1 VPC endpoint with 2 elastic network interfaces (ENIs). 6 VPC endpoints, 2 ENIs per VPC endpoint, 730 hours with 1,024 GB data processed in a month | $97.84 |
Public IPv4 address | Assumption: 2 AZ deployment, 1 public subnet in each AZ with a NAT Gateway in each public subnet. Each NAT Gateway configured with 1 active public IPv4 address. 2 active public IPv4 addresses × 730 hours in a month × $0.005 hourly charge = $7.30 USD | $7.30 |
Additional cost (for Amazon VPC) | | $179.84 |
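The three line items in this table can be approximated with the sketch below. Only the $0.005 hourly charge for a public IPv4 address is quoted in the table; the NAT Gateway rates ($0.045 per hour and per GB) and the VPC endpoint rates ($0.01 per ENI-hour and per GB) are assumptions chosen because they reproduce the figures shown, as is the reading that each of the two NAT Gateways processes 100 GB.

```python
HOURS_PER_MONTH = 730  # monthly hours used in the table


def nat_gateway_cost(gateways, gb_per_gateway, hourly=0.045, per_gb=0.045):
    """Hourly charge plus data processing charge for each NAT Gateway (assumed rates)."""
    return gateways * (HOURS_PER_MONTH * hourly + gb_per_gateway * per_gb)


def vpc_endpoint_cost(endpoints, enis_per_endpoint, gb_processed,
                      hourly=0.01, per_gb=0.01):
    """Hourly charge per endpoint ENI plus data processing (assumed rates)."""
    eni_hours = endpoints * enis_per_endpoint * HOURS_PER_MONTH
    return eni_hours * hourly + gb_processed * per_gb


def public_ipv4_cost(addresses, hourly=0.005):
    """Hourly charge per active public IPv4 address (rate quoted in the table)."""
    return addresses * HOURS_PER_MONTH * hourly


line_items = {
    "NAT Gateway": nat_gateway_cost(gateways=2, gb_per_gateway=100),
    "VPC endpoints": vpc_endpoint_cost(endpoints=6, enis_per_endpoint=2, gb_processed=1024),
    "Public IPv4": public_ipv4_cost(addresses=2),
}
for name, cost in line_items.items():
    print(f"{name}: ${cost:.2f}")
```

Most of the cost is hourly rather than traffic-based, so it scales with the number of Availability Zones and endpoints more than with query volume.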
Cost implications when using Provisioned Throughput
Provisioned Throughput costs vary based on the model you provision, your commitment period, and the number of model units selected for that period. There is an additional cost associated with using Provisioned Throughput. As an example, when using Anthropic Claude Instant, Claude 2.x, or Amazon Titan Text Express models, your hourly prices would look like the following:
Model | Price per hour per model unit with no commitment | Price per hour per model unit for 1-month commitment | Price per hour per model unit for 6-month commitment |
---|---|---|---|
Claude Instant | $44.00 | $39.60 | $22.00 |
Claude 2.0/2.1 | $70.00 | $63.00 | $35.00 |
Amazon Titan Text Express | $20.50 | $18.40 | $14.80 |
For more information and the most up-to-date pricing, refer to Amazon Bedrock pricing.
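To turn the hourly prices above into a monthly figure, a small sketch like the following may help. It assumes a 730-hour month (the same convention used in the Amazon VPC table) and that cost scales linearly with the number of model units; the dictionary keys and function name are illustrative.

```python
HOURS_PER_MONTH = 730  # assumption: same monthly-hours convention as the VPC table

# Hourly price per model unit, copied from the table above.
HOURLY_PRICE = {
    "Claude Instant": {"none": 44.00, "1-month": 39.60, "6-month": 22.00},
    "Claude 2.0/2.1": {"none": 70.00, "1-month": 63.00, "6-month": 35.00},
    "Amazon Titan Text Express": {"none": 20.50, "1-month": 18.40, "6-month": 14.80},
}


def monthly_provisioned_cost(model, commitment, model_units=1):
    """Monthly Provisioned Throughput cost for a model and commitment term."""
    return HOURLY_PRICE[model][commitment] * model_units * HOURS_PER_MONTH


for model, prices in HOURLY_PRICE.items():
    for commitment in prices:
        cost = monthly_provisioned_cost(model, commitment)
        print(f"{model} ({commitment} commitment): ${cost:,.2f}/month")
```

Even one model unit with a 6-month commitment costs far more per month than the on-demand examples in this section, so Provisioned Throughput mainly makes sense for sustained, high-volume workloads.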
Cost for using cross-region inference
There is no additional cost for routing or data transfer for using cross-region inference. You pay the same price per token for models as in your source or primary Region.
Sample costs for an agent-based proof of concept
When you use Amazon Bedrock Agents, you’re charged based on the components comprising the agent, such as the backing model and knowledge base (if RAG is enabled), along with additional capabilities that you add. The following table shows the cost breakdown of an Agent use case configured with an on-demand Claude 3.5 Sonnet model, Amazon Bedrock Knowledge Bases, and Amazon Bedrock Guardrails.
Similar to Amazon Bedrock Knowledge Bases, this solution doesn’t manage or provision resources related to Amazon Bedrock Agents. Amazon Bedrock doesn’t charge for the Knowledge Bases feature itself, but you are charged for:
- Using the embedding model for each query that is sent to it
- The backing vector store for your knowledge base (for example, an index in Amazon OpenSearch Service, or a database inside Amazon RDS)
The following table assumes 100 interactions per day with 1,900 input tokens and 160 output tokens per query.
Note
For this sample Agent use case, if there were an action group configured to use an external API, those costs would be additional. They are outside the scope of the calculations in this table.
AWS service | Dimensions | Cost [USD] |
---|---|---|
API Gateway (WebSocket), CloudFront, Lambda, Amazon S3, Systems Manager Parameter Store | 100 chat interactions per day, average message size 32 KB per message, 5 minutes per connection | $0.61 |
CloudWatch | 1.5 GB CloudWatch Logs with verbose mode on for experimentation | $7.23 |
DynamoDB | LLM configuration table for 1 KB record size and 1 GB storage | $0.25 |
Subtotal of costs (not including LLMs) | | $8.09 |
Anthropic Claude 3.5 Sonnet | | $24.30 |
Amazon Bedrock (Amazon Titan Text Embeddings v2) for Amazon Bedrock Knowledge Bases | 190K input tokens per day × $0.00002 per 1,000 tokens = $0.004 per day. Daily cost × 30 days = $0.12 | $0.12 |
Amazon OpenSearch Service (Serverless) sample usage | Basic serverless configuration with 4 × OpenSearch Compute Unit (OCU) (billable minimum) = $23.04 per day. Daily cost × 30 days = $691.20 | $691.20 |
Amazon Bedrock Guardrails | 190K tokens is roughly equivalent to 760K characters (190,000 × 4) and 3,800 text units (760K characters / 200). Consider a guardrail configured with content filters, a personally identifiable information (PII) filter, a sensitive information filter (regular expression), and word filters. Daily cost = content filter cost ($0.75 per 1,000 text units) + PII filter cost ($0.10 per 1,000 text units) + sensitive information filter (regex) + word filters = $2.85 + $0.38 + $0 + $0 = $3.23. Monthly cost = daily cost × 30 days = $96.90 | $96.90 |
Total application cost for an agent backed by Anthropic Claude 3.5 Sonnet | $8.09 (use case cost) + $812.52 (other agent configurations) | $820.61 |
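The Guardrails line item and the overall total can be reproduced with a short Python sketch. The conversion factors (4 characters per token, 200 characters per text unit) and the filter prices are taken from the table as written; the function and variable names are illustrative, and the other line items are entered as the fixed values shown in the table.

```python
CHARS_PER_TOKEN = 4        # conversion used in the table
CHARS_PER_TEXT_UNIT = 200  # conversion used in the table


def guardrails_monthly_cost(tokens_per_day, filter_prices_per_1k_units, days=30):
    """Monthly Guardrails cost: every enabled filter is billed per text unit."""
    text_units = tokens_per_day * CHARS_PER_TOKEN / CHARS_PER_TEXT_UNIT
    daily = sum(text_units * price / 1000
                for price in filter_prices_per_1k_units.values())
    return daily * days


# Prices per 1,000 text units, as listed in the table.
filters = {"content": 0.75, "pii": 0.10, "regex": 0.0, "word": 0.0}

tokens_per_day = 100 * 1900  # 100 interactions x 1,900 input tokens
guardrails = guardrails_monthly_cost(tokens_per_day, filters)

# Remaining line items, copied from the table.
use_case_subtotal = 8.09
claude_sonnet = 24.30
embeddings = 0.12
opensearch_serverless = 691.20

agent_total = (use_case_subtotal + claude_sonnet + embeddings
               + opensearch_serverless + guardrails)
print(f"Guardrails: ${guardrails:.2f}")
print(f"Total: ${agent_total:.2f}")
```

At this interaction volume the OpenSearch Serverless minimum accounts for most of the total, with the model and Guardrails charges scaling with usage.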
Note
Refer to the pricing guide of your LLM provider if you’re not using an AWS model provider.
Pricing guides for AWS services can be found at:
Amazon Bedrock pricing