Cost

With this AWS Solution, you pay only for the resources you use; there are no minimum fees or setup charges. You pay for the Deployment dashboard used to launch generative AI use cases and for any use cases that are deployed. The cost of a deployed use case depends on its configuration. Example configurations:

A simple Deployment dashboard which costs approximately $20 USD per month.
A simple production-ready chatbot use case deployed with default settings running in US East (N. Virginia), powered by Amazon Bedrock without access to documents, which also costs around $200 USD per month.
A scaled system in an Amazon VPC use case that supports 8,000 queries per day over tens of thousands of documents, which costs around $1,400 USD per month. The cost of the use case will vary depending on the configuration, such as Text use cases with different model providers, with or without Retrieval Augmented Generation (RAG) enabled, and so on.

Workload description	Estimated cost (USD/month)
Sample cost for Deployment dashboard	$20/month
Sample costs for a text-based proof of concept (includes Deployment dashboard and 1 Text use case, ~100 interactions per day)	$40/month
Sample costs for a highly scalable generative AI query engine (includes Deployment dashboard, 1 Text use case, and an Amazon Kendra index for RAG over up to 100K documents with ~8K queries per day, with VPC enabled)	$1,400/month
Sample costs for an agent-based proof of concept (Includes Deployment dashboard, 1 Agent use case with Amazon Bedrock Knowledge Bases and Amazon Bedrock Guardrails enabled, ~100 interactions per day)	$840/month

Important

These examples are intended only to help you estimate the costs for your specific workloads. Using different LLMs, configurations, or AWS services can change your costs (for example, serverless/on-demand billing vs. provisioned/time-billed). To manage costs, we recommend creating a budget through AWS Cost Explorer. Prices are subject to change. For full details, refer to the pricing webpage for each AWS service used in this solution.

Sample costs for running the Deployment dashboard

The following table provides the cost breakdown for a Deployment dashboard with default parameters and 100 active users in the US East (N. Virginia) Region for one month, which will cost about $20/month.

AWS service	Dimensions	Cost [USD]
API Gateway, DynamoDB, CloudFront, Amazon S3, Lambda, Systems Manager Parameter Store	5,000 512 KB REST API calls per month without caching enabled	$1.97
Amazon Cognito	100 active users per month with advanced security features enabled and no users signing in through SAML or OIDC federation	$5.55
AWS WAF	10,000 web requests across 1 web ACL and 7 defined rules without any rule groups	$12.60
Total Deployment dashboard cost		$20.12

Sample costs for a text-based proof of concept

A Deployment dashboard can have many use cases deployed at a given time. The following table shows the cost breakdown of a use case deployed without RAG for 1 business user performing 100 queries per day with the LLM. Queries are sent as text messages over the WebSocket, and the response is streamed back as tokens, assuming that streaming is enabled. With the Titan Text Express model on Amazon Bedrock, running this use case costs about $12/month.

AWS service	Dimensions	Cost [USD]
API Gateway (WebSocket), CloudFront, Lambda, Amazon S3, AWS Systems Manager Parameter Store	100 chat interactions per day. Average message size 32 KB per message and 5 minutes per connection.	$0.61
CloudWatch	1.5 GB CloudWatch logs with verbose mode on for experimentation	$7.23
Amazon DynamoDB	Conversation history table, 1 GB storage; LLM configuration table, 1 GB storage	$3.05
Subtotal of the use case costs (not including LLMs)		$10.89
Amazon Bedrock (Titan Text Express)	Assumptions for 100 interactions per day: monthly cost for 190K input tokens per day = $0.04 × 30; monthly cost for 16K output tokens per day = $0.01 × 30	$1.50
Total application cost with Amazon Bedrock (Titan Text Express)	$10.89 (Use Case cost) + $1.50 (Amazon Bedrock cost)	$12.39

Note

The costs of inference calls made to services outside the AWS network are not included in these estimates. Refer to the pricing guide of your LLM provider if you’re not using an AWS model provider.

Pricing guides for AWS services can be found at: Amazon Bedrock pricing and Amazon SageMaker AI pricing.

Sample costs for a highly scalable generative AI query engine

The following table provides the cost breakdown of a RAG-enabled use case with an Amazon Kendra index supporting 8,000 interactions per day. With the Titan Text Express model on Amazon Bedrock as the LLM, this use case costs about $1,200/month, including the Amazon Kendra index.

AWS service	Dimensions	Cost [USD]
API Gateway (WebSocket)	8,000 chat interactions per day. Average message size 32 KB per message and 5 minutes per connection.	$38.89
CloudFront	240,000 requests per month with 100 GB data transferred out to the internet and 1 GB data transferred out to the origin	$8.76
Amazon Bedrock (Titan Text Express)	Assumptions: input tokens per query = promptTemplate (400) + context (400) + chatHistory (1,080) + query (20) = 1,900; output tokens per query = 160 (average). With 8,000 transactions per day: daily input token cost = 1,900 × 8,000 = 15,200,000 tokens × $0.0002 per 1,000 tokens = $3.04; daily output token cost = 160 × 8,000 = 1,280,000 tokens × $0.0006 per 1,000 tokens = $0.77. Monthly cost = ($3.04 + $0.77) × 30	$114.30
CloudWatch	24 metrics using 5 GB data ingested for logs and 1 dashboard	$9.72
DynamoDB	DynamoDB table to keep track of conversation history with each record up to 1 KB data, 8,000 read and writes per day	$11.70
Lambda	Container size 128 MB, 512 MB ephemeral storage, 2 Lambda functions used for authorization; container size 256 MB, 512 MB ephemeral storage, 5 requests per second with 20 seconds average compute time	$20.89
Total use case cost		$204.26/month + knowledge base cost (see below)
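
The Amazon Bedrock row above can be reproduced with a short calculation. The following is a minimal sketch, not part of the solution: the token counts and per-1,000-token prices are copied from the table, the function name `daily_token_cost` is hypothetical, and rounding each daily figure to the cent is an assumption made to match the table's intermediate values. Prices may have changed since this page was written.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def daily_token_cost(tokens_per_query: int, queries_per_day: int, price_per_1k: str) -> Decimal:
    """Daily cost for one token stream, rounded to the cent (assumed, to match the table)."""
    tokens = Decimal(tokens_per_query) * queries_per_day
    return (tokens * Decimal(price_per_1k) / 1000).quantize(CENT, rounding=ROUND_HALF_UP)


# Figures copied from the Amazon Bedrock (Titan Text Express) row.
input_tokens = 400 + 400 + 1080 + 20   # prompt template + context + chat history + query
output_tokens = 160                    # average per response
queries_per_day = 8_000

daily_input = daily_token_cost(input_tokens, queries_per_day, "0.0002")
daily_output = daily_token_cost(output_tokens, queries_per_day, "0.0006")
monthly_bedrock = (daily_input + daily_output) * 30

print(daily_input, daily_output, monthly_bedrock)
```

To estimate a different workload, change `queries_per_day` or the token counts; to estimate a different model, substitute that model's prices from the Amazon Bedrock pricing page.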

Note

The costs of API calls made to any services outside of the AWS network are not included in these estimates. See the pricing guide of your LLM provider if not using Amazon Bedrock.

Costs for adding a knowledge base

Knowledge base costs will vary based on the type of knowledge base used, and (in the case of Bedrock) the backing vector store used by the knowledge base. Provisioning and managing the knowledge bases is outside of the scope of the solution.

Amazon Kendra

The solution can provision a Kendra index for you, or you can bring your own. The cost for running a configuration suited to the above highly scalable generative AI query engine is as follows:

AWS service	Dimensions	Cost [USD]
Amazon Kendra	0-8,000 queries a day and up to 100,000 documents with Amazon Kendra Enterprise Edition with 0-50 data sources	$1,008.00

Note

You can share the Amazon Kendra index between use cases, but this can drive up the number of queries per index. If usage exceeds what the Amazon Kendra Enterprise Edition covers, additional charges apply.

Amazon Bedrock Knowledge Bases

The solution does not manage or provision any resources related to Amazon Bedrock Knowledge Bases. Amazon Bedrock does not charge for the knowledge base feature itself. However, you are charged for usage of the embedding model that your use case calls on each query. Additionally, the backing vector store for your knowledge base (for example, an index in Amazon OpenSearch Service, or a database in Amazon Relational Database Service) has an associated cost that cannot be provided or calculated here.

For the above highly scalable generative AI query engine scenario, the costs incurred by this service for calling the Amazon Bedrock embeddings model are as follows:

AWS service	Dimensions	Cost [USD]
Amazon Bedrock (Amazon Titan Text Embeddings)	8,000 queries per day with 1,900 input tokens per query = 15,200,000 tokens = $0.30 USD per day; daily cost × 30 days = $9.00 USD per month	$9.00
Amazon OpenSearch Service (Serverless) sample usage	Basic serverless configuration with 4 × OpenSearch Compute Units (OCUs) (billable minimum) = $23.04 USD per day; daily cost × 30 days = $691.20 USD. Note: This is a rough estimate. Some workloads will require more OCUs, while customers with existing provisioned OpenSearch resources will incur less cost here.	$691.20
Total additional cost		$700.20
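
The two rows above can be checked with a few lines of arithmetic. This is a rough sketch under stated assumptions: the embedding price per 1,000 tokens and the hourly OCU price are not listed in the table and are back-calculated here from its $0.30-per-day and $23.04-per-day figures, and the function names are hypothetical. Confirm current prices on the Amazon Bedrock and Amazon OpenSearch Service pricing pages.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

# Assumed unit prices, inferred from the daily figures in the table above.
EMBEDDING_PRICE_PER_1K_TOKENS = Decimal("0.00002")
OCU_PRICE_PER_HOUR = Decimal("0.24")


def embedding_monthly(queries_per_day: int, tokens_per_query: int, days: int = 30) -> Decimal:
    """Monthly embedding cost, with the daily figure rounded to the cent as in the table."""
    daily = Decimal(queries_per_day) * tokens_per_query * EMBEDDING_PRICE_PER_1K_TOKENS / 1000
    return daily.quantize(CENT, rounding=ROUND_HALF_UP) * days


def opensearch_serverless_monthly(ocus: int = 4, days: int = 30) -> Decimal:
    """Monthly cost of running a fixed number of OCUs around the clock."""
    return (OCU_PRICE_PER_HOUR * ocus * 24 * days).quantize(CENT)


embeddings = embedding_monthly(8_000, 1_900)
vector_store = opensearch_serverless_monthly()
print(embeddings, vector_store, embeddings + vector_store)
```

The vector store dominates this estimate, so the number of OCUs matters far more than the query volume.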

Incremental cost of enabling Amazon VPC for a use case

The following table provides the cost breakdown of enabling Amazon VPC for a use case deployed in two AZs.

AWS service	Dimensions	Cost [USD]
Amazon NAT Gateway	Assumption: 2-AZ deployment with a NAT Gateway in each AZ. 730 hours and 100 GB of data processed through NAT Gateway per month	$74.70
AWS PrivateLink (VPC endpoints)	Assumptions: 2-AZ deployment with 1 private subnet in each AZ. 6 VPC endpoints with 2 elastic network interfaces (ENIs) per VPC endpoint, 730 hours, and 1,024 GB of data processed per month	$97.84
Public IPv4 address	Assumption: 2-AZ deployment with 1 public subnet in each AZ and a NAT Gateway in each public subnet. Each NAT Gateway is configured with 1 active public IPv4 address. 2 active public IPv4 addresses × 730 hours per month × $0.005 hourly charge = $7.30 USD	$7.30
Additional cost (for Amazon VPC)		$179.84
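
The three line items above can be approximated as follows. This is a sketch only: apart from the $0.005 hourly public IPv4 charge, the unit prices are assumptions (values that reproduce the table's figures), and the NAT Gateway calculation assumes that 100 GB is processed by each of the two gateways, which is the reading that yields $74.70. All function and constant names are hypothetical.

```python
from decimal import Decimal

HOURS_PER_MONTH = 730

# Assumed unit prices; only the IPv4 rate is stated in the table above.
NAT_HOURLY = Decimal("0.045")
NAT_PER_GB = Decimal("0.045")
ENDPOINT_ENI_HOURLY = Decimal("0.01")
ENDPOINT_PER_GB = Decimal("0.01")
IPV4_HOURLY = Decimal("0.005")


def nat_gateway_cost(gateways: int = 2, gb_per_gateway: int = 100) -> Decimal:
    """Hourly charge plus data processing charge for each NAT Gateway."""
    return gateways * (HOURS_PER_MONTH * NAT_HOURLY + gb_per_gateway * NAT_PER_GB)


def vpc_endpoint_cost(endpoints: int = 6, enis_per_endpoint: int = 2, gb: int = 1024) -> Decimal:
    """Hourly charge per ENI plus a charge per GB of data processed."""
    return endpoints * enis_per_endpoint * HOURS_PER_MONTH * ENDPOINT_ENI_HOURLY + gb * ENDPOINT_PER_GB


def public_ipv4_cost(addresses: int = 2) -> Decimal:
    """Hourly charge for each active public IPv4 address."""
    return addresses * HOURS_PER_MONTH * IPV4_HOURLY


line_items = {
    "NAT Gateway": nat_gateway_cost(),
    "VPC endpoints": vpc_endpoint_cost(),
    "Public IPv4": public_ipv4_cost(),
}
for name, cost in line_items.items():
    print(f"{name}: ${cost:.2f}")
```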

Cost implications when using Provisioned Throughput

Provisioned throughput costs will vary based on the type of model you’ve provisioned and your commitment period as well as Model Units selected for the commitment period. There is an additional cost associated with using Provisioned Throughput. As an example, when using Anthropic Claude Instant or Claude 2.x models or Amazon Titan Text Express, your prices per hour would look like:

Model	Price per hour per model unit with no commitment	Price per hour per model unit for 1-month commitment	Price per hour per model unit for 6-month commitment
Claude Instant	$44.00	$39.60	$22.00
Claude 2.0/2.1	$70.00	$63.00	$35.00
Amazon Titan Text Express	$20.50	$18.40	$14.80

For more information and the most up-to-date pricing, refer to Amazon Bedrock pricing.
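
To put the hourly prices above into monthly terms, multiply by the number of hours and model units. The sketch below assumes a 730-hour month (the month length used in the VPC estimates on this page) and that the model units stay provisioned for the entire month; the function name and the commitment keys are hypothetical.

```python
from decimal import Decimal

HOURS_PER_MONTH = 730  # assumption: model units stay provisioned all month

# Hourly prices per model unit, copied from the table above.
HOURLY_PRICE = {
    "Claude Instant": {"none": "44.00", "1-month": "39.60", "6-month": "22.00"},
    "Claude 2.0/2.1": {"none": "70.00", "1-month": "63.00", "6-month": "35.00"},
    "Amazon Titan Text Express": {"none": "20.50", "1-month": "18.40", "6-month": "14.80"},
}


def monthly_provisioned_cost(model: str, commitment: str, model_units: int = 1) -> Decimal:
    """Monthly Provisioned Throughput cost for a model, commitment term, and unit count."""
    return Decimal(HOURLY_PRICE[model][commitment]) * HOURS_PER_MONTH * model_units


for model in HOURLY_PRICE:
    print(model, monthly_provisioned_cost(model, "6-month"))
```

Even with a 6-month commitment, one model unit costs thousands of dollars per month at these prices, which is why the sample estimates on this page use on-demand billing.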

Cost for using cross-region inference

There is no additional cost for routing or data transfer for using cross-region inference. You pay the same price per token for models as in your source or primary Region.

Sample costs for an agent-based proof of concept

When you use Amazon Bedrock Agents, you’re charged based on the components comprising the agent, such as the backing model and knowledge base (if RAG is enabled), along with additional capabilities that you add. The following table shows the cost breakdown of an Agent use case configured with an on-demand Claude 3.5 Sonnet model, Amazon Bedrock Knowledge Bases, and Amazon Bedrock Guardrails.

Similar to the cost for adding Amazon Bedrock Knowledge Bases, this solution doesn’t manage or provision resources related to Amazon Bedrock Agents. The solution also doesn’t incur cost for using Amazon Bedrock Knowledge Bases, but does incur cost for:

Using the embedding model for each query that is sent to it
The backing vector store for your knowledge base (for example, an index in Amazon OpenSearch Service, or a database inside Amazon RDS)

The following table assumes 100 interactions per day with 1,900 input tokens and 160 output tokens per query.

Note

For this sample Agent use case, if there were an action group configured to use an external API, those costs would be additional. They are outside the scope of the calculations in this table.

AWS service	Dimensions	Cost [USD]
API Gateway (WebSocket), CloudFront, Lambda, Amazon S3, Systems Manager Parameter Store	100 chat interactions per day, average message size 32 KB per message, 5 minutes per connection	$0.61
CloudWatch	1.5 GB CloudWatch Logs with verbose mode on for experimentation	$7.23
DynamoDB	LLM configuration table with 1 KB record size and 1 GB storage	$0.25
Subtotal of costs (not including LLMs)		$8.09
Anthropic Claude 3.5 Sonnet	Input: 190K tokens per day × $0.003 per 1,000 tokens = $0.57 per day; daily cost × 30 days = $17.10. Output: 16K tokens per day × $0.015 per 1,000 tokens = $0.24 per day; daily cost × 30 days = $7.20	$24.30
Amazon Bedrock (Amazon Titan Text Embeddings v2) for Amazon Bedrock Knowledge Bases	190K input tokens per day × $0.00002 per 1,000 tokens = $0.004 per day; daily cost × 30 days = $0.12	$0.12
Amazon OpenSearch Service (Serverless) sample usage	Basic serverless configuration with 4 × OpenSearch Compute Units (OCUs) (billable minimum) = $23.04 per day; daily cost × 30 days = $691.20	$691.20
Amazon Bedrock Guardrails	190K tokens is roughly equivalent to 760K characters (190,000 × 4) and 3,800 text units (760K characters / 200). Consider a guardrail configured with content filters, a personally identifiable information (PII) filter, a sensitive information filter (regular expression), and word filters. Daily cost = content filter ($0.75 per 1,000 text units) + PII filter ($0.10 per 1,000 text units) + sensitive information filter (regex) + word filters = $2.85 + $0.38 + $0 + $0 = $3.23. Monthly cost = daily cost × 30 days = $96.90	$96.90
Total application cost for an agent backed by Anthropic Claude 3.5 Sonnet	$8.09 (use case cost) + $812.52 (other agent configurations)	$820.61
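
The Amazon Bedrock Guardrails row and the table total can be reproduced as follows. This is a minimal sketch that uses only the conversion factors and prices stated in the table (4 characters per token, 200 characters per text unit, and the two per-1,000-text-unit filter prices); the function name is hypothetical, and current Guardrails pricing may differ.

```python
from decimal import Decimal

CHARS_PER_TOKEN = 4            # conversion used in the table above
CHARS_PER_TEXT_UNIT = 200      # conversion used in the table above
CONTENT_FILTER_PER_1K_UNITS = Decimal("0.75")
PII_FILTER_PER_1K_UNITS = Decimal("0.10")


def guardrails_monthly(tokens_per_day: int, days: int = 30) -> Decimal:
    """Monthly cost of content and PII filters; regex and word filters are $0 in the table."""
    text_units = Decimal(tokens_per_day * CHARS_PER_TOKEN) / CHARS_PER_TEXT_UNIT
    daily = text_units / 1000 * (CONTENT_FILTER_PER_1K_UNITS + PII_FILTER_PER_1K_UNITS)
    return daily * days


guardrails = guardrails_monthly(190_000)

# Remaining monthly line items, copied from the table above.
other_items = [Decimal("8.09"), Decimal("24.30"), Decimal("0.12"), Decimal("691.20")]
total = sum(other_items, guardrails)

print(f"Guardrails: ${guardrails:.2f}, total: ${total:.2f}")
```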

Note

Refer to the pricing guide of your LLM provider if you’re not using an AWS model provider. Pricing guides for AWS services can be found at: Amazon Bedrock pricing and Amazon SageMaker AI pricing.
