Cost - Generative AI Application Builder on AWS

Cost

With this AWS Solution, you pay only for the resources you use, and there are no minimum fees or setup charges. You pay for the Deployment dashboard used to launch generative AI use cases and for any use cases that are deployed. The cost of a deployed use case depends on its configuration. Example configurations:

  1. A simple Deployment dashboard which costs approximately $20 USD per month.

  2. A simple production-ready chatbot use case deployed with default settings running in US East (N. Virginia), powered by Amazon Bedrock without access to documents, which also costs around $20 USD per month.

  3. A scaled use case deployed in an Amazon VPC that supports 8,000 queries per day over tens of thousands of documents, which costs around $1400 USD per month. The cost of a use case varies with its configuration, such as the model provider used for a Text use case and whether Retrieval Augmented Generation (RAG) is enabled.

| Workload description | Estimated cost (USD/month) |
| --- | --- |
| Sample cost for Deployment dashboard | $20/month |
| Sample costs for a text-based proof of concept (includes Deployment dashboard and 1 Text use case, ~100 interactions per day) | $30/month |
| Sample costs for a highly scalable generative AI query engine (includes Deployment dashboard, 1 Text use case, and an Amazon Kendra index for RAG of up to 100K documents with ~8,000 queries per day, with VPC enabled) | $1400/month |

Important

These examples are intended only to help you estimate the costs for your specific workloads. Using different LLMs, configurations, or AWS services can change your costs (for example, serverless/on-demand billing versus provisioned/time-billed). To manage costs, we recommend creating a budget through AWS Cost Explorer. Prices are subject to change. For full details, refer to the pricing webpage for each AWS service used in this solution.

Sample costs for running the Deployment dashboard only

The following table provides the cost breakdown for a Deployment dashboard with default parameters and 100 active users in the US East (N. Virginia) Region for one month, which will cost about $20/month.

| AWS service | Dimensions | Cost [USD] |
| --- | --- | --- |
| Amazon API Gateway, DynamoDB, Amazon CloudFront, Amazon S3, Lambda, AWS Systems Manager Parameter Store | 5,000 512 KB REST API calls per month without caching enabled | $1.97 |
| Amazon Cognito | 100 active users per month with advanced security features enabled and no users signing in through SAML or OIDC federation | $5.55 |
| AWS WAF | 10,000 web requests across 1 web ACL and 7 defined rules without any rule groups | $12.60 |
| Total Deployment dashboard cost | | $20.12 |
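
As a quick check on the arithmetic, the following minimal Python sketch adds up the three line items from the table above. The dictionary keys are shortened labels chosen for illustration; only the dollar amounts come from the table.

```python
from decimal import Decimal

# Monthly line items copied from the Deployment dashboard table (USD).
dashboard_items = {
    "API Gateway, DynamoDB, CloudFront, S3, Lambda, Parameter Store": Decimal("1.97"),
    "Amazon Cognito": Decimal("5.55"),
    "AWS WAF": Decimal("12.60"),
}

# Decimal avoids binary floating-point rounding when adding currency values.
dashboard_total = sum(dashboard_items.values(), Decimal("0"))
print(f"Total Deployment dashboard cost: ${dashboard_total}")
```

Swapping in your own estimates for each line item gives a comparable monthly figure for a differently sized deployment.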

Sample costs for a text-based proof of concept

A Deployment dashboard can have many use cases deployed at a given time. The following table shows the cost breakdown of a use case deployed without RAG for 1 business user performing 100 queries per day with the LLM. Each query is sent as a text message over the WebSocket, and the response is streamed back as tokens, assuming streaming is enabled. With the Amazon Bedrock Titan Text Express model, running this use case costs about $12/month.

| AWS service | Dimensions | Cost [USD] |
| --- | --- | --- |
| Amazon API Gateway (WebSocket), CloudFront, Lambda, Amazon S3, AWS Systems Manager Parameter Store | 100 chat interactions per day. Average message size 32 KB per message and 5 minutes per connection. | $0.61 |
| Amazon CloudWatch | 1.5 GB CloudWatch logs with verbose mode on for experimentation | $7.23 |
| Amazon DynamoDB | Conversation history table, 1 GB storage; LLM configuration table, 1 GB storage | $3.05 |
| Subtotal of costs (not including LLMs) | | $10.89 |
| Amazon Bedrock (Titan Text Express) | Assumptions for 100 interactions per day: monthly cost for 190K input tokens per day = $0.04 x 30; monthly cost for 16K output tokens per day = $0.01 x 30 | $1.50 |
| Total application cost with Amazon Bedrock (Titan Text Express) | $10.89 (use case cost) + $1.50 (Amazon Bedrock cost) | $12.39 |

Note

The costs of inference calls made to services outside the AWS network are not included in these estimates. See the pricing guide of your LLM provider if not using an AWS service.

Pricing guides for AWS services can be found at: Amazon Bedrock pricing and Amazon SageMaker pricing.
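
The Amazon Bedrock line in the proof-of-concept table can be reproduced with a short calculation. The sketch below assumes the Titan Text Express rates quoted later on this page ($0.0002 per 1,000 input tokens and $0.0006 per 1,000 output tokens) and rounds each daily cost to the nearest cent before scaling to 30 days, which is how the table's figures appear to have been derived. The function name `monthly_token_cost` is illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def monthly_token_cost(tokens_per_day, price_per_1k_tokens, days=30):
    """Round the daily token cost to cents, then scale it to a month."""
    daily = (Decimal(tokens_per_day) / 1000 * Decimal(price_per_1k_tokens)).quantize(
        CENT, rounding=ROUND_HALF_UP
    )
    return daily * days

# Token volumes from the table; prices are the assumed Titan Text Express rates.
input_cost = monthly_token_cost(190_000, "0.0002")   # 190K input tokens per day
output_cost = monthly_token_cost(16_000, "0.0006")   # 16K output tokens per day
bedrock_cost = input_cost + output_cost

# Add the non-LLM subtotal from the table.
total_cost = Decimal("10.89") + bedrock_cost
print(f"Amazon Bedrock: ${bedrock_cost}, total: ${total_cost}")
```

To estimate a different model, replace the two per-1,000-token prices with the rates from your provider's pricing page.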

Sample costs for a highly scalable generative AI query engine

The following table provides the cost breakdown of a RAG-enabled use case with an Amazon Kendra index supporting 8,000 interactions per day. With Amazon Bedrock's Titan Text Express model as the LLM, this use case costs about $1200/month.

| AWS service | Dimensions | Cost [USD] |
| --- | --- | --- |
| Amazon API Gateway (WebSocket) | 8,000 chat interactions per day. Average message size 32 KB per message and 5 minutes per connection. | $38.89 |
| Amazon CloudFront | 240,000 requests per month with 100 GB data transferred out to the internet and 1 GB data transferred out to the origin | $8.76 |
| Amazon Bedrock (Titan Text Express) | Assumptions: input tokens = promptTemplate (400) + context (400) + chatHistory (1,080) + query input tokens (20) = 1,900; output tokens = 160 (average). With 8,000 transactions a day: daily input token cost = 1,900 x 8,000 = 15,200,000 tokens at $0.0002 per 1,000 tokens = $3.04; daily output token cost = 160 x 8,000 = 1,280,000 tokens at $0.0006 per 1,000 tokens = $0.77; monthly cost = ($3.04 + $0.77) x 30 | $114.30 |
| Amazon CloudWatch | 24 metrics using 5 GB data ingested for logs and 1 dashboard | $9.72 |
| Amazon DynamoDB | DynamoDB table to keep track of conversation history with each record up to 1 KB data, 8,000 reads and writes per day | $11.70 |
| AWS Lambda | Container size 128 MB, 512 MB ephemeral storage, 2 Lambda functions used for authorization; container size 256 MB, 512 MB ephemeral storage, 5 requests per second with 20 seconds average compute time | $20.89 |
| Amazon S3, AWS Secrets Manager, AWS Systems Manager Parameter Store | 1 MB for storage of any use case artifacts | $0.53 |
| Total use case cost | | $204.79/month + knowledge base cost (see below) |

Note
  • The costs of API calls made to any services outside of the AWS network are not included in these estimates. See the pricing guide of your LLM provider if not using Amazon Bedrock.
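
The Amazon Bedrock figure and the use case total for this scenario can be rebuilt from the stated assumptions. The sketch below is one way to do that: it composes the 1,900-token input budget from its four parts, prices the daily volume at the quoted Titan Text Express rates, and adds the remaining line items. Variable and function names are illustrative, and rounding each daily cost to the nearest cent is an assumption that happens to match the published figures.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

# Input token budget per query, as listed in the assumptions.
prompt_tokens = {"promptTemplate": 400, "context": 400, "chatHistory": 1080, "query": 20}
input_tokens = sum(prompt_tokens.values())
output_tokens = 160          # average output tokens per query
queries_per_day = 8_000

def daily_cost(tokens, price_per_1k_tokens):
    """Daily token cost in USD, rounded to the nearest cent."""
    return (Decimal(tokens) / 1000 * Decimal(price_per_1k_tokens)).quantize(
        CENT, rounding=ROUND_HALF_UP
    )

daily_input = daily_cost(input_tokens * queries_per_day, "0.0002")
daily_output = daily_cost(output_tokens * queries_per_day, "0.0006")
bedrock_monthly = (daily_input + daily_output) * 30

# Remaining monthly line items: API Gateway, CloudFront, CloudWatch,
# DynamoDB, Lambda, and S3/Secrets Manager/Parameter Store.
other_items = [Decimal(x) for x in ("38.89", "8.76", "9.72", "11.70", "20.89", "0.53")]
use_case_total = bedrock_monthly + sum(other_items, Decimal("0"))

print(f"Input tokens per query: {input_tokens}")
print(f"Amazon Bedrock monthly: ${bedrock_monthly}")
print(f"Use case total (before knowledge base): ${use_case_total}")
```

Because chat history accounts for more than half of the input tokens in this budget, shortening the retained history is the most direct lever on the token cost in this sketch.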

Costs for adding a knowledge base

Knowledge base costs will vary based on the type of knowledge base used, and (in the case of Bedrock) the backing vector store used by the knowledge base. Provisioning and managing the knowledge bases is outside of the scope of the solution.

Amazon Kendra

The solution can provision a Kendra index for you, or you can bring your own. The cost for running a configuration suited to the above highly scalable generative AI query engine is as follows:

| AWS service | Dimensions | Cost [USD] |
| --- | --- | --- |
| Amazon Kendra | 0-8,000 queries a day and up to 100,000 documents with Amazon Kendra Enterprise Edition with 0-50 data sources | $1,008.00 |

Note

You can share the Amazon Kendra index between use cases, but this can drive up the number of queries per index. If the query volume exceeds the limits of Amazon Kendra Enterprise Edition, additional charges will apply.
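
As a rough cross-check of the "about $1200/month" figure quoted for the highly scalable query engine, the sketch below adds the Amazon Kendra cost to the use case total from the earlier table. It is simple addition over numbers already on this page; the variable names are illustrative.

```python
from decimal import Decimal

use_case_total = Decimal("204.79")   # use case cost before the knowledge base
kendra_monthly = Decimal("1008.00")  # Amazon Kendra Enterprise Edition index

# Cost of the RAG-enabled use case with a dedicated Amazon Kendra index.
rag_use_case_total = use_case_total + kendra_monthly
print(f"RAG-enabled use case with Amazon Kendra: ${rag_use_case_total}/month")
```

In this breakdown the Amazon Kendra index accounts for most of the monthly cost, which is why sharing one index across use cases can be attractive as long as query volume stays within the edition's limits.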

Knowledge bases for Amazon Bedrock

The solution does not manage or provision any resources related to knowledge bases for Amazon Bedrock. Amazon Bedrock does not charge for the knowledge base feature itself; however, you are charged for the embedding model that your use case invokes on each query. Additionally, the backing vector store for your knowledge base (for example, an index in Amazon OpenSearch Service or a database in Amazon Relational Database Service) has an associated cost that cannot be provided or calculated here.

However, for the above highly scalable generative AI query engine scenario, the costs incurred by this service for calling the Amazon Bedrock embeddings model are as follows:

| AWS service | Dimensions | Cost [USD] |
| --- | --- | --- |
| Amazon Bedrock (Amazon Titan Text Embeddings) | 8,000 queries a day with 1,900 input tokens per query = 15,200,000 tokens = $0.30 USD per day. Daily cost x 30 days = $9.00 USD monthly cost | $9.00 |
| Amazon OpenSearch Service (Serverless) sample usage | Basic serverless configuration with 4 x OpenSearch Compute Unit (OCU) (billable minimum) = $23.04 USD per day. Daily cost x 30 days = $691.20 USD | $691.20 |
| Total additional cost | | $700.20 |

Note

The Amazon OpenSearch Service figure is a rough estimate, as some workloads will require more OCUs, while customers with existing provisioned OpenSearch resources will incur less cost here.
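
The two knowledge base line items above can be reproduced as follows. The unit rates in this sketch are not stated on this page: $0.00002 per 1,000 embedding tokens and $0.24 per OCU-hour are assumptions back-calculated from the daily figures in the table, so verify them against current pricing before relying on them.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")
DAYS_PER_MONTH = 30

# ASSUMED unit rates, inferred from the table's daily figures.
EMBEDDING_PRICE_PER_1K_TOKENS = Decimal("0.00002")
OCU_PRICE_PER_HOUR = Decimal("0.24")

# Embedding model: 8,000 queries a day at 1,900 input tokens per query.
tokens_per_day = 8_000 * 1_900
embedding_daily = (Decimal(tokens_per_day) / 1000 * EMBEDDING_PRICE_PER_1K_TOKENS).quantize(
    CENT, rounding=ROUND_HALF_UP
)
embedding_monthly = embedding_daily * DAYS_PER_MONTH

# OpenSearch Serverless: 4 OCUs (the billable minimum in the table), 24 hours a day.
ocu_daily = 4 * OCU_PRICE_PER_HOUR * 24
ocu_monthly = ocu_daily * DAYS_PER_MONTH

total_additional = embedding_monthly + ocu_monthly
print(f"Embeddings: ${embedding_monthly}, OpenSearch Serverless: ${ocu_monthly}")
print(f"Total additional cost: ${total_additional}")
```

Under these assumptions the vector store dominates the total, so the OCU count is the parameter to adjust first when sizing your own workload.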

Incremental cost of enabling Amazon VPC for a use case

The following table provides the cost breakdown of enabling Amazon VPC for a use case deployed in two AZs.

| AWS service | Dimensions | Cost [USD] |
| --- | --- | --- |
| Amazon NAT Gateway | Assumption: 2 AZ deployment, with a NAT Gateway in each AZ. 730 hours, 100 GB data processed per month | $74.70 |
| AWS PrivateLink (VPC endpoints) | Assumptions: 2 AZ deployment, with 1 private subnet in each AZ and each VPC endpoint using 2 elastic network interfaces (ENIs). 6 VPC endpoints, 2 ENIs per VPC endpoint, 730 hours with 1,024 GB data processed in a month | $97.84 |
| Public IPv4 address | Assumption: 2 AZ deployment, 1 public subnet in each AZ with a NAT Gateway in each public subnet. Each NAT Gateway configured with 1 active public IPv4 address. 2 active public IPv4 addresses x 730 hours in a month x $0.005 hourly charge = $7.30 USD | $7.30 |
| Additional cost (for Amazon VPC) | | $179.84 |
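
The individual VPC line items can be approximated from hourly and per-GB rates. In the sketch below, only the $0.005 hourly public IPv4 charge comes from this page. The NAT Gateway rates ($0.045 per gateway-hour and $0.045 per GB, with 100 GB processed by each of the 2 gateways) and the PrivateLink rates ($0.01 per ENI-hour and $0.01 per GB) are assumptions chosen because they reproduce the listed figures, so check them against current pricing for your Region.

```python
from decimal import Decimal

HOURS_PER_MONTH = 730  # month length used in the table

# NAT Gateway: ASSUMED $0.045 per gateway-hour and $0.045 per GB processed,
# with 100 GB processed by each of the 2 gateways (one per AZ).
nat_gateway = 2 * (HOURS_PER_MONTH * Decimal("0.045") + 100 * Decimal("0.045"))

# PrivateLink: 6 VPC endpoints x 2 ENIs each, ASSUMED $0.01 per ENI-hour,
# plus ASSUMED $0.01 per GB for 1,024 GB processed.
private_link = 6 * 2 * HOURS_PER_MONTH * Decimal("0.01") + 1024 * Decimal("0.01")

# Public IPv4: 2 addresses at the $0.005 hourly charge stated in the table.
public_ipv4 = 2 * HOURS_PER_MONTH * Decimal("0.005")

print(f"NAT Gateway: ${nat_gateway:.2f}")
print(f"PrivateLink: ${private_link:.2f}")
print(f"Public IPv4: ${public_ipv4:.2f}")
```

The hourly charges apply whether or not traffic flows, so most of this cost is fixed once Amazon VPC is enabled for a use case.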

Cost implications when using Provisioned Throughput

Using Provisioned Throughput incurs an additional cost, which varies with the model you provision, the commitment period, and the number of model units selected for that period. As an example, when using Anthropic Claude Instant or Claude 2.x models or Amazon Titan Text Express, your prices per hour would look like:

| Model | Price per hour per model unit with no commitment | Price per hour per model unit for 1-month commitment | Price per hour per model unit for 6-month commitment |
| --- | --- | --- | --- |
| Claude Instant | $44.00 | $39.60 | $22.00 |
| Claude 2.0/2.1 | $70.00 | $63.00 | $35.00 |
| Amazon Titan Text Express | $20.50 | $18.40 | $14.80 |
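
To put the hourly rates in monthly terms, the sketch below multiplies each rate by an assumed 730 hours per month (the month length used in the VPC table) for a single model unit. It assumes the provisioned capacity is billed for every hour of the month; the names `hourly_rates` and `monthly_cost` are illustrative.

```python
from decimal import Decimal

HOURS_PER_MONTH = 730  # assumption: same month length as the VPC table

# Hourly rates per model unit, copied from the table above (USD).
hourly_rates = {
    "Claude Instant": {"no commitment": "44.00", "1-month": "39.60", "6-month": "22.00"},
    "Claude 2.0/2.1": {"no commitment": "70.00", "1-month": "63.00", "6-month": "35.00"},
    "Amazon Titan Text Express": {"no commitment": "20.50", "1-month": "18.40", "6-month": "14.80"},
}

def monthly_cost(hourly_rate, model_units=1, hours=HOURS_PER_MONTH):
    """Monthly cost in USD for the given number of model units."""
    return Decimal(hourly_rate) * model_units * hours

monthly = {
    model: {term: monthly_cost(rate) for term, rate in terms.items()}
    for model, terms in hourly_rates.items()
}

for model, terms in monthly.items():
    for term, cost in terms.items():
        print(f"{model:26s} {term:14s} ${cost:,.2f}")
```

Under these assumptions, even the lowest rate in the table comes to more than $10,000 per month for one model unit, compared with $114.30 per month for on-demand Titan Text Express in the 8,000-queries-per-day example, so Provisioned Throughput is better suited to much higher or latency-sensitive volumes.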

For more information and the most up-to-date pricing, refer to Amazon Bedrock pricing.