Throughput modes
Amazon SageMaker Feature Store provides two pricing models to choose from: on-demand (On-demand) and provisioned (Provisioned) throughput modes. On-demand works best for less predictable traffic, while Provisioned works best for consistent and predictable traffic.
You can switch between On-demand and Provisioned throughput modes for a given feature group to accommodate periods in which application traffic patterns are changing or less predictable. You can only update your feature group throughput mode to On-demand once in a 24-hour period. The throughput mode can be updated programmatically using the UpdateFeatureGroup API or through the console UI. For more information about using the console, see Using Amazon SageMaker Feature Store in the console.
You can use the Provisioned throughput mode with offline-only feature groups or feature groups with the Standard storage type. For other storage configurations, the On-demand throughput mode is used. For information about the online and offline storage configurations, see Online store and Offline store, respectively.
For more details about pricing, see Amazon SageMaker AI Pricing.
On-demand throughput mode
The On-demand (default) throughput mode works best when your feature groups have unknown workloads or unpredictable application traffic and you cannot forecast the capacity requirements.
The On-demand mode charges you for the reads and writes that your application performs on your feature groups. You do not need to specify how much read and write throughput you expect your application to perform because Feature Store instantly accommodates your workloads as they ramp up or down. You pay only for what you use, which is measured in ReadRequestsUnits and WriteRequestsUnits.
You can enable the On-demand throughput mode using the CreateFeatureGroup or UpdateFeatureGroup APIs or through the console UI. For more information about using the console UI, see Using Amazon SageMaker Feature Store in the console.
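As a minimal sketch of the programmatic route, the snippet below builds the keyword arguments for an UpdateFeatureGroup call that selects On-demand mode. It assumes the boto3 SageMaker client, whose ThroughputConfig parameter takes a ThroughputMode of "OnDemand" or "Provisioned"; the feature group name is hypothetical, and the parameter names should be checked against the current API reference before use.

```python
def on_demand_update_request(feature_group_name: str) -> dict:
    """Build keyword arguments for an UpdateFeatureGroup call that selects On-demand mode."""
    return {
        "FeatureGroupName": feature_group_name,
        # Assumption: the API spells the mode "OnDemand" (no hyphen).
        "ThroughputConfig": {"ThroughputMode": "OnDemand"},
    }


# "my-feature-group" is a hypothetical name used only for illustration.
request = on_demand_update_request("my-feature-group")
print(request)

# With AWS credentials configured, the request could be sent like this:
#   import boto3
#   boto3.client("sagemaker").update_feature_group(**request)
```

Building the request as a plain dictionary keeps the sketch runnable without AWS access. Remember that a switch to On-demand is only accepted once in a 24-hour period.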
Important
You can only update your feature group throughput mode to On-demand once in a 24-hour period.
Provisioned throughput mode
The Provisioned throughput mode works best when you are using feature groups with predictable workloads and you can forecast the capacity requirements to control costs. This can make it more cost effective for certain workloads where you can anticipate throughput requirements in advance.
When you set a feature group to Provisioned mode, you specify capacity units, which are the maximum amount of capacity that an application can consume from a feature group. If your application exceeds this Provisioned throughput capacity, it is subject to request throttling.
The following includes information about the read and write capacity units.
- Retrieving a single record of up to 4 KB using the GetRecord API consumes at least 1 RCU (read capacity unit). Retrieving larger payloads may take more. The total number of read capacity units required depends on the item size, including a small amount of per-record metadata added by the Feature Store service.
- A single write request with a payload of 1 KB using the PutRecord API consumes at least 1 WCU (write capacity unit), with fractional payloads rounded up to the nearest KB. It may consume more depending on the event time, deletion status of the record, and time to live (TTL) status. For more information about TTL, see Time to live (TTL) duration for records.
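The arithmetic behind these minimums can be sketched as follows. This is a rough lower-bound estimate, not the service's billing formula: it assumes 1 KB means 1,024 bytes and that reads larger than 4 KB scale in 4 KB steps, which the text above does not state explicitly. It also ignores the per-record metadata, event time, deletion status, and TTL effects that can raise actual consumption. The function names are hypothetical.

```python
import math

READ_UNIT_BYTES = 4 * 1024   # one RCU covers a record of up to 4 KB (assuming 1 KB = 1,024 bytes)
WRITE_UNIT_BYTES = 1024      # one WCU covers a 1 KB payload, rounded up to the nearest KB


def min_read_capacity_units(record_bytes: int) -> int:
    """Lower bound on RCUs for one GetRecord call (assumes 4 KB steps for larger records)."""
    return max(1, math.ceil(record_bytes / READ_UNIT_BYTES))


def min_write_capacity_units(payload_bytes: int) -> int:
    """Lower bound on WCUs for one PutRecord call."""
    return max(1, math.ceil(payload_bytes / WRITE_UNIT_BYTES))


print(min_read_capacity_units(2500))   # 1
print(min_write_capacity_units(2500))  # 3
```

Because both figures are lower bounds, measured consumption from the throughput metrics is a better basis for sizing than this estimate alone.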
Important
When setting your capacity units, consider the following:
- You are charged for the read and write capacities you provision for your feature group, even if you do not fully utilize the Provisioned capacity.
- If you set a read or write capacity too low, your requests may experience throttling.
- In some cases, records may consume an extra capacity unit due to record-level metadata that is added by the Feature Store service to enable various features.
- Retrieving only a subset of features using the GetRecord or BatchGetRecord APIs still consumes the RCU corresponding to the entire record.
- For write capacity, you should provision 2x the recent peak capacity to avoid throttling when performing backfills or bulk ingestion that may result in a large number of historical record writes. This is because writing historical records consumes additional write capacity.
- Feature Store does not currently support auto scaling for Provisioned mode.
You can enable the Provisioned throughput mode using the CreateFeatureGroup or UpdateFeatureGroup APIs or through the console UI. For more information about using the console UI, see Using Amazon SageMaker Feature Store in the console.
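A Provisioned-mode configuration could be assembled along the following lines. This is a sketch that assumes the boto3 SageMaker client's ThroughputConfig shape (ThroughputMode, ProvisionedReadCapacityUnits, ProvisionedWriteCapacityUnits); verify the field names against the API reference. The helper name and the example numbers are hypothetical, and the default write headroom of 2 reflects the guidance above to provision 2x the recent peak write capacity.

```python
def provisioned_throughput_config(
    read_units: int,
    peak_write_units: int,
    write_headroom: int = 2,
) -> dict:
    """Build a ThroughputConfig for Provisioned mode with headroom on writes."""
    if read_units < 1 or peak_write_units < 1 or write_headroom < 1:
        raise ValueError("capacity units and headroom must be positive integers")
    return {
        "ThroughputMode": "Provisioned",
        "ProvisionedReadCapacityUnits": read_units,
        # Provision a multiple of the recent peak to absorb backfills and bulk ingestion.
        "ProvisionedWriteCapacityUnits": peak_write_units * write_headroom,
    }


# Hypothetical figures: 100 RCU and a recent peak of 50 WCU.
config = provisioned_throughput_config(read_units=100, peak_write_units=50)
print(config)

# With AWS credentials configured, the config could be applied like this:
#   import boto3
#   boto3.client("sagemaker").update_feature_group(
#       FeatureGroupName="my-feature-group",  # hypothetical name
#       ThroughputConfig=config,
#   )
```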
The following describes how you can increase or decrease the RCU and WCU throughput for your feature groups when Provisioned mode is enabled.
Increasing provisioned throughput
You can increase RCU or WCU as often as needed using the UpdateFeatureGroup API or the console UI.
Decreasing provisioned throughput
You can decrease RCU, WCU, or both for feature groups using the UpdateFeatureGroup API or the console UI.
There is a default quota on the number of Provisioned capacity decreases you can perform on your feature group per day. A day is defined according to Coordinated Universal Time (UTC). On a given day, you can start by performing up to four decreases within one hour as long as you have not performed any other decreases yet during that day. Subsequently, you can perform one additional decrease per hour as long as there were no decreases in the preceding hour. This effectively brings the maximum number of decreases in a day to 27 (4 decreases in the first hour, and 1 decrease for each of the subsequent 1-hour windows in a day).
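The decrease quota can be modeled with a small client-side check. The sketch below encodes one reading of the rule and is not the service's actual enforcement logic: it assumes the initial burst of four must fall within one hour of the first decrease of the UTC day, and that every later decrease needs a full hour since the previous one. All names are hypothetical. A greedy simulation over one day reproduces the stated maximum of 27.

```python
from datetime import datetime, timedelta, timezone

HOUR = timedelta(hours=1)
INITIAL_BURST = 4  # decreases allowed within the first hour of activity in a UTC day


def decrease_allowed(now: datetime, earlier_decreases: list) -> bool:
    """Return True if a decrease at `now` fits the quota, under the stated reading."""
    today = now.astimezone(timezone.utc).date()
    same_day = sorted(
        t for t in earlier_decreases if t.astimezone(timezone.utc).date() == today
    )
    if not same_day:
        return True  # first decrease of the UTC day
    if len(same_day) < INITIAL_BURST and now - same_day[0] < HOUR:
        return True  # still inside the initial burst of four
    return now - same_day[-1] >= HOUR  # otherwise, one per hour


def max_decreases_in_a_day(day_start: datetime) -> int:
    """Greedily attempt a decrease every minute of the day and count the successes."""
    performed = []
    for minute in range(24 * 60):
        now = day_start + timedelta(minutes=minute)
        if decrease_allowed(now, performed):
            performed.append(now)
    return len(performed)


print(max_decreases_in_a_day(datetime(2024, 1, 1, tzinfo=timezone.utc)))  # 27
```

A check like this only helps avoid rejected requests on the client side. The service remains the authority on whether a decrease is accepted.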
Throughput mode metrics
A feature group in On-demand mode will emit ConsumedReadRequestsUnits and ConsumedWriteRequestsUnits metrics.
A feature group in Provisioned mode will emit ConsumedReadCapacityUnits and ConsumedWriteCapacityUnits metrics.
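For monitoring code that has to pick the right metrics for a feature group, the pairing above can be captured in a small lookup. This is only an illustrative sketch: the dictionary keys reuse the mode names from this page, and the helper name is hypothetical.

```python
# Consumption metrics emitted by a feature group, keyed by throughput mode.
CONSUMPTION_METRICS = {
    "On-demand": ("ConsumedReadRequestsUnits", "ConsumedWriteRequestsUnits"),
    "Provisioned": ("ConsumedReadCapacityUnits", "ConsumedWriteCapacityUnits"),
}


def consumption_metrics(mode: str) -> tuple:
    """Return the (read metric, write metric) names for a throughput mode."""
    try:
        return CONSUMPTION_METRICS[mode]
    except KeyError:
        raise ValueError(f"unknown throughput mode: {mode!r}") from None


read_metric, write_metric = consumption_metrics("Provisioned")
print(read_metric, write_metric)
```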
For more information about Feature Store metrics, see Amazon SageMaker Feature Store metrics.
Throughput mode limits
Each AWS account has default service quotas or limits that are applied to help ensure availability and manage billing risks. For information about the default quotas and limits, see Quotas, naming rules and data types.
In some cases, these limits may be lower than what is stated in the documentation. If you need higher limits, you can submit a request for an increase. It's a good idea to do so before reaching current limits to avoid interruptions to your work. For information about service quotas and how to request a quota increase, see AWS service quotas.