Feature Processing
Amazon SageMaker Feature Store Feature Processing is a capability for transforming raw data into machine learning (ML) features. It provides a Feature Processor SDK that you use to transform data from batch data sources and ingest it into your feature groups. Feature Store manages the underlying infrastructure, including provisioning the compute environments and creating and maintaining Pipelines to load and ingest data. This lets you focus on your feature processor definitions, each of which includes a transformation function (for example, count of product views or mean of transaction value), sources (the data to apply the transformation to), and sinks (where to write the computed feature values).
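To make the three parts of a feature processor definition concrete, the following is a minimal conceptual sketch in plain Python. It does not use the Feature Processor SDK: `ProcessorDefinition`, `transactions_source`, `mean_transaction_value`, and the in-memory `feature_group` list are all hypothetical names assumed for illustration, and the real SDK handles compute and ingestion for you.

```python
# Conceptual sketch only. These names are hypothetical and are NOT the
# Feature Processor SDK; they only illustrate source -> transform -> sink.
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List

Record = Dict[str, object]


@dataclass
class ProcessorDefinition:
    sources: List[Callable[[], List[Record]]]           # where raw data comes from
    transform: Callable[[List[Record]], List[Record]]   # how feature values are computed
    sink: Callable[[List[Record]], None]                # where feature values are written

    def run(self) -> None:
        # Read every source, apply the transformation, write to the sink.
        raw = [row for source in self.sources for row in source()]
        self.sink(self.transform(raw))


def transactions_source() -> List[Record]:
    # Stand-in for a batch data source.
    return [
        {"customer_id": "c1", "amount": 10.0},
        {"customer_id": "c1", "amount": 30.0},
        {"customer_id": "c2", "amount": 5.0},
    ]


def mean_transaction_value(rows: List[Record]) -> List[Record]:
    # Group transaction amounts by customer, then average each group.
    by_customer = defaultdict(list)
    for row in rows:
        by_customer[row["customer_id"]].append(row["amount"])
    return [
        {"customer_id": cid, "mean_transaction_value": mean(amounts)}
        for cid, amounts in sorted(by_customer.items())
    ]


feature_group: List[Record] = []  # in-memory stand-in for a feature group

definition = ProcessorDefinition(
    sources=[transactions_source],
    transform=mean_transaction_value,
    sink=feature_group.extend,
)
definition.run()
```

After `definition.run()`, `feature_group` holds one record per customer with the computed mean. In this sketch the definition is the only part you write; reading, running, and writing are the parts that Feature Store takes over.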
A Feature Processor pipeline is a Pipelines pipeline. As with any Pipelines pipeline, you can track scheduled Feature Processor pipelines with SageMaker Lineage in the console. This includes tracking scheduled executions, visualizing lineage to trace features back to their data sources, and viewing shared feature processors in a single environment. For more information on SageMaker Lineage, see Amazon SageMaker ML Lineage Tracking. For information on using Feature Store with the console, see View pipeline executions from the console.
Topics
- Feature Store Feature Processor SDK
- Running Feature Store Feature Processor remotely
- Creating and running Feature Store Feature Processor pipelines
- Scheduled and event based executions for Feature Processor pipelines
- Monitor Amazon SageMaker Feature Store Feature Processor pipelines
- IAM permissions and execution roles
- Feature Processor restrictions, limits, and quotas
- Data sources
- Example Feature Processing code for common use cases