Amazon SageMaker Unified Studio is in preview release and is subject to change.
Visual ETL
The Visual ETL feature of SageMaker Data Processing lets data engineers, analysts, and scientists create extract, transform, and load (ETL) flows in a visual interface. With Visual ETL, you can discover, prepare, move, and integrate data from multiple sources, which simplifies preparing data for analysis and reporting.
Visual ETL in Amazon SageMaker Unified Studio provides a drag-and-drop interface for building ETL flows. If Amazon Q is enabled, you can also author flows with a generative AI interface. You can connect to various data sources, apply transformations, and define target destinations without writing complex code.
You can use Visual ETL to implement solutions such as:
Data integration from multiple sources
Data cleansing and normalization
Creating data warehouses or data lakes
Preparing data for machine learning models
Automating regular data processing tasks
Authoring flows with Visual ETL uses AWS Glue interactive sessions Version 5.0.
Key features
Visual ETL offers several capabilities to streamline your data workflows:
Drag-and-Drop Interface: Create Visual ETL flows by simply dragging and connecting components on a canvas.
Wide Range of Data Connectors: Connect to various data sources and destinations, including databases, file systems, cloud storage, and APIs.
Extensive Transformation Library: Apply a variety of pre-built transformations to your data, such as filtering, aggregation, joining, and data type conversions.
Custom Transformations: Create and save custom transformations using SQL or Python for reuse in multiple flows.
Data Preview: Visualize your data at each step of the data authoring process to ensure accuracy and data quality.
View Scripts: View the generated code, and optionally convert the flow to a notebook to continue authoring in code.
Code and Compute Configuration: Use the configuration panel to add code libraries and adjust the compute settings.
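To make the transformation types listed above concrete, the following is a minimal sketch of what a small flow computes: a type conversion, a filter, a join, and an aggregation chained between a source and a target. It is plain Python over in-memory rows, not the script that Visual ETL generates. The sample data, the function names, and the `run_flow` entry point are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical in-memory "sources"; a real flow would read from a data connector.
orders = [
    {"order_id": "1", "customer_id": "c1", "amount": "20.50"},
    {"order_id": "2", "customer_id": "c2", "amount": "5.00"},
    {"order_id": "3", "customer_id": "c1", "amount": "12.25"},
    {"order_id": "4", "customer_id": "c3", "amount": "40.00"},
]
customers = [
    {"customer_id": "c1", "region": "EU"},
    {"customer_id": "c2", "region": "US"},
    {"customer_id": "c3", "region": "US"},
]


def convert_types(rows):
    """Data type conversion: parse the string amount into a float."""
    return [{**row, "amount": float(row["amount"])} for row in rows]


def filter_rows(rows, min_amount):
    """Filtering: keep rows whose amount is at least min_amount."""
    return [row for row in rows if row["amount"] >= min_amount]


def join_rows(left, right, key):
    """Joining: inner join of two row lists on a shared key."""
    lookup = {row[key]: row for row in right}
    return [{**row, **lookup[row[key]]} for row in left if row[key] in lookup]


def aggregate(rows, group_key, value_key):
    """Aggregation: sum value_key for each distinct group_key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


def run_flow():
    """Chain the steps in the order the nodes would be connected on a canvas."""
    typed = convert_types(orders)
    kept = filter_rows(typed, min_amount=10.0)
    joined = join_rows(kept, customers, key="customer_id")
    return aggregate(joined, group_key="region", value_key="amount")


if __name__ == "__main__":
    print(run_flow())
```

Each function stands in for one node on the canvas, and the intermediate variables in `run_flow` correspond to the points where a data preview would show the rows produced so far.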