Visual ETL - Amazon SageMaker Unified Studio

Amazon SageMaker Unified Studio is in preview release and is subject to change.

Visual ETL

Data engineers, data analysts, and data scientists use the visual ETL features to create extract, transform, and load (ETL) flows in a visual interface. With visual ETL, you can discover, prepare, move, and integrate data from multiple sources, which simplifies preparing data for analysis and reporting.

Visual ETL in Amazon SageMaker Unified Studio provides a drag-and-drop interface for building ETL flows, and you can also author flows with Amazon Q. You can connect to data sources, apply transformations, and define targets without writing complex code.

You can use Visual ETL to implement solutions such as:

  • Data integration from multiple sources

  • Data cleansing and normalization

  • Creating data warehouses or data lakes

  • Preparing data for machine learning models

  • Automating regular data processing tasks

Flow authoring in Visual ETL uses AWS Glue interactive sessions, version 5.0.

Key features

Visual ETL offers several capabilities to streamline your data workflows:

  1. Drag-and-drop interface: Create Visual ETL flows by dragging and connecting components on a canvas.

  2. Wide range of data connectors: Connect to various data sources and destinations, including databases, file systems, cloud storage, and APIs.

  3. Extensive transformation library: Apply a variety of pre-built transformations to your data, such as filtering, aggregation, joining, and data type conversions.

  4. Custom transformations: Create and save custom transformations using SQL or Python for reuse in multiple flows.

  5. Data preview: Visualize your data at each step of the authoring process to ensure accuracy and data quality.

  6. View scripts: View the code generated for a flow. You can also convert the flow to a notebook and continue authoring in code.

  7. Code and compute configuration: Use a configuration panel to add code libraries and adjust the compute settings.
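To make the idea of a flow concrete, the sketch below models a source, two built-in style transformations (data type conversion and filtering), an aggregation, and a preview at each step. It is a conceptual illustration in plain Python only. It is not the script that Visual ETL generates, and every name in it (`orders`, `convert_amount`, `filter_min_amount`, `total_by_region`, `preview`, `run_flow`) is hypothetical.

```python
from collections import defaultdict

# Hypothetical in-memory source. In a real flow, a data connector
# would supply these rows.
orders = [
    {"order_id": 1, "region": "east", "amount": "120.50"},
    {"order_id": 2, "region": "west", "amount": "80.00"},
    {"order_id": 3, "region": "east", "amount": "not-a-number"},
    {"order_id": 4, "region": "west", "amount": "19.50"},
]


def convert_amount(rows):
    """Type conversion step: parse 'amount' as float, dropping rows that fail."""
    converted = []
    for row in rows:
        try:
            converted.append({**row, "amount": float(row["amount"])})
        except ValueError:
            continue  # skip rows whose amount cannot be parsed
    return converted


def filter_min_amount(rows, minimum):
    """Filter step: keep rows whose amount is at least `minimum`."""
    return [row for row in rows if row["amount"] >= minimum]


def total_by_region(rows):
    """Aggregation step: sum amounts per region."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["region"]] += row["amount"]
    return dict(totals)


def preview(label, rows, limit=3):
    """Preview step: print the first few rows produced by a step."""
    print(f"[{label}] {len(rows)} row(s)")
    for row in rows[:limit]:
        print("  ", row)


def run_flow(rows, minimum=50.0):
    """Chain the steps in order, previewing the data after each one."""
    converted = convert_amount(rows)
    preview("convert_amount", converted)
    filtered = filter_min_amount(converted, minimum)
    preview("filter_min_amount", filtered)
    return total_by_region(filtered)
```

Calling `run_flow(orders)` drops the unparseable row and the row below the threshold, then returns the per-region totals. Each function stands in for one node on the canvas, and the call order in `run_flow` stands in for the connections between nodes.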