Data quality checks
Data quality is an integral yet often overlooked part of the data cleaning process. The following diagram shows how data quality checks fit into the data engineering automation and access control lifecycle.
The following table provides an overview of different data quality solutions based on use case.
Use case | Solution | Example |
No-code solution to add column-level or table-level quality conditions | | Checks if all column values are between 1 and 12, or if a table or column is empty |
Custom code added to an AWS Glue job or a no-code solution (in preview) to add column-level or table-level quality conditions | | Checks if the column |
Custom checks | ETL of choice, such as AWS Lambda | Checks if the value of column A is always greater than the corresponding value of column B and column C, or if the value of column |
Sophisticated solution with a metrics report, constraint validation, and constraint suggestions | | Checks if the |
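
As a rough illustration of the custom-check use case, the following sketch implements three of the example checks from the table in plain Python: a range check (values between 1 and 12), an emptiness check, and a cross-column comparison (column A greater than columns B and C). The function names, the `month` column, and the sample rows are hypothetical and chosen only for illustration. The code calls no AWS API, so it could run inside whichever ETL tool you choose, such as an AWS Lambda function.

```python
from typing import Any, Dict, List

# A row is modeled as a plain dictionary of column name to value (an assumption
# made for this sketch; a real job might use DataFrames or another structure).
Row = Dict[str, Any]


def column_in_range(rows: List[Row], column: str, low: float, high: float) -> bool:
    """Return True when every non-missing value in `column` lies in [low, high].

    An empty table passes this check, because there is no value that violates it.
    """
    values = [row[column] for row in rows if row.get(column) is not None]
    return all(low <= value <= high for value in values)


def column_is_empty(rows: List[Row], column: str) -> bool:
    """Return True when the table has no rows or `column` holds no values."""
    return all(row.get(column) is None for row in rows)


def a_exceeds_b_and_c(rows: List[Row]) -> bool:
    """Return True when A is strictly greater than both B and C in every row."""
    return all(row["A"] > row["B"] and row["A"] > row["C"] for row in rows)


def run_checks(rows: List[Row]) -> Dict[str, bool]:
    """Run all checks and return a small report keyed by check name."""
    return {
        "month_between_1_and_12": column_in_range(rows, "month", 1, 12),
        "month_is_empty": column_is_empty(rows, "month"),
        "a_greater_than_b_and_c": a_exceeds_b_and_c(rows),
    }


# Hypothetical sample data: the third row violates two of the checks.
sample_rows = [
    {"month": 3, "A": 10, "B": 4, "C": 7},
    {"month": 12, "A": 9, "B": 8, "C": 1},
    {"month": 13, "A": 5, "B": 6, "C": 2},
]

report = run_checks(sample_rows)
print(report)
```

Returning a dictionary of named results, rather than raising on the first failure, lets the calling job decide whether a failed check should stop the pipeline or only be logged.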