ColumnDataType
Checks the inherent data type of the values in a given column against the provided expected type. Accepts an optional "with threshold"
expression so that the rule can pass when only a specified fraction of the values in the column match.
Syntax
ColumnDataType <COL_NAME> = <EXPECTED_TYPE>
ColumnDataType <COL_NAME> = <EXPECTED_TYPE> with threshold <EXPRESSION>
COL_NAME – The name of the column that you want to evaluate the data quality rule against.
Supported column types: String type
EXPECTED_TYPE – The expected type of the values in the column.
Supported values: Boolean, Date, Timestamp, Integer, Double, Float, Long
EXPRESSION – An optional expression to specify the fraction of values that should be of the expected type, written as a comparison against a decimal (for example, > 0.9 for more than 90%).
Example: Column data type integers as strings
The following example rule checks whether the values in the given column, which is of type string, are actually integers.
ColumnDataType "colA" = "INTEGER"
Example: Column data type integers as strings, checked for a subset of the values
The following example rule checks whether more than 90% of the values in the given column, which is of type string, are actually integers.
ColumnDataType "colA" = "INTEGER" with threshold > 0.9
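To make the threshold semantics concrete, the following Python sketch approximates what the rule above evaluates: the fraction of string values that parse as integers, compared strictly against 0.9. It is an illustration only, not the service's implementation. The helper names fraction_matching_integer and passes_threshold are hypothetical, Python's int() stands in for the actual type inference (its parsing rules may differ), and counting missing values as non-matching is an assumption.

```python
def fraction_matching_integer(values):
    """Return the fraction of values that Python's int() can parse.

    Assumption: missing (None) values count as non-matching.
    """
    if not values:
        return 0.0
    matches = 0
    for value in values:
        try:
            int(value)
            matches += 1
        except (ValueError, TypeError):
            # Not parseable as an integer (or not a string at all).
            pass
    return matches / len(values)


def passes_threshold(values, threshold=0.9):
    """Mimic 'with threshold > 0.9': the comparison is strict."""
    return fraction_matching_integer(values) > threshold


# Hypothetical sample data: 9 of 10 values are integers.
col_a = ["1", "2", "3", "4", "5", "6", "7", "8", "9", "ten"]
print(fraction_matching_integer(col_a))
print(passes_threshold(col_a))
```

In this sample exactly 90% of the values are integers, so a strict "> 0.9" comparison is not satisfied. If a column with exactly 90% matching values should pass, the expression would need to be ">= 0.9" instead.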