Cookie の設定を選択する

当社は、当社のサイトおよびサービスを提供するために必要な必須 Cookie および類似のツールを使用しています。当社は、パフォーマンス Cookie を使用して匿名の統計情報を収集することで、お客様が当社のサイトをどのように利用しているかを把握し、改善に役立てています。必須 Cookie は無効化できませんが、[カスタマイズ] または [拒否] をクリックしてパフォーマンス Cookie を拒否することはできます。

お客様が同意した場合、AWS および承認された第三者は、Cookie を使用して便利なサイト機能を提供したり、お客様の選択を記憶したり、関連する広告を含む関連コンテンツを表示したりします。すべての必須ではない Cookie を受け入れるか拒否するには、[受け入れる] または [拒否] をクリックしてください。より詳細な選択を行うには、[カスタマイズ] をクリックしてください。

Available checks - AWS Glue DataBrew
このページはお客様の言語に翻訳されていません。 翻訳のリクエスト

Available checks

The following table lists references for all available conditions that can be used in your rules. Note that aggregated conditions cannot be combined with non-aggregated conditions in the same rule.

Note

For SDK users, to apply the same rule to multiple columns use the ColumnSelectors attribute of a Rule and specify validated columns using either their names or a regular expression. In this case, you should use implicit CheckExpression. For example, “> :val” to compare values in each of the selected columns with the provided value. DataBrew uses implicit syntax for defining FilterExpression in dynamic datasets. If you want to specify column(s) for each check individually, don't set the ColumnSelectors attribute. Instead, provide an explicit expression. For example, “:col > :val” as a CheckExpression in a Rule.

Condition type Data quality check Additional parameters Comparison type SDK syntax example
Aggregate dataset conditions Number of rows Numeric comparison against custom value

"CheckExpression": "AGG(ROWS_COUNT) > :val", "SubstitutionMap": {":val", "10000"}

Number of columns Numeric comparison against custom value

"CheckExpression": "AGG(COLUMNS_COUNT) == :val", "SubstitutionMap": {":val", "20"}

Duplicate rows

Numeric comparison against custom value

"CheckExpression": "AGG(DUPLICATE_ROWS_COUNT) < :val", "SubstitutionMap": {":val", "100"}

or

"CheckExpression": "AGG(DUPLICATE_ROWS_PERCENTAGE) < :val", "SubstitutionMap": {":val", "5"}

Aggregate column statistics conditions Missing values Numeric comparison against custom value

"CheckExpression": "AGG(MISSING_VALUES_COUNT) < :val", "SubstitutionMap": {":val", "100"}

or

"CheckExpression": "AGG(MISSING_VALUES_PERCENTAGE) < :val", "SubstitutionMap": {":val", "5"}

Duplicate values Numeric comparison against custom value

"CheckExpression": "AGG(DUPLICATE_VALUES_COUNT) < :val", "SubstitutionMap": {":val", "100"}

or

"CheckExpression": "AGG(DUPLICATE_VALUES_PERCENTAGE) < :val", "SubstitutionMap": {":val", "5"}

Valid values

Numeric comparison against custom value

"CheckExpression": "AGG(VALID_VALUES_COUNT) > :val", "SubstitutionMap": {":val", "10000"}

or

"CheckExpression": "AGG(VALID_VALUES_PERCENTAGE) > :val", "SubstitutionMap": {":val", "95"}

Distinct values Numeric comparison against custom value

"CheckExpression": "AGG(DISTINCT_VALUES_COUNT) > :val", "SubstitutionMap": {":val", "1000"}

or

"CheckExpression": "AGG(DISTINCT_VALUES_PERCENTAGE) >= :val", "SubstitutionMap": {":val", "50"}

Unique values

Numeric comparison against custom value

"CheckExpression": "AGG(UNIQUE_VALUES_COUNT) > :val", "SubstitutionMap": {":val", "100"}

or

"CheckExpression": "AGG(UNIQUE_VALUES_PERCENTAGE) > :val", "SubstitutionMap": {":val", "20"}

Outliers

Z-score threshold Numeric comparison against custom value

"CheckExpression": "AGG(Z_SCORE_OUTLIERS_COUNT, :zscore_dev) < :val", "SubstitutionMap": {":zscore_dev": "4", ":val", "100"}

or

"CheckExpression": "AGG(Z_SCORE_OUTLIERS_PERCENTAGE) < :val", "SubstitutionMap": {":val", "5"}

Value distribution statistics

Statistics name (see next table) Numeric comparison against custom value

"CheckExpression": "AGG(<STAT_NAME>) < :val", "SubstitutionMap": {":val", "100"}

or

"CheckExpression": "AGG(<STAT_NAME>, :param) < :val", "SubstitutionMap": {":param": "0.25", :val", "5"}

Note

See next table for possible STAT_NAME values

Numerical statistics

Statistics name (see next table) Numeric comparison against custom value

"CheckExpression": "AGG(<STAT_NAME>) < :val", "SubstitutionMap": {":val", "100"}

or

"CheckExpression": "AGG(<STAT_NAME>, :param) < :val", "SubstitutionMap": {":param": "0.25", :val", "5"}

Note

See next table for possible STAT_NAME values

Non aggregate (accepts threshold) Value is exactly Exact comparison against a list of values

"CheckExpression": ":col IN :list", "SubstitutionMap": {":col": "`size`", ":list": "[\"S\",\"M\",\"L\",\"XL\"]"}

Value is not exactly Value shouldn't exactly match any value from a list

"CheckExpression": ":col NOT IN :list", "SubstitutionMap": {":col": "`domain`", ":list": "[\"GOV\",\"ORG\"]"}

String values String comparison against custom value or other string column

"CheckExpression": ":col STARTS_WITH :val", "SubstitutionMap": {":col": "`url`", ":val": "http"}

or

"CheckExpression": ":col1 contains :col2", "SubstitutionMap": {":col1": "`url`", ":col2": "`company_name`"}

Numeric values Numeric comparison against custom value or other numeric column

"CheckExpression": ":col IS_BETWEEN :val1 and :val2", "SubstitutionMap": {":col": "`APY`", ":val1": "0", ":val2": "10"}

or

"CheckExpression": ":col1 <= :col2", "SubstitutionMap": {":col1": "`bank_rate`", ":col2": "`fed_rate`"}

Value string length Numeric comparison against custom value or other numeric column

"CheckExpression": "length(:col) IS_BETWEEN :val1 and :val2", "SubstitutionMap": {":col": "`identifier`", ":val1": "8", ":val2": "12"}

or

"CheckExpression": "length(:col1) <= :col2", "SubstitutionMap": {":col1": "`name`", ":col2": "`max_name_len`"}

Numeric comparisons

DataBrew supports the following operations for numeric comparison: Is equals (==), Is not equals (!=), Less than (<), Less than equals (<=), Greater than (>), Greater than equals (>=) and Is between (is_between :val1 and :val2).

String comparisons

The following string comparisons are supported: Starts with, Doesn’t start with, Ends with, Doesn’t end with, Contains, Doesn’t contain, Is equals, Is not equals, Matches, Doesn’t match.

The following table displays available statistics that you can use for Value distribution statistics and Numerical statistics:

Data quality check Statistics name Additional parameters SDK syntax
Value distribution statistics Min "CheckExpression": "AGG(MAX) < :val", "SubstitutionMap": {":val", "100"}
Max "CheckExpression": "AGG(MIN) > :val", "SubstitutionMap": {":val", "0"}
Median "CheckExpression": "AGG(MEDIAN) >= :val", "SubstitutionMap": {":val", "50"}
Mean "CheckExpression": "AGG(MEAN) <= :val", "SubstitutionMap": {":val", "10"}
Mode "CheckExpression": "AGG(MODE) > :val", "SubstitutionMap": {":val", "0"}
Standard deviation "CheckExpression": "AGG(STANDARD_DEVIATION) > :val", "SubstitutionMap": {":val", "0"}
Entropy "CheckExpression": "AGG(ENTROPY) > :val", "SubstitutionMap": {":val", "0"}
Numerical statistics Sum "CheckExpression": "AGG(SUM) > :val", "SubstitutionMap": {":val", "0"}
Kurtosis "CheckExpression": "AGG(KURTOSIS) > :val", "SubstitutionMap": {":val", "0"}
Skewness "CheckExpression": "AGG(SKEWNESS) > :val", "SubstitutionMap": {":val", "0"}
Variance "CheckExpression": "AGG(VARIANCE) > :val", "SubstitutionMap": {":val", "0"}
Absolute deviation "CheckExpression": "AGG(MEDIAN_ABSOLUTE_DEVIATION) > :val", "SubstitutionMap": {":val", "0"}
Quantile Quantile: one of '0.25', '0.5', '0.75' "CheckExpression": "AGG(QUANTILE, :pct) > :val", "SubstitutionMap": {":pct": "0.25", ":val", "0"}
プライバシーサイト規約Cookie の設定
© 2025, Amazon Web Services, Inc. or its affiliates.All rights reserved.