You are viewing documentation for version 2 of the AWS SDK for Ruby. Version 3 documentation can be found here.
Class: Aws::Glue::Types::FindMatchesParameters
- Inherits:
-
Struct
- Object
- Struct
- Aws::Glue::Types::FindMatchesParameters
- Defined in:
- (unknown)
Overview
When passing FindMatchesParameters as input to an Aws::Client method, you can use a vanilla Hash:
{
primary_key_column_name: "ColumnNameString",
precision_recall_tradeoff: 1.0,
accuracy_cost_tradeoff: 1.0,
enforce_provided_labels: false,
}
The parameters to configure the find matches transform.
Returned by:
Instance Attribute Summary collapse
-
#accuracy_cost_tradeoff ⇒ Float
The value that is selected when tuning your transform for a balance between accuracy and cost.
-
#enforce_provided_labels ⇒ Boolean
The value to switch on or off to force the output to match the provided labels from users.
-
#precision_recall_tradeoff ⇒ Float
The value selected when tuning your transform for a balance between precision and recall.
-
#primary_key_column_name ⇒ String
The name of a column that uniquely identifies rows in the source table.
Instance Attribute Details
#accuracy_cost_tradeoff ⇒ Float
The value that is selected when tuning your transform for a balance
between accuracy and cost. A value of 0.5 means that the system balances
accuracy and cost concerns. A value of 1.0 means a bias purely for
accuracy, which typically results in a higher cost, sometimes
substantially higher. A value of 0.0 means a bias purely for cost, which
results in a less accurate FindMatches
transform, sometimes with
unacceptable accuracy.
Accuracy measures how well the transform finds true positives and true negatives. Increasing accuracy requires more machine resources and cost. But it also results in increased recall.
Cost measures how many compute resources, and thus money, are consumed to run the transform.
#enforce_provided_labels ⇒ Boolean
The value to switch on or off to force the output to match the provided
labels from users. If the value is True
, the find matches
transform
forces the output to match the provided labels. The results override the
normal conflation results. If the value is False
, the find matches
transform does not ensure all the labels provided are respected, and the
results rely on the trained model.
Note that setting this value to true may increase the conflation execution time.
#precision_recall_tradeoff ⇒ Float
The value selected when tuning your transform for a balance between precision and recall. A value of 0.5 means no preference; a value of 1.0 means a bias purely for precision, and a value of 0.0 means a bias for recall. Because this is a tradeoff, choosing values close to 1.0 means very low recall, and choosing values close to 0.0 results in very low precision.
The precision metric indicates how often your model is correct when it predicts a match.
The recall metric indicates that for an actual match, how often your model predicts the match.
#primary_key_column_name ⇒ String
The name of a column that uniquely identifies rows in the source table. Used to help identify matching records.