FindIncrementalMatches class - AWS Glue

FindIncrementalMatches class

Package: com.amazonaws.services.glue.ml

object FindIncrementalMatches

Def apply

apply(existingFrame: DynamicFrame, incrementalFrame: DynamicFrame, transformId: String, transformationContext: String = "", callSite: CallSite = CallSite("Not provided", ""), stageThreshold: Long = 0, totalThreshold: Long = 0, enforcedMatches: DynamicFrame = null, computeMatchConfidenceScores: Boolean = false): DynamicFrame

Find matches across the existing and incremental frames and return a new frame with a column containing a unique ID per match group.

  • existingFrame — The existing frame, in which each group of matching records has already been assigned a matching ID. Required.

  • incrementalFrame — The incremental frame whose records are matched against the existing frame. Required.

  • transformId — A unique ID associated with the FindIncrementalMatches transform to apply on the input frames. Required.

  • transformationContext — Identifier for this DynamicFrame. The transformationContext is used as a key for the job bookmark state that is persisted across runs. Optional.

  • callSite — Used to provide context information for error reporting. These values are automatically set when calling from Python. Optional.

  • stageThreshold — The maximum number of error records allowed from the computation of this DynamicFrame before throwing an exception, excluding records present in the previous DynamicFrame. Optional. The default is zero.

  • totalThreshold — The maximum number of total error records allowed before an exception is thrown, including those from previous frames. Optional. The default is zero.

  • enforcedMatches — The frame for enforced matches. Optional. The default is null.

  • computeMatchConfidenceScores — A Boolean value indicating whether to compute a confidence score for each group of matching records. Optional. The default is false.

Returns a new DynamicFrame with a unique identifier assigned to each group of matching records.
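
The following is a minimal sketch of how apply might be called from an AWS Glue Scala job. It is illustrative rather than definitive: the database name, the table names, and the transform ID are hypothetical placeholders, and it assumes the existing table already carries match IDs from an earlier matching run and that the transform has already been created and trained. It runs only inside the AWS Glue runtime.

```scala
import com.amazonaws.services.glue.GlueContext
import com.amazonaws.services.glue.ml.FindIncrementalMatches
import com.amazonaws.services.glue.util.{GlueArgParser, Job}
import org.apache.spark.SparkContext
import scala.collection.JavaConverters._

object GlueApp {
  def main(sysArgs: Array[String]): Unit = {
    val sc = new SparkContext()
    val glueContext = new GlueContext(sc)
    val args = GlueArgParser.getResolvedOptions(sysArgs, Seq("JOB_NAME").toArray)
    Job.init(args("JOB_NAME"), glueContext, args.asJava)

    // Hypothetical catalog table: records that already have a match ID per group.
    val existing = glueContext
      .getCatalogSource(database = "demo_db", tableName = "customers_matched",
        transformationContext = "existing")
      .getDynamicFrame()

    // Hypothetical catalog table: newly arrived records to match against `existing`.
    val incremental = glueContext
      .getCatalogSource(database = "demo_db", tableName = "customers_new",
        transformationContext = "incremental")
      .getDynamicFrame()

    // "tfm-0123456789abcdef" is a placeholder; substitute your own transform ID.
    val matched = FindIncrementalMatches.apply(
      existingFrame = existing,
      incrementalFrame = incremental,
      transformId = "tfm-0123456789abcdef",
      transformationContext = "findincrementalmatches",
      computeMatchConfidenceScores = true
    )

    // Inspect the resulting schema, which includes the match ID column.
    matched.toDF().printSchema()

    Job.commit()
  }
}
```

Named arguments are used so that the optional parameters left out here (callSite, stageThreshold, totalThreshold, enforcedMatches) keep their defaults.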