FindIncrementalMatches class
Package: com.amazonaws.services.glue.ml
object FindIncrementalMatches
Def apply
apply(existingFrame: DynamicFrame,
incrementalFrame: DynamicFrame,
transformId: String,
transformationContext: String = "",
callSite: CallSite = CallSite("Not provided", ""),
stageThreshold: Long = 0,
totalThreshold: Long = 0,
enforcedMatches: DynamicFrame = null,
computeMatchConfidenceScores: Boolean = false): DynamicFrame
Finds matches across the existing and incremental frames and returns a new frame with a column containing a unique ID per match group.
existingFrame - An existing frame that has been assigned a matching ID for each group. Required.
incrementalFrame - An incremental frame used to find matches against the existing frame. Required.
transformId - A unique ID associated with the FindIncrementalMatches transform to apply on the input frames. Required.
transformationContext - Identifier for this DynamicFrame. The transformationContext is used as a key for the job bookmark state that is persisted across runs. Optional.
callSite - Used to provide context information for error reporting. These values are automatically set when calling from Python. Optional.
stageThreshold - The maximum number of error records allowed from the computation of this DynamicFrame before throwing an exception, excluding records present in the previous DynamicFrame. Optional. The default is zero.
totalThreshold - The maximum number of total error records before an exception is thrown, including those from previous frames. Optional. The default is zero.
enforcedMatches - The frame for enforced matches. Optional. The default is null.
computeMatchConfidenceScores - A Boolean value indicating whether to compute a confidence score for each group of matching records. Optional. The default is false.
Returns a new dynamic frame with a unique identifier assigned to each group of matching records.
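The following is a minimal sketch of how apply might be called from a Glue Scala job, not a definitive implementation. The database name, table names, transform ID, and Amazon S3 path are hypothetical placeholders, and the script assumes it runs inside an AWS Glue job where the Glue libraries are on the classpath. Named arguments are used so the call does not depend on parameter order.

```scala
import com.amazonaws.services.glue.GlueContext
import com.amazonaws.services.glue.ml.FindIncrementalMatches
import com.amazonaws.services.glue.util.{GlueArgParser, Job, JsonOptions}
import org.apache.spark.SparkContext
import scala.collection.JavaConverters._

object GlueApp {
  def main(sysArgs: Array[String]): Unit = {
    val glueContext = new GlueContext(new SparkContext())
    val args = GlueArgParser.getResolvedOptions(sysArgs, Seq("JOB_NAME").toArray)
    Job.init(args("JOB_NAME"), glueContext, args.asJava)

    // Records that already carry a match ID from an earlier run.
    // "example_db" and "matched_records" are hypothetical names.
    val existingFrame = glueContext.getCatalogSource(
      database = "example_db",
      tableName = "matched_records",
      transformationContext = "existingFrame").getDynamicFrame()

    // Newly arrived records to match against the existing ones.
    // "new_records" is a hypothetical table name.
    val incrementalFrame = glueContext.getCatalogSource(
      database = "example_db",
      tableName = "new_records",
      transformationContext = "incrementalFrame").getDynamicFrame()

    // "tfm-0123456789abcdef" is a placeholder; use the ID of your own transform.
    val matched = FindIncrementalMatches.apply(
      existingFrame = existingFrame,
      incrementalFrame = incrementalFrame,
      transformId = "tfm-0123456789abcdef",
      transformationContext = "findIncrementalMatches",
      computeMatchConfidenceScores = true)

    // Write the result to a hypothetical S3 location as CSV.
    glueContext.getSinkWithFormat(
      connectionType = "s3",
      options = JsonOptions("""{"path": "s3://example-bucket/incremental-matches/"}"""),
      transformationContext = "matchedSink",
      format = "csv").writeDynamicFrame(matched)

    Job.commit()
  }
}
```

The optional stageThreshold, totalThreshold, and enforcedMatches parameters are left at their defaults here. Setting transformationContext on each step gives the job bookmark a key to persist state against across runs.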