Identify Duplicates

The Identify Duplicates match action helps determine whether duplicates exist in a dataset and allows users to manually confirm, reject, merge, and delete duplicates with limited impact on existing functionality.

Note: A matching algorithm that uses the Identify Duplicates match action only links records. Workflows and UIs can be set up in STEP for manually merging the identified duplicate records, but if merging is needed, the Identify Duplicates match action is probably not the best choice. For match actions with configurable automatic actions, refer to the Match and Link (here) or Match and Merge (here) topics.

With the Identify Duplicates match action, events are sent to a Match Event Processor as matchable objects are created and modified. In an asynchronous process, the Match Event Processor matches these objects with other matchable objects, as defined by the matching algorithm. When two objects score above the Create Threshold, a match result is stored for future handling.
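The flow described above can be pictured with a small conceptual sketch. This is not STEP code or a STEP API: the queue, the `score` function (a toy string comparison from Python's standard library), the sample records, and the threshold value of 80 on a 0-100 scale are all assumptions made for illustration only.

```python
from collections import deque
from difflib import SequenceMatcher

# Hypothetical threshold on an assumed 0-100 scale (not a STEP default).
CREATE_THRESHOLD = 80.0


def score(a: dict, b: dict) -> float:
    """Toy stand-in for a matching algorithm: mean similarity of two fields."""
    fields = ("name", "email")
    ratios = [
        SequenceMatcher(None, a[f].lower(), b[f].lower()).ratio() for f in fields
    ]
    return 100.0 * sum(ratios) / len(ratios)


def process_events(events: deque, objects: dict, match_results: dict) -> None:
    """Drain queued create/modify events and store pairs above the threshold."""
    while events:
        changed_id = events.popleft()
        for other_id in objects:
            if other_id == changed_id:
                continue
            pair = tuple(sorted((changed_id, other_id)))
            pair_score = score(objects[changed_id], objects[other_id])
            if pair_score > CREATE_THRESHOLD:
                # Only a link between the two records is stored; nothing is merged.
                match_results[pair] = round(pair_score, 1)


# Hypothetical matchable objects and the events raised when they were created.
objects = {
    "C1": {"name": "Jane Smith", "email": "jane.smith@example.com"},
    "C2": {"name": "Jane Smyth", "email": "jane.smith@example.com"},
    "C3": {"name": "Robert Olsen", "email": "r.olsen@example.org"},
}
events = deque(["C1", "C2", "C3"])
match_results: dict = {}

process_events(events, objects, match_results)
```

In this sketch only the C1/C2 pair ends up in `match_results`. The records themselves are left untouched, which mirrors the link-only behavior of the match action.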

Configuration

The Create Threshold parameter is required for the Identify Duplicates match action and specifies how similar objects must be to be marked as possible duplicates.
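To show how the choice of threshold changes the outcome, the following sketch filters the same set of pair scores with two different values. The scores, the object IDs, the 0-100 scale, and the `possible_duplicates` helper are hypothetical and are not taken from STEP.

```python
# Hypothetical pair scores on an assumed 0-100 scale.
pair_scores = {
    ("C1", "C2"): 95.0,
    ("C1", "C4"): 82.5,
    ("C2", "C3"): 61.0,
}


def possible_duplicates(scores: dict, create_threshold: float) -> list:
    """Return the pairs whose score is strictly above the threshold."""
    return sorted(pair for pair, s in scores.items() if s > create_threshold)


strict = possible_duplicates(pair_scores, 90.0)   # fewer, more certain pairs
lenient = possible_duplicates(pair_scores, 60.0)  # more pairs to review manually
```

A higher threshold yields fewer candidates for manual review but may miss true duplicates, while a lower one surfaces more pairs at the cost of more manual rejections.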

Note: Identify Duplicates uses many of the same Workbench and Web UI tools as the Match and Link match action.

Identify Duplicates in Workbench

For more information, refer to the Match and Link in Workbench topic here.

Identify Duplicates in Web UI

The Web UI supports actions on identified duplicates, as defined in these topics:

  • Potential Duplicates List topic here
  • Merging Confirmed Matches topic here
  • Confirmed Matches Component topic here