Match and Merge

A match and merge solution takes ownership of the data and is well suited to data hub implementations with any degree of centralized or decentralized data management.

For details on configuration, refer to the Match and Merge Traceability topic (here) and the Configuring Match and Merge topic (here).

In the following sections, an example of maintaining customer records in a match and merge solution is used to explain the match and merge data functionality.

Data Model

Unlike the Match and Link Match Action, the Match and Merge Match Action does not use separate object types for the source record and the golden record. Instead, the source system is registered as an entity, and the source relation is modeled as a reference from the golden record to that source system.
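The data model described above can be pictured with a minimal Python sketch. It is not STEP configuration or a STEP API. The class names (SourceSystem, GoldenRecord), the field names, and the sample identifiers are all hypothetical and only illustrate the idea of a single golden record object that references the source systems it was built from.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SourceSystem:
    """A source system registered as an entity (hypothetical shape)."""
    system_id: str


@dataclass
class GoldenRecord:
    """One customer object that holds merged values and source references."""
    record_id: str
    values: dict[str, str] = field(default_factory=dict)
    # Maps each contributing source system to that system's own key
    # for the customer. There is no separate source record object.
    source_refs: dict[SourceSystem, str] = field(default_factory=dict)

    def add_source(self, system: SourceSystem, source_key: str) -> None:
        """Record that this golden record has a counterpart in `system`."""
        self.source_refs[system] = source_key


# Illustrative data only.
crm = SourceSystem("CRM")
erp = SourceSystem("ERP")
golden = GoldenRecord("GR-1", {"name": "Ada Lovelace"})
golden.add_source(crm, "crm-001")
golden.add_source(erp, "erp-778")
```

In this sketch the customer exists once, as the golden record, and each source system appears only as a reference on it.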

Note: When consolidating data, you must use the match and merge solution, not the match and link solution.

Information Flow

When a customer record is created or updated in an external system, the update is delivered to STEP via either a web service endpoint or an IIEP.

In both cases, the incoming source record is matched against the existing golden records, and if a match is found, the information from the source record is merged into the relevant golden record using survivorship rules. If this results in updated information, the customer record can be exported back to all external systems. In this way, an update to the customer record in any system can be automatically managed for trust and timeliness. This ensures the best possible view of the customer record is reflected across the entire ecosystem.
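As a rough illustration of merging a matched source record into a golden record, the Python sketch below applies one assumed survivorship rule: for each attribute, the most recently updated value wins. The actual survivorship rules are configured on the matching algorithm and may differ. The names Value and merge_most_recent, and the sample data, are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Value:
    """An attribute value plus the time it was last updated (assumed shape)."""
    value: str
    updated: datetime


def merge_most_recent(golden: dict[str, Value], source: dict[str, Value]) -> bool:
    """Merge `source` into `golden` in place.

    Assumed rule: a source value replaces the golden value when the golden
    record has no value for the attribute, or when the source value is both
    newer and different. Returns True if the golden record changed.
    """
    changed = False
    for attr, incoming in source.items():
        current = golden.get(attr)
        is_newer_and_different = (
            current is not None
            and incoming.updated > current.updated
            and incoming.value != current.value
        )
        if current is None or is_newer_and_different:
            golden[attr] = incoming
            changed = True
    return changed


# Illustrative data only.
golden = {
    "email": Value("old@example.com", datetime(2023, 1, 1)),
    "phone": Value("555-0100", datetime(2024, 6, 1)),
}
source = {
    "email": Value("new@example.com", datetime(2024, 3, 1)),
    "phone": Value("555-0199", datetime(2024, 1, 1)),
}
changed = merge_most_recent(golden, source)
```

Here the newer email replaces the old one while the golden record keeps its more recent phone number. The returned flag stands in for the "updated information" condition that would trigger an export back to the external systems.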

When a user updates the customer record in STEP, the update takes place on the golden record itself, and the new trusted record can be exported in the same way as before.

The matching process uses a 'match score' to indicate the likelihood of a match, and two thresholds separate the scores into three groups. For more information on match scores, refer to the Match Scores topic here.

  • A match score above the auto merge threshold (the highest threshold) is considered a match and the system automatically merges the data. During import, this results in the incoming data being merged directly into the existing golden record. If updates make two existing records match above the auto merge threshold, the matching algorithm declares one of the records as the 'survivor' and deactivates the other record. Information from the incoming or deactivated record is merged into the surviving record based on the survivorship rules set on the matching algorithm.

  • A match score between the clerical review threshold and the auto merge threshold indicates a possible match. The two records are sent to the clerical review workflow so a user can determine whether they match. The data steward either confirms that the two records are duplicates and merges them, or confirms that they are not duplicates and should be kept separate going forward.

  • A match score below the clerical review threshold (the lowest threshold) is considered a non-match.
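The three groups above can be sketched as a small Python function. This is an illustration only, not STEP code. The names MatchOutcome and classify and the threshold values 70 and 90 are hypothetical. The text does not say how a score exactly equal to a threshold is handled, so routing those scores to clerical review is an assumption of this sketch.

```python
from enum import Enum


class MatchOutcome(Enum):
    AUTO_MERGE = "auto merge"
    CLERICAL_REVIEW = "clerical review"
    NON_MATCH = "non-match"


def classify(score: float, clerical_threshold: float,
             auto_merge_threshold: float) -> MatchOutcome:
    """Place a match score into one of the three groups."""
    if clerical_threshold > auto_merge_threshold:
        raise ValueError("clerical review threshold must not exceed "
                         "the auto merge threshold")
    if score > auto_merge_threshold:
        # Above the highest threshold: merged automatically.
        return MatchOutcome.AUTO_MERGE
    if score < clerical_threshold:
        # Below the lowest threshold: treated as a non-match.
        return MatchOutcome.NON_MATCH
    # Everything else, including scores equal to a threshold
    # (an assumption), goes to a data steward for review.
    return MatchOutcome.CLERICAL_REVIEW


# Illustrative thresholds only.
outcomes = [classify(s, clerical_threshold=70, auto_merge_threshold=90)
            for s in (95, 80, 40)]
```

With the illustrative thresholds, a score of 95 is merged automatically, 80 is sent to clerical review, and 40 is treated as a non-match.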

As golden records are created or updated, the matching event processor continuously compares the golden record to other golden records in the system.

Even in the best organizations, accidents happen. When two records are merged accidentally, STEP has tools to help resolve the issue. In a data hub that is closely integrated with a multitude of source systems, the unmerge process may require a range of workflow activities in addition to the actual unmerge in the Web UI. The Web UI unmerge uses the original source records from the source systems, the revision history, and the match algorithm survivorship rules to help the user determine which values belong to which record during an unmerge.

For detailed charts and explanations of how information flows in a match and merge solution, refer to the Match and Merge Flow Details topic here.