IIEP - Configure Match and Merge Importer Processing Engine
The STEP Match and Merge Importer plays an integral role in the Matching and Merging deduplication solution, and is responsible for importing, processing, and standardizing data, and ultimately matching that data with existing golden records. Post-processing is not supported by the Match and Merge Importer.
Match and Merge Importing Process and Logic
Data can flow into STEP via an asynchronous inbound integration endpoint (IIEP). The IIEP is designed to receive large batches of source records from any of a number of Receiver plugins.
The incoming source data is translated into STEPXML import files. These input files are typically handled one at a time in sequence, according to the parallel settings of the IIEP queue, as defined in the IIEP - Configure Endpoint topic of the Data Exchange documentation here. The result of the import operation is logged in workbench on the IIEP configuration's Background Processes tab and on the background process execution log.
Any failed records are stored on the BGP in a separate error file which allows the failed updates to be reattempted when errors have been corrected.
For details on configuring a data exchange method, either asynchronous or synchronous, for match and merge, refer to the Configuring the Match Data Exchange Method topic in the Matching, Linking, and Merging documentation here.
Match and Merge IIEP Configuration
Configuration of the STEP Match and Merge Importer begins on the 'Configure Endpoint' step of the Inbound Integration Endpoint wizard.
- On the Configure Endpoint step, click the 'Processing Engine' dropdown and select 'STEP Match and Merge Importer.'
Note: For this processing engine, the Transactional settings and Maximum number of waiting processes parameters are pre-configured and cannot be changed.
For details on how the other parameters in this step may be configured, refer to the IIEP - Configure Endpoint documentation here.
- On the Configure Pre-processor step, an existing import configuration can be used to transform Excel / CSV files into STEPXML, so that the processing engine can read those files.
- For the 'Configure Pre-processor' dropdown, select 'Transformation by Import Configuration.'
When using this option, the relevant import configuration may contain Excel / CSV mappings to STEPXML. If these files contain columns for Source System and Source Record ID, then the Reference Meta-Data option must be used to map the Source System to the relevant References Type, and the Source Record ID must be mapped to the relevant meta-data attribute on the Reference Type in question.
- For the 'Import Configuration' parameter, click the ellipsis button () and browse or search for an import configuration to use for transforming incoming Excel / CSV files.
- For the 'Allow open ZIP file' parameter, select 'Yes' from the dropdown to allow the importer to open ZIP files and convert the files within.
If STEP IDs are not typically available in your import files, it is recommended to use an import configuration that can create entity-to-entity references via Source Record ID.
For more information on Excel / CSV imports, refer to the Entity Reference via Source Record ID - Map Inbound section of the Data Exchange documentation here.
For more information on XML imports, refer to the Source Record ID in STEPXML section below.
Note: If the background process with this pre-processor fails and you want to restart it, the number of items mapped by import configuration can increase according to the previous run. To avoid this, disable the inbound integration endpoint, enable it again, and restart the import from the configured receiver.
- On the Configure Processing Engine step, business actions can be configured to process the temporary STEP source objects in order to standardize the data, which allows for more accurate matching.
For both of the parameters, click the link again to add items. Execution is performed in the order displayed. Click the ellipsis button () to change an item, click the arrows to change the order, and click the X button to remove an item.
-
If a business action is required, click the Add action link and browse or search for the relevant business action.
Important: The selected business action is called on the incoming data in a non-persistent form. Many of the operations available in the API are not applicable to non-persistent objects and will fail. You can successfully use operations related to reading and modifying attribute values, references, and data containers, however, approval and workflow-related operations cannot be used successfully.
-
In the 'Matching Algorithms' parameter, click the Add Matching Algorithm link and browse or search for a matching algorithm. The matching algorithms configured are used to match the inbound source records against existing golden records and invoke survivorship rules.
Source Record ID in STEPXML
The Source Record ID of a given record is stored in the Source Record ID Attribute of that record. This attribute is a metadata attribute placed on the record’s Source Relation Reference, which is a reference between the source record and the Source System in question.
The following is a STEPXML example of the relevant tags:
<Entity>
<Name>Carl Lucas</Name>
<Values>
<Value AttributeID="FirstName">Carl</Value>
<Value AttributeID="MiddleName"></Value>
<Value AttributeID="LastName">Lucas</Value>
<Value AttributeID="Email">l.carl@email.com</Value>
<Value AttributeID="PhoneNo"></Value>
</Values>
<EntityCrossReference EntityID="SourceSystem_Test" Type="MergeSourceRelation">
<MetaData>
<Value AttributeID="SourceRecordID">SST_00001</Value>
</MetaData>
</EntityCrossReference>
<EntityCrossReference Type="CustomerToOrganization">
<Entity>
<EntityCrossReference EntityID="SAP_Test" Type="MergeSourceRelation">
<MetaData>
<Value AttributeID="SourceRecordID">SSB_00002</Value>
</MetaData>
</EntityCrossReference>
</Entity>
</EntityCrossReference>
</Entity>