Matching Agents for the MLMR

The matching agent stores the data steward's Clerical Review decisions and uses them to train a machine-learning model in a training background process (BGP). Based on the trained model, a recommendation BGP then provides merge and reject recommendations for all tasks in the Clerical Review Task List.

These processes are described below.

Prerequisites

  1. Identify or configure a matching algorithm on which the Machine Learning Match Recommendations (MLMR) can function. For more information, refer to the Configuring Matching Algorithms topic.

  2. Configure the REST gateways for the matching agent. For more information, refer to the Configuring the Stibo Aspire REST Gateway topic.

  3. Configure the matching agent object type. For more information, refer to the Configuring the Matching Agent Object Type topic.

Create Matching Agents

  1. In System Setup, navigate to the setup group used for matching algorithms, right-click the parent node, and select 'New Matching Agent.'

  2. In the Create Matching Agent dialog, set the name and ID.

  3. Click the ellipsis button (...) next to the Matching Algorithm parameter to select a matching algorithm.

    Note: Only matching algorithms using embedded match codes can be used.

  4. For the Gateway Integration Endpoint parameter, click the ellipsis button (...) and select the Stibo Aspire Gateway endpoint.

  5. Right-click the matching agent and select Enable Matching Agent.

    Note: Only one matching agent per matching algorithm can be enabled at a time. To use a different matching agent on the matching algorithm, first disable the active one, then enable the new one.

Using and Monitoring the Matching Agent

Once you have configured the matching agent, various statuses and statistics are available.

  • Enabled - Displays whether the matching agent is enabled or not. The matching agent collects clerical review decisions as long as it is enabled, regardless of what the Agent Status is.
  • Agent Status - Displays whether the matching agent is 'Running,' 'Stopped,' 'Failed,' or 'Failed (retrying).' This status reflects the result of the training and recommendation BGPs.

    • Running - The matching agent is running.

    • Stopped - The matching agent has not yet been enabled or has been disabled by a user.

    • Failed - The matching agent has stopped because of a failure.

    • Failed (retrying) - When the matching agent runs the recommendation BGP and a FailAndRetryException error (or a connectivity error) is thrown, the matching agent enters the 'Failed (retrying)' state. This state is used for errors, such as connectivity issues, that should not stop the process and prompt review by the user. Instead, the system attempts to self-recover and restarts the process once the issue is resolved. Each retry restarts the entire training or recommendation process. The system retries every minute for 2 hours, then every 10 minutes until the process succeeds, is manually ended, or a month passes, at which point the process fails.

  • Processing Comment - If the training or recommendation BGP fails due to a connectivity error, the matching agent goes into the 'Failed (retrying)' state, and the Processing Comment displays a message with the first and last failures, the number of retries performed, and the next scheduled retry.

  • Training Status - A matching agent is initially 'Untrained,' and has not yet provided recommendations. After its first successful training, the Training Status displays 'Trained.'

  • Training Statistics - Displays statistics for the most recent training, including the total merge / reject and advanced merge decisions, as well as the time of training.
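
The retry timing described for the 'Failed (retrying)' state can be illustrated with a short Python sketch. This is not the product's implementation. The function and constant names are hypothetical, "a month" is assumed to mean 30 days, and the switch to the slower interval is assumed to happen at exactly the 2-hour mark.

```python
from datetime import timedelta
from typing import Optional

# Intervals taken from the 'Failed (retrying)' description; names are hypothetical.
FAST_INTERVAL = timedelta(minutes=1)    # retry every minute ...
FAST_PHASE = timedelta(hours=2)         # ... for the first 2 hours
SLOW_INTERVAL = timedelta(minutes=10)   # then retry every 10 minutes
GIVE_UP_AFTER = timedelta(days=30)      # assumption: "a month" treated as 30 days


def next_retry_delay(elapsed: timedelta) -> Optional[timedelta]:
    """Return the wait before the next retry, given the time elapsed since
    the first failure, or None once the process should be marked as failed."""
    if elapsed >= GIVE_UP_AFTER:
        return None
    if elapsed < FAST_PHASE:
        return FAST_INTERVAL
    return SLOW_INTERVAL
```

For example, five minutes after the first failure the sketch returns a one-minute delay, after three hours it returns a ten-minute delay, and after 30 days it returns None to indicate that the process fails.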

You can change the matching algorithm when the matching agent is disabled. If existing decisions are already stored on the matching agent, ensure that you use the same Golden Record object type. You can also change the gateway integration endpoint when the matching agent is disabled; however, under normal circumstances this is not necessary.

The matching agent runs the following BGPs to train the model and provide recommendations in the Clerical Review Task List.

Training process

The training BGP uses the data steward's Clerical Review decisions to train a machine-learning model. The BGP first starts once the data steward has made at least 30 merge and 30 reject decisions. After that training completes, a 10 percent increase in the number of decisions triggers a new training BGP. Once the training BGP finishes, the recommendation process starts automatically.
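
The trigger rule above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the product's logic: the function name and parameters are hypothetical, the 10 percent increase is assumed to be measured against the total number of merge and reject decisions at the time of the last training, and advanced merge decisions are ignored.

```python
from typing import Optional

MIN_MERGE = 30    # minimum merge decisions before the first training
MIN_REJECT = 30   # minimum reject decisions before the first training


def should_start_training(
    merges: int,
    rejects: int,
    decisions_at_last_training: Optional[int] = None,
) -> bool:
    """Decide whether a training BGP would be triggered (hypothetical helper)."""
    if decisions_at_last_training is None:
        # No training has run yet: both minimums must be reached.
        return merges >= MIN_MERGE and rejects >= MIN_REJECT
    # Assumption: retrain once the total has grown by at least 10 percent.
    # Integer arithmetic avoids floating-point rounding at the boundary.
    growth = (merges + rejects) - decisions_at_last_training
    return growth * 10 >= decisions_at_last_training
```

With 30 merge and 29 reject decisions the first training is not yet triggered. If the last training ran on 60 decisions, a total of 66 decisions triggers retraining under this assumption, while 65 does not.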

Recommendation process

The recommendation BGP processes the remaining Clerical Review tasks and provides merge or reject recommendations for them based on the trained machine-learning model.

Once the training and recommendation BGPs are complete, you can view recommendations in the Clerical Review Task List of your Web UI. For more information, refer to the Adding Match Recommendations to a Clerical Review Task topic.

On the recommendation BGP, you can download a details.zip attachment file. The file contains record pair details of the Clerical Review tasks with certain machine-learning related values. This is for Stibo Systems to analyze the merge / reject recommendations on the individual Clerical Review tasks, should the need arise.

Decision cleanup process

Clerical review merge / reject actions store a copy of merged and rejected golden records on the matching agent associated with the matching algorithm that owns the clerical review task. This information is used for training the matching agent.

When golden records are purged from STEP, any copies stored on matching agents will also be purged. This process of cleaning up the decision data is scheduled to run as a BGP every seventh day.

Important: Any future training BGPs will not contain purged decisions. As a result, the recommendations generated after the training process will be different because of the lack of this training data, and all previous merge / reject recommendations based on these purged decisions will be lost.

Manually Do Training

You can manually initiate the training process at any time; however, this should only be done in special scenarios. Stibo Systems does not recommend manual training.

Important: Manual training before the minimum required decisions have been made can result in less accurate recommendations.

To manually perform the training and get new up-to-date merge / reject recommendations, right-click the matching agent and select 'Do Training.'

After the training background process has finished, the recommendation background process automatically starts. Then the data steward begins receiving recommendations.

Once the matching agent is configured and training is complete, users can view recommendations on the Clerical Review Task List of their Web UI. For more information, refer to the Adding Match Recommendations to a Clerical Review Task topic.

Manually Do Recommendations

The matching agent has a right-click action 'Do Recommendations.' This manually starts the background process of getting merge / reject recommendations for all tasks in the Clerical Review Task List, based on the existing training. The Matching Event Processor ensures that all tasks are updated with an up-to-date recommendation. This action should only be used in special cases, such as when the recommendation process failed to provide recommendations on all tasks.

Note: After you select the 'Do Training' action, the background process of getting merge / reject recommendations for all tasks in the Clerical Review Task List starts automatically, so the 'Do Recommendations' action does not need to be selected manually.