Viewing Project Home Screen

Let's circle back to the Project Creation Flow. After the user has mapped the details and clicked "Run," the Project is sent as a job to the Cluster. It may take up to a minute for the Job to move to the processing queue and for the progress display to appear on the screen. Once the Job starts, the user can follow its progress through the various stages of the Resolve Project via progress bars with text information across the result areas.

"Resolve Jobs" are complicated and computation-intensive and may run in the background for hours. The running of the Resolve Project model for the first time is an "Unsupervised Model Run" as it runs without the help/supervision of the user through the "Training Tasks."

Once the "Unsupervised Model" has run, the Project Dashboard shown below will automatically populate. Users can see a few sets of key stats and metrics as shown below across the marked areas:

  1. Manage Project Tasks: The system generates four types of Tasks to assist the user in training the ML model further:

    • Match Training: These are learning tasks that can improve the Entity Resolution model. The Entity Resolution model is where the system groups instances across and within Datasets that appear to contain matching attribute values.

    • Merge Training: These are learning tasks to improve the "Golden Record" creation (merging) model. The "Mastering" (or Merging) model is where the system merges the best candidates for each attribute from the cluster to create a Master record.

    • Match Fixing: These appear only once the System has reached a threshold where further training is unlikely to improve results. They are remediation tasks in which the System asks users to manually fix the lowest-confidence matching results.

    • Merge Fixing: These appear only once the System has reached a threshold where further training is unlikely to improve results. They are manual remediation tasks in which the System asks users to fix the lowest-confidence merging results.

      Besides the count of tasks, users can see how many have the "Completed," "In Review," or "Pending Approval" status. This gives an overall perspective of how much work is still required to move the project closer to completion.

  2. Entities Resolved: The results of the Entity Resolution (matching) model are displayed here. Users can view the confidence score stats, the number of model runs, and summary stats on how much compression took place. Compression represents the ratio of original records to unique entities; for example, if 500 unique entities are found from a set of 1,000 records, the Compression Ratio is 2:1 (see the sketch after this list). From here, the Project's users (Admin, Reviewer, Approver) can directly access the "Match Training" tasks by clicking the "Train Model" icon or the "Match Fixing" tasks by clicking "Fix Issues."

  3. Entity Records Mastered: The "Golden Record" creation (Merging) results are available here. The user can view the confidence score stats and the number of model runs. Project users (Admin, Reviewer, Approver) can similarly access the "Merge Training" or "Merge Fixing" tasks by clicking "Train Model" and "Fix Issues," respectively.
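The Compression Ratio shown in the "Entities Resolved" area can be computed directly from the record and entity counts. Below is a minimal, purely illustrative Python sketch (the compression_ratio helper is hypothetical and not part of the product) that reproduces the 1,000-records-to-500-entities example above:

```python
def compression_ratio(original_records: int, unique_entities: int) -> str:
    """Hypothetical helper: ratio of original records to resolved unique entities."""
    if unique_entities <= 0:
        raise ValueError("unique_entities must be a positive count")
    ratio = original_records / unique_entities
    return f"{ratio:g}:1"

# Example from the text: 1,000 original records resolve to 500 unique entities.
print(compression_ratio(1_000, 500))  # -> 2:1
```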

By clicking the eye-glass icon marked (2) in the image above, the user is redirected to the "View Resolution Results" screen, which is covered in detail in this section. Additionally, after training the model, the System will display a set of generated "Training Tasks" on the left-hand panel. Similarly, the mastered records can be viewed by clicking the eye-glass icon marked (3).

Both the "Entity Resolved" and "Entity Mastered" sections will show a line chart indicating the score progression through multiple runs. This indicates the project users need to run the Project more than once, completing the tasks as they go along, strengthening the model learning and consequent confidence score of predictions.