Advanced joins

Note: This functionality is in alpha. Callers must include the cdf-version: alpha header in their requests.
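As a minimal sketch, the header can be built once and reused for every request. Only the cdf-version: alpha header comes from this page; the helper name alpha_headers, the bearer-token authorization scheme, and the placeholder token are assumptions made for illustration.

```python
def alpha_headers(token: str) -> dict:
    """Build request headers that opt in to alpha functionality."""
    return {
        "Authorization": f"Bearer {token}",  # assumption: bearer-token auth
        "Content-Type": "application/json",
        "cdf-version": "alpha",  # required while advanced joins are in alpha
    }


# Placeholder token; pass the resulting dict as the headers of each request.
headers = alpha_headers("<your-token>")
```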

Concepts

Advanced joins

An advanced join is an abstraction for matching data entities that enables subject-matter experts to feed their knowledge into the matching process. Advanced joins work natively on top of Cognite Data Fusion data modeling capabilities.

In this first iteration (alpha version), advanced joins support finding candidate matches for direct relations in views, collecting input on those candidates, and populating the direct relations.

Matches

After creating the main advanced join object, the caller can supply truth values by creating matches, which are attached to that advanced join object.

A match indicates that two data model instances should be linked, by materializing the start and end of the direct relation between the two linked instances.
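To make the shape of a match concrete, here is a small sketch of how one could be modeled on the client side. The class and field names (InstanceRef, Match, start, end) and the example identifiers are hypothetical and are not the API's schema; the sketch only assumes that a data model instance is identified by a space and an external ID.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceRef:
    """Identifies a data model instance (hypothetical client-side type)."""
    space: str
    external_id: str


@dataclass(frozen=True)
class Match:
    """A truth value: `start` should link to `end` through the direct relation."""
    start: InstanceRef
    end: InstanceRef


# Example identifiers are made up for illustration.
match = Match(
    start=InstanceRef(space="assets", external_id="pump-101"),
    end=InstanceRef(space="equipment", external_id="P-101"),
)
```

Frozen dataclasses make each match hashable, so a set of matches can be de-duplicated before it is sent to the service.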

Matchers

The main feature of advanced joins is to run several matching processes and pick the best results to write back to the original data model. Matcher objects in the API describe these matching processes. For now, only Raw matchers are supported: the caller writes the output of a matching process of their choice, for example Entity Matching run with different parameters.

Jobs

The API provides several capabilities once advanced joins and matches are created, such as:

  • measuring the proportion of mapped instances in the advanced join's view,
  • estimating the quality of a matcher using the matches (truth values) as reference,
  • suggesting data quality improvements by predicting which unmatched instances are most likely to increase data quality if matched,
  • running an advanced join, which populates the original instances' direct relation property with the best combination of matcher results and matches.
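The sketch below illustrates three of these jobs on in-memory data: measuring coverage, scoring a matcher against the matches, and running the join. It is not the service's algorithm. The function names, the per-result score, and the rule "matches win, otherwise take the highest-scoring matcher result" are all assumptions chosen to keep the example small. The suggestion job is not covered.

```python
# Hypothetical data: keys are start instances, values are end instances.
current = {"a": "X", "b": None, "c": None, "d": None}  # existing direct relations
matches = {"a": "X", "b": "Y"}                         # truth values from experts
matcher_results = {                                    # matcher -> {start: (end, score)}
    "m1": {"a": ("X", 0.9), "b": ("Z", 0.6), "c": ("W", 0.7)},
    "m2": {"a": ("X", 0.8), "b": ("Y", 0.5), "c": ("V", 0.4)},
}


def coverage(relations):
    """Proportion of instances whose direct relation is populated."""
    if not relations:
        return 0.0
    mapped = sum(1 for end in relations.values() if end is not None)
    return mapped / len(relations)


def matcher_accuracy(results, truth):
    """Share of truth values the matcher reproduces, or None if none overlap."""
    scored = [start for start in truth if start in results]
    if not scored:
        return None
    correct = sum(1 for start in scored if results[start][0] == truth[start])
    return correct / len(scored)


def run_join(relations, truth, all_results):
    """Assumed rule: matches win; otherwise use the highest-scoring result."""
    joined = dict(relations)
    for start in joined:
        if start in truth:
            joined[start] = truth[start]
            continue
        candidates = [res[start] for res in all_results.values() if start in res]
        if candidates:
            joined[start] = max(candidates, key=lambda c: c[1])[0]
    return joined


joined = run_join(current, matches, matcher_results)
print(coverage(current))                                  # 0.25
print(matcher_accuracy(matcher_results["m1"], matches))   # 0.5
print(matcher_accuracy(matcher_results["m2"], matches))   # 1.0
print(joined)            # {'a': 'X', 'b': 'Y', 'c': 'W', 'd': None}
print(coverage(joined))                                   # 0.75
```

Instance "d" stays unmapped because neither a match nor a matcher result exists for it, which is why coverage ends at 0.75 rather than 1.0.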