Creating a Dedupe+ Job
A Dedupe+ job applies a Dedupe+ rule across either all records or a defined subset in order to locate and resolve duplicates across the whole system.
Dedupe+ jobs are stored as entities. When you save a job, it communicates with the Data8 servers, which then process the job accordingly.
Working down the entity in order, the fields are as follows:
- Name: This is a friendly name for your reference only.
- Owner: This is the owner of the job, again for your reference only.
- Dedupe Ruleset: This is a lookup to the rule set you would like to run against the data.
- Automerge: This is a boolean option that enables the results to be merged automatically. It defaults to "No" and should only be set to "Yes" once you are fully happy with your Dedupe+ and Merge+ rules. If you are running a job with Automerge enabled, we always recommend creating a backup beforehand, as we will not be able to revert the changes.
- Filtered Master View: Duplicare defines the master as the pot of records in which duplicates will be looked for. This setting lets you reduce that pot so you only look for duplicates within a subset, for example only within your "Active Contacts". The drop-down box for Filtered Master View is populated by both system views and your user views. If left blank, it
- Filtered Candidate View: Duplicare defines the candidates as the pot of records that we would like to check for duplicates. This setting lets you reduce that pot, for example to find duplicates of "Leads created on last 24 hours". The drop-down box for Filtered Candidate View is populated by both system views and your user views.
- Store Duplicate ID: This is optional and defaults to "No". If you select "Yes", the results of the Dedupe+ job will populate the "Duplicate Detected ID" field on the respective entity, enabling you to carry out duplicate analysis in a data tool of your choice. Setting this to "Yes" will slow the job down, as Duplicare has to apply an update to all affected records. It will also change the modifiedon values of those records.
- Assignment Strategy: Once the Dedupe+ Job is complete and groups of duplicates have been identified, one or more people will need to work through those groups to merge the records. If you have a large number of duplicates and need to split this work across multiple people, use this setting to have the groups automatically assigned to the relevant users for review.
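To make the difference between the master and candidate pots concrete, the following Python sketch checks each candidate record against the master pot. It is purely illustrative and is not how Duplicare itself works: the record layout, the `find_duplicates` and `normalise` helpers, and the email-only matching rule are all assumptions made for this example.

```python
def normalise(email):
    """Lower-case and trim an email so trivial differences do not hide a match."""
    return email.strip().lower()


def find_duplicates(master, candidates):
    """Return {candidate id: [matching master ids]} (hypothetical email-only rule)."""
    # Index the master pot by normalised email.
    index = {}
    for record in master:
        index.setdefault(normalise(record["email"]), []).append(record["id"])

    matches = {}
    for candidate in candidates:
        hits = [
            master_id
            for master_id in index.get(normalise(candidate["email"]), [])
            if master_id != candidate["id"]  # a record is never its own duplicate
        ]
        if hits:
            matches[candidate["id"]] = hits
    return matches


# Master pot: for example, the records returned by an "Active Contacts" view.
master = [
    {"id": "C1", "email": "jane@example.com"},
    {"id": "C2", "email": "sam@example.com"},
]
# Candidate pot: for example, recently created records.
candidates = [
    {"id": "L1", "email": " Jane@Example.com "},
    {"id": "L2", "email": "new@example.com"},
]
result = find_duplicates(master, candidates)
```

Shrinking either list, which is what the two filtered views do, reduces the number of comparisons and the number of groups you need to review.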
Once you press save, the record is saved with a default status reason of "Pending". It can take our system up to 5 minutes to pick up the job. You'll know we have picked it up when the status changes to "Processing".
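If you want to watch a job from a script rather than refreshing the record, the status progression can be sketched as a simple polling loop. This is a minimal sketch under stated assumptions: `wait_for_job` and the `get_status` callable are hypothetical names, and how the status reason is actually read from your environment is deliberately left out.

```python
import time


def wait_for_job(get_status, poll_seconds=60, timeout_seconds=3600, sleep=time.sleep):
    """Poll a hypothetical get_status() callable until the job reports "Completed".

    Assumption: "Completed" is the only terminal status handled here; any other
    value (such as "Pending" or "Processing") means "keep waiting".
    """
    waited = 0
    while True:
        status = get_status()
        if status == "Completed":
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(f"Job still '{status}' after {waited} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


# Simulated status sequence standing in for a real lookup of the job record.
statuses = iter(["Pending", "Pending", "Processing", "Completed"])
final_status = wait_for_job(lambda: next(statuses), poll_seconds=0, sleep=lambda s: None)
```

Because pick-up can take up to 5 minutes, a polling interval of about a minute is a reasonable starting point.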
Processing the Results
Once the job is complete, the status will change to "Completed". If you open the Dedupe+ job record, you'll see that a new "Results" tab is now visible.
Within this Results tab, you have some statistics and some groupings of duplicates. The top line figures are for your reference and will help you understand the level of duplicates you have within your data. Additionally, the "Potential GDPR Red Flags" count highlights groups of duplicates that contain a conflict of consent, for example where one record is set to "Allow" emails and another is set to "Do Not Allow". This only covers the out of the box marketing permission fields and at this time cannot be customised.
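The idea behind a consent conflict can be sketched in a few lines of Python. This is an illustration only, not Duplicare's logic: the `email_consent` field name and both helper functions are hypothetical, and a real check would cover each of the out of the box marketing permission fields.

```python
def has_consent_conflict(group, fields=("email_consent",)):
    """Return True if the records in a group disagree on any consent field."""
    for field in fields:
        # Collect the distinct values that are actually set for this field.
        values = {record[field] for record in group if record.get(field) is not None}
        if len(values) > 1:
            return True
    return False


def count_red_flags(groups):
    """Count the groups that contain at least one consent conflict."""
    return sum(1 for group in groups if has_consent_conflict(group))


groups = [
    # Conflict: one record allows emails, the other does not.
    [{"id": "A", "email_consent": "Allow"}, {"id": "B", "email_consent": "Do Not Allow"}],
    # No conflict: both records agree.
    [{"id": "C", "email_consent": "Allow"}, {"id": "D", "email_consent": "Allow"}],
]
red_flags = count_red_flags(groups)
```

Groups flagged this way deserve extra care when merging, because the surviving record can only keep one of the conflicting consent values.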
As well as the statistics, Duplicare will have imported "groups". A group is 2 or more records that we think are duplicates of each other. It is your job to work through those groups, decide whether each one really is a duplicate, and apply the merge. You have 2 options here:
- Pick a specific group from the bottom grid and double-click it to open the merging window.
- Select "Resolve Duplicates" on the top ribbon. This will open the very first group and will enable you to work through each group one by one and make your decisions.
Once you have made a decision on all groups, you'll see the groups have moved to "Solved Groups" with a status reason.
When you are merging records, you'll see a slightly different set of buttons at the bottom of the screen.
Here you can see how many groups you have remaining, and you can either commit the merge or decide that those records are not duplicates. If you decide that they are not duplicates and click skip, you'll be asked "Would you also like to create all exclusion rules for this group?". This controls whether you will be told about these same records if you run the job again. If you click "OK", you will not be told about these potential duplicates moving forward.
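One way to picture the effect of exclusion rules is as a set of record pairs that later runs should ignore. The Python sketch below assumes one rule per pair of records in the skipped group; that reading, along with the `exclusion_rules_for` and `filter_excluded` helpers, is an assumption for illustration and not a description of how Duplicare stores its rules.

```python
from itertools import combinations


def exclusion_rules_for(record_ids):
    """Build one order-independent rule for every pair of records in a group."""
    return {frozenset(pair) for pair in combinations(record_ids, 2)}


def filter_excluded(pairs, exclusions):
    """Drop candidate pairs that match an existing exclusion rule."""
    return [pair for pair in pairs if frozenset(pair) not in exclusions]


# A skipped group of three records yields three pairwise rules.
rules = exclusion_rules_for(["A", "B", "C"])

# On a later run, only pairs without a matching rule are reported again.
next_run = [("A", "B"), ("C", "A"), ("A", "D")]
remaining = filter_excluded(next_run, rules)
```

Using `frozenset` makes each rule independent of record order, so excluding A against B also excludes B against A.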