by johnt
9. August 2011 11:08
In the 1970s Littlewoods wanted to ensure they delivered only one catalogue to each of their million-plus customer addresses. This was done by printing out all the names and addresses of their shoppers on continuous stationery, then physically going through the list to look for duplicates. Some addresses had as many as six catalogues.
This took months to complete (I know, because I was one of the checkers), but it saved thousands of pounds. Back in those days there were only two campaigns a year, one in spring and another in autumn, so there was time to do this sort of thing.
Today, many commercial CRM and proprietary customer database systems include de-duplication tools, but most of these are very limited and will only detect exact duplicates. Data8’s system employs a number of powerful fuzzy matching techniques to quickly identify duplicates where the name or address is misspelt. Back in the 70s, the checkers of the data were the fuzzy logic, looking for similar names and addresses.
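To give a flavour of the difference between exact and fuzzy matching, here is a minimal sketch in Python. It is not Data8’s method, just an illustration using the standard library’s `difflib.SequenceMatcher`; the function names, the sample customers and the 0.85 threshold are all made up for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalise(record):
    """Join name and address, lower-case them and strip punctuation."""
    text = " ".join(record).lower()
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text)
    return " ".join(cleaned.split())


def similarity(a, b):
    """Return a score between 0 and 1; 1 means the normalised text is identical."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def find_duplicates(records, threshold=0.85):
    """Return index pairs of records that look alike (threshold is an assumption)."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(records), 2):
        if similarity(a, b) >= threshold:
            pairs.append((i, j))
    return pairs


# Hypothetical sample data: the first two rows are the same shopper.
customers = [
    ("John Smith", "12 High Street, Liverpool"),
    ("Jon Smith", "12 High St, Liverpool"),
    ("Mary Jones", "4 Church Lane, Chester"),
]

# An exact comparison misses the misspelt entry; the fuzzy one flags it.
exact = [(i, j) for (i, a), (j, b) in combinations(enumerate(customers), 2) if a == b]
fuzzy = find_duplicates(customers)
print(exact, fuzzy)
```

Comparing every record with every other record like this is fine for a handful of rows but far too slow for a million, so a real system would first group records (by postcode, say) and only compare within each group.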
Today it would take a few hours to de-duplicate a million records; back then you could only do the job for a couple of hours before your eyes started going fuzzy.