‘Millions, millions of dead people are voting!’ said Trump. Well, for starters, they aren’t voting, despite the claims. They will, however, be in a database.
As a consultant I see data sources that contain all sorts of stuff, and they have one thing in common: they are full of dirty data. In fact, when I’m talking to a customer, I have to tell them that it is a common problem. A lot of clients think the issue is unique to them. It’s not, so don’t worry, we know how to handle it. I normally tell them this:
‘The only time you’ll see a clean database is when it is empty and no users are entering data into it.’
Think about it: there are most likely millions of dead customers in the Amazon customer database, or in any user registration database. Facebook will become a massive virtual cemetery over the coming years. Having dirty data in your database can be a problem. While I was consulting for a bank on Payment Protection Insurance (PPI) claims and building their management information, we came across people being contacted about making a claim because they had held PPI in the past, even though they had since died, which can be (and was) upsetting for their surviving relatives.
Every year I get an Electoral Register form to confirm who the registered voters living at my address are. The rough data latency (the time between updates) could be a year, or even more, depending on when I move address. Dirty data isn’t the issue; it is normally your process for updating data that is.
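To make the latency point concrete, here is a rough sketch of how you might flag records that have not been confirmed within your update cycle. The `Record` class, the `last_confirmed` field and the 365-day threshold are all made up for illustration; your own data will have different names and a different acceptable age.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Record:
    # Hypothetical record shape: a name plus the date it was last confirmed.
    name: str
    last_confirmed: date


def stale_records(records, today, max_age_days=365):
    """Return the records last confirmed more than max_age_days before today."""
    cutoff = today - timedelta(days=max_age_days)
    return [r for r in records if r.last_confirmed < cutoff]


# Made-up sample data, purely for illustration.
records = [
    Record("Alice", date(2016, 10, 1)),
    Record("Bob", date(2014, 3, 15)),
]
today = date(2017, 1, 31)

for r in stale_records(records, today):
    # Report how many days have passed since the record was last confirmed.
    print(r.name, (today - r.last_confirmed).days)
```

A check like this doesn’t clean anything by itself. It only tells you which records your update process hasn’t reached recently, which is where I would start looking.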