
Good data, bad data

At my current employer, I have the opportunity to interact with the database that contains all of the records – names, emails, phone numbers, financial gifts, volunteering, groups, events, family history, and attendance – for all of the people that come our way.

This past week, I was registering a person for a class that is coming up in September. As I started searching for her record, I realized there were actually 3 records for the same person. All 3 had the same email address, but with varying pieces of other information (one had a mailing address, two had an age, etc.). One record had a child associated with it (Jane Doe). Once I had merged all 3 records into one, I wanted to add Jane Doe back into the household.

I then discovered that there were 2 records for Jane Doe, age 10, plus another record for Jane Smith, age 10 and same address, plus one more record for Jane Jones, different age, same address. In addition to that, there were Julie Doe, Julie Smith, and 2 records for Julie Jones, all with the same addresses, emails, ages, or combinations that linked back to the original parent. Plus, I found John Doe and 3 records for John Smith – not originally connected to the parent, but with the same address as some of the Jane and Julie records.

As I see it, there were 12 different records for 3 kids. This is certainly a case of data being entered without verification – caused either by the parent not letting the organization know that their personal information had changed, or by data entry being done without checking for existing identical or similar records.
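To make the "check before you add" idea concrete, here is a minimal sketch in Python of what that lookup could look like at data entry time. Everything in it is my own assumption for illustration: the `Person` fields, the `find_possible_duplicates` helper, and the matching rule (same email, or same first name at the same address) are hypothetical and are not how our actual database works.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical record shape; a real system would have many more fields.
    first_name: str
    last_name: str
    email: str = ""
    address: str = ""


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not hide a match."""
    return " ".join(value.lower().split())


def find_possible_duplicates(new: Person, existing: list[Person]) -> list[Person]:
    """Return existing records that might be the same person as `new`.

    Assumed rule: flag a record if the emails match, or if the first name
    and the address both match (the last name may differ between records).
    """
    matches = []
    for record in existing:
        same_email = bool(new.email) and normalize(new.email) == normalize(record.email)
        same_address = bool(new.address) and normalize(new.address) == normalize(record.address)
        same_first = normalize(new.first_name) == normalize(record.first_name)
        if same_email or (same_first and same_address):
            matches.append(record)
    return matches


existing_records = [
    Person("Jane", "Doe", "parent@example.com", "1 Main St"),
    Person("Jane", "Smith", "", "1 main  st"),
    Person("John", "Smith", "", "2 Oak Ave"),
]

new_entry = Person("Jane", "Jones", "", "1 Main St")
for candidate in find_possible_duplicates(new_entry, existing_records):
    print("Possible duplicate:", candidate.first_name, candidate.last_name)
```

The point of the sketch is that the person doing data entry gets shown the likely matches and decides whether to update one of them instead of creating a new record. A rule this loose will also flag people who really are different, so it should only suggest, never merge on its own.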

In this case, it is only 12 records, but I can imagine that over time this situation can cause some database bloat: records that are added for one reason or another, but never cross-checked against information that already exists. These 12 records for 3 kids can make it hard when you want to register them for an event – is it this one? Or this one?

I think my point in mentioning this here is that the front side of a database, the place where data entry happens, is important, maybe almost as important as the back side of the database, the place where rows and columns and data types all come together.

I may write/type more on this, but wanted to get the point out there before the week was over.

Thoughts?
