I had a cup of coffee with EMC Chief Data Governance Officer Barbara Latulippe recently. We talked about how more and more people tell us they have access to analytical sandboxes attached to a Data Lake but still can’t find the information they need.
Is this a Data Governance problem? A Skill problem? A Technology problem? A Tools problem?
The answer is yes, it’s all of those!
When you build a Data Lake you most likely have structured and unstructured data in it. For this post I’m only going to talk about the structured data because it’s the fastest and easiest to get value from, and a larger audience will benefit.
Biggest Complaint: I can’t find my data!
Reply: “You have everything you need. Why are you complaining?”
So what’s the problem?
Ok, many of us are used to using reporting tools and having nice clean flat tables fed from an EDW/GDW database. Now I have thousands or more tables with very little connection between them. I blogged about this problem before, likening it to dumping a bag of Legos on your desk and saying “Here you go”.
Keys to Success: