Today, I am going to talk about a tool that I have just created. The tool is here for discussion, but NOT for re-use or re-distribution. Please use your conscience and do not re-use this tool without my permission (which can be had for the asking).
The tool is called the Data Journey Diagram.
What
A graphical illustration that helps users understand how the data they generate is used, and helps BI designers spot the errors they should most expect in their data.
For
Primarily, the tool is for the end users who generate data and the analytics consultants who design the analytics solution for an organisation. The other stakeholders are the IT department, legal, security and the lines of business.
How
The tool is very simple. We put the end users, the MIS team, and the data analytics team in one room.
Then we create a meandering river shape on the board. The Analytics Consultant puts a simple report at the end of the line. Then, he puts in all the data elements that go into making that report.
Slowly, each stakeholder who owns a data point writes down where that data element comes from: all the processing points, and all the hands that the data element passes through. The river is used as a rough measure of time flow.
For each data element that comes from a different source, we draw a separate line at the point of convergence to indicate a contributing tributary. The journey then carries on backwards until we have a river system in place, indicating all the places where the data comes from, how it all comes together, and so on.
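If you want to record the finished board afterwards, the river system maps naturally onto a tree. Here is a minimal sketch in Python; the class name `Milestone`, its fields, the `sources` helper, and the sample report and data elements are all hypothetical, made up purely for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Milestone:
    """One point on the river that a data element passes through."""
    name: str
    owner: str
    # Upstream points feeding this one; more than one means tributaries converge here.
    tributaries: list[Milestone] = field(default_factory=list)


def sources(point: Milestone) -> list[str]:
    """Walk upstream and return the names of the points where data originates."""
    if not point.tributaries:
        return [point.name]
    found = []
    for upstream in point.tributaries:
        found.extend(sources(upstream))
    return found


# Hypothetical example: a report at the end of the river, fed by two tributaries.
order_form = Milestone("Order form", owner="Sales rep")
order_entry = Milestone("Order entry", owner="MIS team", tributaries=[order_form])
price_list = Milestone("Price list", owner="Finance")
report = Milestone(
    "Monthly sales report",
    owner="Analytics consultant",
    tributaries=[order_entry, price_list],
)

print(sources(report))  # ['Order form', 'Price list']
```

Like the diagram itself, this sketch only records where data comes from and where it converges. It says nothing about how the converging elements are combined.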
Ground Rules
At each stage, ONLY the stakeholder who owns the data at that point is allowed to create a “milestone rock” and name it. No one else.
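If the board is ever kept in software rather than on a wall, the same rule can be enforced with a simple check. This is only a sketch under assumed names: the `Board` class, its `add_rock` method, and the stages and owners below are hypothetical.

```python
class Board:
    """Hypothetical record of who owns the data at each stage of the river."""

    def __init__(self, owners):
        # owners maps a stage name to the stakeholder who owns the data there.
        self.owners = owners
        self.rocks = {}  # stage name -> name of the milestone rock

    def add_rock(self, stage, rock_name, placed_by):
        """Record a milestone rock, but only if the stage's owner places it."""
        owner = self.owners.get(stage)
        if owner != placed_by:
            raise PermissionError(f"Only {owner} may name the rock at '{stage}'")
        self.rocks[stage] = rock_name


board = Board({"order entry": "MIS team", "pricing": "Finance"})
board.add_rock("order entry", "Orders keyed in", placed_by="MIS team")

try:
    # The MIS team does not own the pricing stage, so this is rejected.
    board.add_rock("pricing", "Price lookup", placed_by="MIS team")
except PermissionError as err:
    print(err)
```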
Limitations
This tool does NOT:
A. Indicate dependencies.
B. Indicate and/or relationships.
C. Indicate mathematical or logical operations. We know that two data elements come together, but we do not know whether they are combined as averages or totals.
What is achieved
At the end of this exercise, end users are able to appreciate, in very simple terms, how their data errors impact decision making at the CEO level.
The analysts are able to see the sources of the data being used in the high-end analytics reports, and can talk to the stakeholders right there to find out if there are any anticipated data errors: time lag, fallacy, human error, etc. These are VERY important inputs for the analytics design team, because GIGO is as true today as it was when computers were invented.
So, what are your thoughts?