Updated: Sep 10, 2018
The growing challenge of big data has led to an exponential growth in the number of people using data visualization techniques to make sense of the data. Recently, I came across an article in Gizmodo on a collection of bad visualizations. In the effort to produce attractive infographics and diagrams, such visualizations risk undermining belief in the value that visualization brings to big data analytics. The idea of analyzing different kinds of data to gain insight has been around for quite a while, and it follows that the same holds true for data visualization. It is important to understand the goals of data visualization in order to deliver its potential value efficiently.
While the phrase data visualization might seem self-explanatory, I would like to stress that data visualization serves purposes greater than the name alone implies. These purposes fall under two broad categories: good data visualizations serve informatory needs and exploratory ambitions.
Informatory visualizations serve the purpose of reporting, where one measures underlying drivers (for example, customers, prospects, competitors, and market opportunity) over a period of time to identify how the enterprise is aligned. In other words, they help in visualizing the what and when of the information, thereby conveying complex data in a visually engaging manner.
Exploratory visualizations, on the other hand, help to understand the how and why of the information. These visualizations help identify relationships, correlations, patterns and models in data that were previously unknown. They serve an investigative purpose: they help answer why a particular situation has occurred and predict the risks involved in realigning to a pre-defined path or going down a new one. By facilitating interaction and engagement through deeper visual drill-downs, they make it possible to identify the threads that weave the data tapestry together.
Efficient data visualization is predicated on understanding the drivers that the data underpins. This requires knowledge of the context in which the data was collected and the audience that it is intended for. It is also important to ensure that the data quality and integrity are of acceptable standards before attempting to visualize the data and draw meaning from it. Garbage in will only result in garbage out. Furthermore, while using narratives to illustrate a trend in data is good practice in informatory visualizations, those narratives should be objective in nature so as not to influence the user's interpretation. Jim Stikeleather, in his article in the Harvard Business Review, explains how comprehension by the user is heavily dependent on the semantics of the visualization. Designer bias, which includes the choice of colors, design elements, chart types, and 2D or 3D effects, influences the user's interpretation, and care must be taken to ensure that such features, which are independent of the data, do not compromise the story the data actually tells. Edward Tufte created a formula to quantify a lie factor that shows how misleading a graph can be under the influence of some of these biases. It is calculated by dividing the size of the effect shown in the graphic by the size of the effect in the data.
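To make the lie factor concrete, here is a minimal Python sketch of the calculation. It assumes that the "size of the effect" is measured as the relative change between two values, which is the usual reading of Tufte's definition. The function names and the sample numbers (bar heights of 2 cm and 6 cm drawn for data values of 100 and 150) are hypothetical and chosen only for illustration.

```python
def relative_change(first, second):
    """Relative change from `first` to `second`, as a fraction of `first`."""
    if first == 0:
        raise ValueError("first value must be non-zero")
    return (second - first) / abs(first)


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Size of the effect shown in the graphic divided by the size of the
    effect in the data (assumption: both measured as relative change)."""
    data_effect = relative_change(data_first, data_second)
    if data_effect == 0:
        raise ValueError("the data shows no change, so the ratio is undefined")
    return relative_change(graphic_first, graphic_second) / data_effect


# Hypothetical example: the data grows from 100 to 150 (a 50% increase),
# but the bars representing it grow from 2 cm to 6 cm (a 200% increase).
factor = lie_factor(graphic_first=2.0, graphic_second=6.0,
                    data_first=100.0, data_second=150.0)
print(f"Lie factor: {factor:.2f}")  # prints "Lie factor: 4.00"
```

A value close to 1 indicates that the graphic represents the change in the data proportionately. In the hypothetical example above, the graphic overstates the change by a factor of four.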
Ultimately, good data visualization in the big data era should ensure that the story it presents enables insight-driven action. Data visualizations created with a clear design philosophy that incorporates the above guidelines stand a far better chance of being insightful, whether informatory or exploratory, and of encouraging action, thereby translating into greater return on investment.