This may seem an odd question: surely it just means turning data into a visually accessible form such as charts and diagrams? However, people have been drawing charts for hundreds of years and Edward Tufte published the seminal book “The Visual Display of Quantitative Information” way back in 1983; so why has the term ‘data visualisation’ become so important now? Has something changed and if so, what?
The answer is that computing has changed the data landscape completely, particularly in the last decade, affecting both the collection and the rendering of data. In the past, collecting, storing and analysing data was an expensive and time-consuming process. This situation has been transformed: technology allows for the automated collection of large amounts of data at little or no cost, via server monitoring; website tracking; and automated sensors. At the same time, the growth of computing power and sophisticated software has made the analysis of large data sets feasible and allowed analysis and presentation to be conducted in real time.
In the process, the nature and purpose of data visualisation have split into several distinct areas that deserve to be understood independently. In other words, ‘data visualisation’ is not one topic, but several, each with specific characteristics.
The most basic form of data visualisation is the oldest: the charting of data in order to communicate ideas. At its heart, the aim is to express an idea using data, in order to persuade an audience.
The key to doing it well is in the selection of the right type of charts; the simple and effective design of those charts (as covered extensively by Tufte); and good storytelling to hold it all together.
Advances in technology have made it easier for people to produce charts, but have had little other impact.
The second type of data visualisation relies on the power of computing to collect and analyse data in real time. Rather than charts being static representations, these ‘dashboards’ update live as new data arrives, allowing users to actively monitor the situation. While real-time updates are an innovation, the actual charts being used are often very traditional: just line charts, histograms, bar charts and pie charts.
Real-time data visualisation is hugely valuable in a range of situations such as medical monitoring; call centre activity tracking; server monitoring; and power station control rooms. In these cases, people can use the charts to rapidly spot when things are going wrong and take corrective action.
However, in many cases, real-time visualisation is a pointless distraction. This is because, even if new data arrives regularly, the charts being displayed only change very slowly; because taking any corrective action will take significant time; and because the results of any actions will only be visible in the charts days or weeks later. This applies to almost all situations involving business operating data.
Think of a dashboard showing quarterly sales for the last six quarters. It might be possible to update it in real time as every new sale is made, but in practice each new sale can only affect the last bar, and the size of that bar will only change minutely from one day to the next. If sales this quarter are poor, you might need to take action, but this will take days or weeks to execute and the result will also take days or weeks to show up in the charts.
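To put a rough number on "minutely", here is a small Python sketch. The figures are invented purely for illustration (a 90-day quarter with steady sales of 10,000 a day), and the function name is hypothetical. It measures each day's growth in the current-quarter bar as a share of a typical full-quarter bar, which is roughly what the eye compares it against on the chart.

```python
from itertools import accumulate

# Hypothetical figures, chosen only for illustration.
DAYS_IN_QUARTER = 90
DAILY_SALES = [10_000.0] * DAYS_IN_QUARTER   # assumed steady sales per day
TYPICAL_QUARTER_TOTAL = 900_000.0            # assumed height of earlier bars


def daily_bar_changes(daily_sales, typical_quarter_total):
    """Return each day's growth in the current-quarter bar,
    expressed as a fraction of a typical full-quarter bar."""
    running_total = list(accumulate(daily_sales))
    previous = [0.0] + running_total[:-1]
    return [
        (now - before) / typical_quarter_total
        for now, before in zip(running_total, previous)
    ]


changes = daily_bar_changes(DAILY_SALES, TYPICAL_QUARTER_TOTAL)

if __name__ == "__main__":
    # Largest visible movement of the last bar on any single day.
    print(f"largest daily change: {max(changes):.1%}")  # prints "largest daily change: 1.1%"
```

Under these assumptions the last bar moves by about one percent of its eventual height per day, which is hard to see on a chart and rarely enough to justify watching it live.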
In many cases, rather than offering users a real-time dashboard, it would be more useful for the data to be properly analysed on a weekly or monthly basis and the real trends communicated to stakeholders in a more traditional way.
Data Visualisation as Analysis
Where new technology has really had an impact is in a third category – the use of data visualisation as an analytical tool in its own right. It is this area that is really driving the popularity of the term.
In the past, analysis was almost exclusively done by taking data and running mathematical operations using Excel or a statistical package. The analysis to be done was defined in advance and the results were turned into a visual format only at the end of the process. In other words, analysis was used to confirm or disprove an existing hypothesis; and visualisation was used only to communicate the outcome.
The advent of powerful visualisation software like Tableau has changed this. Instead of running analyses and then plotting simple charts, it is possible to visualise large volumes of data in complex multi-dimensional charts and look for patterns that emerge. In other words, the process of visualisation itself becomes an analytical tool: exploring the data, finding patterns and then exploring them further in an iterative process.
The reason that this works is that humans have a fantastic ability to spot patterns visually. We can pick out clumps, lines and other shapes that are formed by data when they are presented visually, even though such patterns would be completely inaccessible when viewing the raw numbers.
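A toy illustration of this point, sketched in Python using only the standard library. The data set is hypothetical: 60 points placed evenly around a circle. A common summary number, the Pearson correlation, comes out at essentially zero and suggests there is no relationship between the two variables, yet even a crude text plot makes the ring obvious. The `ascii_scatter` helper is only a stand-in for a real charting tool.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient, computed from first principles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def ascii_scatter(xs, ys, width=41, height=21):
    """Render points as a crude text scatter plot."""
    min_x, max_x = min(xs), max(xs)
    min_y, max_y = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        col = round((x - min_x) / (max_x - min_x) * (width - 1))
        row = round((max_y - y) / (max_y - min_y) * (height - 1))
        grid[row][col] = "*"
    return "\n".join("".join(line) for line in grid)


# Hypothetical data: 60 points evenly spaced around a circle.
angles = [2 * math.pi * i / 60 for i in range(60)]
ring_x = [math.cos(a) for a in angles]
ring_y = [math.sin(a) for a in angles]

if __name__ == "__main__":
    # The summary statistic is essentially zero...
    print(f"correlation: {pearson(ring_x, ring_y):.3f}")
    # ...but the plot shows a clear ring.
    print(ascii_scatter(ring_x, ring_y))
```

The same 120 numbers printed as a table would give little hint of the shape, which is the sense in which the pattern is inaccessible in the raw data.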
So what is data visualisation? At one level it is just the making of charts to show data. However, it is the use of visualisation as an analytical tool that is really transformational, and this is rather lost when we use a single term to refer to all types of chart making.