Data visualisation: are you representing your data in the right way?
Updated: Oct 27, 2021
In recent years, data analysis has become increasingly important for companies, and data visualisation has become central for anyone who wants to adopt a data-driven approach.
In this article we will focus on one of the many aspects of data visualisation: colours.
The correct use of colours is probably one of the most important and most often underestimated aspects of data visualization: as a matter of fact, a strategic use of colours can make the difference between a good and a bad data visualization.
In data visualization, the use of colour should not be understood as a mere aesthetic choice, but rather as a tool for conveying quantitative information.
The colour selection should always be made with a specific communicative goal in mind, such as attracting the reader's attention, emphasizing a point, or differentiating between two or more categories.
For this reason, although it is tempting to make graphs with great visual appeal, in some cases it may be more appropriate to stick to types of graphs that, although basic, fit our cognitive system better.
In fact, one of the main goals of data visualization is to convey information quickly and effectively as well as to make it easier to remember.
Why is the proper use of colour so important in data visualization?
The right colour helps your readers understand the information you want to convey and draws their attention to important details. Conversely, the wrong colour diverts attention from key points and makes the information unclear.
Figure 1 can be used as an example: imagine that you were asked to count how many times the number 3 appears in the first table, and then to repeat the same exercise using the second one.
Figure 1. Let’s count how many 3s!
In the second table, the red colour and boldface allow you to spot the number 3 very quickly, even before you start reading the first row of numbers.
Sequential, divergent, and categorical colours in a nutshell
Before discussing how to use colours to communicate data, it is important to distinguish between sequential, divergent, and categorical colours.
Sequential colour schemes use a single colour whose intensity changes from a very light to a darker shade.
Divergent colour schemes combine two sequential colours that fade toward a neutral one, which often represents the centre; typically, these colours are used to show data that deviate from an average quantity.
Categorical colour schemes are sets of distinct colours that are not characterized by any defined order.
Figure 2: examples of divergent, sequential, and categorical colours.
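The three kinds of scheme can also be sketched in a few lines of code. The following is a minimal illustration in plain Python, not a replacement for the palettes shipped with charting tools: the helper names (`blend`, `sequential`, `diverging`) and the hex values are assumptions chosen for the example, and colours are blended by simple linear interpolation in RGB.

```python
def hex_to_rgb(colour):
    """Turn '#rrggbb' into an (r, g, b) tuple of integers."""
    h = colour.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def rgb_to_hex(rgb):
    """Turn an (r, g, b) tuple of integers into '#rrggbb'."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def blend(start, end, t):
    """Linear mix of two hex colours: t=0 gives start, t=1 gives end."""
    a, b = hex_to_rgb(start), hex_to_rgb(end)
    return rgb_to_hex(tuple(round(x + (y - x) * t) for x, y in zip(a, b)))


def sequential(light, dark, n):
    """n shades of one colour, from light to dark."""
    if n < 2:
        raise ValueError("a sequential scheme needs at least two steps")
    return [blend(light, dark, i / (n - 1)) for i in range(n)]


def diverging(low, neutral, high, n):
    """Two sequential ramps that meet at a neutral midpoint (n must be odd)."""
    if n < 3 or n % 2 == 0:
        raise ValueError("use an odd n >= 3 so the neutral colour is centred")
    half = n // 2
    left = sequential(low, neutral, half + 1)
    right = sequential(neutral, high, half + 1)
    return left + right[1:]  # drop the duplicated neutral colour


# Categorical: distinct colours with no implied order (illustrative values).
CATEGORICAL = ["#7b3294", "#2c7bb6", "#1a9641"]

print(sequential("#ffffff", "#ff0000", 3))
print(diverging("#0000ff", "#ffffff", "#ff0000", 5))
```

Note that only the sequential and divergent schemes are computed; a categorical scheme is simply a hand-picked list, because its colours carry no order.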
Use sequential or divergent colours to show correlated data
The choice of sequential colours is particularly effective when you want to present correlated data. In this case, colours are assigned to the data along a continuum based on colour intensity.
For example, you can use a single colour, varying its lightness to indicate a range of values. In this case, the intensity of the colour varies with the magnitude of the value: the larger the value, the more intense the colour.
The map shown in Figure 3, for example, uses differences in red shading within predefined areas to indicate the level of risk in those areas.
Figure 3. Level of risk of losing a home by state in the United States.
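As a rough sketch of how a value can be turned into a shade, the function below maps a number onto a light-to-dark red ramp. The function name `shade_for`, the two end colours, and the risk scores are all assumptions made up for the illustration; they are not taken from the map in Figure 3.

```python
def shade_for(value, vmin, vmax, light=(255, 245, 240), dark=(165, 15, 21)):
    """Map a value to a hex shade: vmin gives the light end, vmax the dark end."""
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    # Position of the value on the scale, clamped to the range 0..1.
    t = min(1.0, max(0.0, (value - vmin) / (vmax - vmin)))
    rgb = tuple(round(l + (d - l) * t) for l, d in zip(light, dark))
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Hypothetical risk scores per area (made-up numbers).
risk = {"Area A": 0.0, "Area B": 2.5, "Area C": 10.0}
colours = {area: shade_for(score, 0, 10) for area, score in risk.items()}
print(colours)
```

Values outside the range are clamped to the nearest end of the ramp, so an unexpected outlier does not produce an invalid colour.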
Sometimes, it can be useful to use divergent colour schemes as an additional differentiator. For example, it is particularly effective to contrast warm colours such as yellow or red with cool colours such as blue or green at either end (Figure 4).
Figure 4. The purchasing power of $100 per county in the US.
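A divergent scheme can be sketched in the same spirit: values below a chosen midpoint fade toward a cool colour and values above it toward a warm one. The function name `diverging_colour`, the blue, grey, and red end colours, and the sample numbers are assumptions for the illustration only.

```python
def diverging_colour(value, vmin, mid, vmax,
                     cool=(33, 102, 172),
                     neutral=(247, 247, 247),
                     warm=(178, 24, 43)):
    """Map a value to a hex colour on a cool-neutral-warm scale centred on mid."""
    if not vmin < mid < vmax:
        raise ValueError("expected vmin < mid < vmax")
    if value <= mid:
        # Distance below the midpoint, as a fraction of the lower half.
        t, end = (mid - value) / (mid - vmin), cool
    else:
        # Distance above the midpoint, as a fraction of the upper half.
        t, end = (value - mid) / (vmax - mid), warm
    t = min(1.0, t)  # clamp values that fall outside the scale
    rgb = tuple(round(n + (e - n) * t) for n, e in zip(neutral, end))
    return "#{:02x}{:02x}{:02x}".format(*rgb)


# Made-up values centred on 100.
for v in (80, 100, 120):
    print(v, diverging_colour(v, 80, 100, 120))
```

Each half of the scale is normalised separately, so the neutral colour always sits on the midpoint even when the range is not symmetric around it.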
Use contrasting colours to compare categorical variables
Conversely, when you want to compare or contrast two unrelated metrics, choosing contrasting colours makes the opposition between the two easier to see.
Contrasting colours are usually applied to categorical variables, i.e. variables that take on distinct labels with no intrinsic order. In this case, each category is assigned a distinct colour.
In Figure 5, for instance, purple, blue, and green were used to distinguish the different generations from each other.
Figure 5. Household debt by generation in the USA.
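Assigning one distinct colour per category can be sketched as a simple lookup. The helper `assign_colours`, the palette values, and the generation labels below are assumptions for the example, not the exact colours or labels used in Figure 5.

```python
# Illustrative purple, blue, and green; any set of clearly distinct colours works.
PALETTE = ["#7b3294", "#2c7bb6", "#1a9641"]


def assign_colours(categories, palette=PALETTE):
    """Give every distinct category its own colour, in order of first appearance."""
    unique = list(dict.fromkeys(categories))  # de-duplicate, keep order
    if len(unique) > len(palette):
        raise ValueError("more categories than distinct colours in the palette")
    return dict(zip(unique, palette))


# Hypothetical category labels, one per data row.
rows = ["Millennials", "Generation X", "Baby Boomers", "Generation X"]
print(assign_colours(rows))
```

Raising an error when the categories outnumber the palette is a deliberate choice in this sketch: reusing a colour for two categories would defeat the purpose of a categorical scheme.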
Sometimes colours can be associated with other visual elements, as in Figure 6, where the size of the bubble indicates the size of the data, while red and pink were used to distinguish between countries and companies.
Figure 6. Comparison of market capitalization of some countries versus some tech companies.
Finally, it is possible to exploit common semantic associations between certain concepts and colours, such as between red and danger, yellow and gold, or green and money.
In Figure 7, for example, blue and red were effectively used to differentiate Democrats and Republicans.
Figure 7. Comparison of Democratic and Republican party costs for presidential elections between 2000 and 2020
Use colours to highlight important information
Colours can also be used to better highlight one piece of data over another. Usually, brighter colours are used to make important data stand out, whether it is a line or dot in a graph or a word in text.
For example, in Figure 8, starting with an orange base colour, yellow was used to show the country with the highest value (the Philippines), while white was used to show the global average.
Figure 8. Average hours spent per day on the Internet in different countries.
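The same idea, one base colour plus a brighter colour for the value that should stand out, can be sketched as follows. The function name `highlight_top`, the hex values, and the numbers are assumptions invented for the example and are not the data behind Figure 8.

```python
def highlight_top(values, base="#f28e2b", highlight="#ffd92f"):
    """Colour every label with the base colour, except the one with the highest value."""
    top = max(values, key=values.get)
    return {label: (highlight if label == top else base) for label in values}


# Made-up hours per day for three hypothetical countries.
hours = {"Country A": 6.5, "Country B": 9.0, "Country C": 4.0}
colours = highlight_top(hours)
average = sum(hours.values()) / len(hours)  # could be drawn as a reference line

print(colours)
print(average)
```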
Use colour to highlight atypical data
In data visualization, colours can help quickly identify not only major trends but also anomalies and atypical data, as in Figure 9, where red was used to make negative values stand out.
Figure 9. Performance of the top 10 companies ranked by GDX Index.
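Flagging atypical values follows the same pattern. In this minimal sketch the rule for "atypical" is simply a negative value, as in Figure 9; the function name `colour_by_sign`, the colours, and the sample figures are assumptions for illustration.

```python
def colour_by_sign(values, negative="#d62728", default="#7f7f7f"):
    """Return one colour per value: red for negatives, a neutral grey otherwise."""
    return [negative if v < 0 else default for v in values]


# Made-up performance figures, in percent.
performance = [12.4, -3.1, 0.0, 7.8, -0.5]
print(colour_by_sign(performance))
```

Keeping the non-highlighted values in a muted colour is what lets the red ones stand out.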
Some best practices for good data visualization
We conclude this article by summarizing some rules worth following:
Do not use too many colours at once. Generally, it is recommended not to include more than six colours per dashboard or graphic, so that the information stays easy to understand;
Choose colours that are easily distinguishable from one another, and choose a background appropriate to those colours;
Keep colours consistent between charts: if a colour changes its meaning from one chart to the next, readers will find the charts harder to understand.
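These rules can be turned into simple automated checks. The sketch below is one possible approach under stated assumptions: the function names are made up for the example, the limit of six comes from the rule above, and the contrast check uses the WCAG contrast-ratio formula as a stand-in measure of how well a colour stands out from its background, which the article itself does not prescribe.

```python
def too_many_colours(colours, limit=6):
    """True if a chart uses more distinct colours than the recommended limit."""
    return len(set(colours)) > limit


def _luminance(colour):
    """Relative luminance of a '#rrggbb' colour (WCAG definition)."""
    h = colour.lstrip("#")
    channels = [int(h[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    linear = [c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
              for c in channels]
    return 0.2126 * linear[0] + 0.7152 * linear[1] + 0.0722 * linear[2]


def contrast_ratio(colour, background):
    """WCAG contrast ratio between two colours, from 1 (identical) to 21."""
    a, b = _luminance(colour), _luminance(background)
    return (max(a, b) + 0.05) / (min(a, b) + 0.05)


def colours_with_changed_meaning(*charts):
    """Colours that stand for different labels in different charts.

    Each chart is a dict mapping a label to its hex colour.
    """
    meaning, changed = {}, set()
    for chart in charts:
        for label, colour in chart.items():
            if meaning.setdefault(colour, label) != label:
                changed.add(colour)
    return changed


# Hypothetical charts: red means "Costs" in one and "Revenue" in the other.
chart_1 = {"Costs": "#d62728", "Revenue": "#1f77b4"}
chart_2 = {"Revenue": "#d62728"}
print(colours_with_changed_meaning(chart_1, chart_2))
print(round(contrast_ratio("#ffff00", "#ffffff"), 2))
```

A low contrast ratio, such as yellow on a white background, suggests that the colour or the background should be changed; the threshold to apply is a judgment call and is left out of the sketch.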