Colour Blindness & Data Visualisation

In my administration background, I've dabbled and experimented with formatting documents. I can happily sit at my desk for hours making Word or Excel documents look concise and readable, and I always have so much fun making things "pretty" and appealing to the eye. Whether it was colour schemes and gradation, fonts or even paragraph spacing, it all came naturally to me.

When I started my Data Analytics course, I knew for sure that the visualisation aspect of it would pique my interest, and boy was I right. Learning Matplotlib has definitely been one of my favourite topics so far. The way Python can produce important graphs and figures seemingly out of thin air blew me away. Not to mention how powerful it is when combined with Pandas!

But with its power comes great responsibility.

There are a lot of best practices and common mistakes in Matplotlib that are extremely important to understand. Essentially, if its customisable options are abused or not implemented correctly, they can not only skew the data, but also make things unreadable to your audience and, even worse, your employer! One of these many important choices is colour.

A common example of a Data Visualisation "Don't" is giving different values very similar (or identical) colours.

As you can see, colour is one of the main ways to help the audience understand the difference between values within a figure. If the colours are too close to each other, the data we are visualising becomes clouded and confusing. 

Setting the 'color' of x and y to colours that contrast more strongly can make a huge difference to readability:
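Here's a rough sketch of that comparison in code. The data and the specific colour names are made up for illustration (they are not the ones from my original figure), and I've used a line plot, but the same 'color' argument works for other plot types too.

```python
import matplotlib.pyplot as plt

# Made-up data, purely for illustration.
months = [1, 2, 3, 4, 5, 6]
x = [3, 5, 4, 6, 8, 7]
y = [2, 4, 5, 5, 6, 9]

fig, (ax_bad, ax_good) = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

# The "Don't": two shades of blue that are hard to tell apart.
ax_bad.plot(months, x, color="royalblue", label="x")
ax_bad.plot(months, y, color="cornflowerblue", label="y")
ax_bad.set_title("Similar colours")
ax_bad.legend()

# The fix: two clearly different hues.
ax_good.plot(months, x, color="tab:blue", label="x")
ax_good.plot(months, y, color="tab:orange", label="y")
ax_good.set_title("Distinct colours")
ax_good.legend()
```

Call plt.show() at the end of a script (or let your notebook render the figure) to see the two panels side by side.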


Now this figure is more accessible, right? Well, not quite. Let's have a look at what this same graph looks like to someone who is colour blind:

Looks like we're back at square one. 
(FYI, this is a type of colour blindness called Blue-Blind/Tritanopia). 

A quick Google search shows that there are many colour combinations that do not work well for someone who is colour blind:


It's a lot. More than I expected, that's for sure!

Now, through a process of elimination and some experimentation with a colour scheme generator (which I have linked below), I was able to construct a palette that I found appealing to the eye. Not only that, but it's also suitable for almost all types of colour blindness: everything except complete monochromacy!
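If you want to apply a hand-picked palette like this in Matplotlib, one way is to set the colour cycle on your axes. This is only a minimal sketch: the hex codes are the Okabe-Ito palette, a well-known colour-blind-friendly set that I'm using purely as a stand-in. They are not the colours from my own palette, and the plotted data is made up.

```python
import matplotlib.pyplot as plt

# Okabe-Ito palette, used here as a stand-in (an assumption for illustration,
# NOT the custom palette described in this post).
okabe_ito = [
    "#E69F00", "#56B4E9", "#009E73", "#F0E442",
    "#0072B2", "#D55E00", "#CC79A7", "#000000",
]

fig, ax = plt.subplots()

# Each new line on this Axes takes the next colour from the list.
ax.set_prop_cycle(color=okabe_ito)

# Made-up data: three series, each drawn in the next palette colour.
for i in range(3):
    ax.plot([0, 1, 2, 3], [i, i + 1, i + 0.5, i + 2], label=f"series {i + 1}")

ax.legend()
```

Swapping in your own palette is just a matter of replacing the list of hex codes.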


Here it is under various colour blind conditions, in the order of Blue-Blind/Tritanopia, Red-Weak/Protanomaly and Green-Blind/Deuteranopia:

And finally, here's that graph we used before, with our new colour-blind-friendly scheme applied:


Nice! Now we know that it is clear which variable is which, no matter who our audience is. 


Now, there's no denying that finding a scheme this way every time you make a figure with Matplotlib is far too time-consuming.

Thankfully, we have two built-in options we can apply to get around this:
  1. Matplotlib has a large array of built-in style sheets to apply to our figures, including some designed specifically to be readable by people who are colour blind. One of these is 'tableau-colorblind10', as seen below:


  2. We can also change the marker of each variable, to make things even more distinct from each other without needing to change the colour. There are heaps of different marker types to use.
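Both options can be sketched together in a few lines. The data here is made up, and the particular markers ("o" and "s") and the dashed line are just my picks for illustration. I've used plt.style.context so the style sheet only applies inside the with-block; plt.style.use('tableau-colorblind10') would apply it to everything that follows instead.

```python
import matplotlib.pyplot as plt

# Made-up data, purely for illustration.
months = [1, 2, 3, 4, 5, 6]
x = [3, 5, 4, 6, 8, 7]
y = [2, 4, 5, 5, 6, 9]

# Option 1: the built-in colour-blind-friendly style sheet.
with plt.style.context("tableau-colorblind10"):
    fig, ax = plt.subplots()

    # Option 2: a different marker (and line style) per variable, so the
    # series stay distinguishable even without relying on colour.
    ax.plot(months, x, marker="o", label="x")
    ax.plot(months, y, marker="s", linestyle="--", label="y")

    ax.legend()
```

You can list every style sheet your installation ships with by printing plt.style.available.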

Regardless of which route you pick, all of this experimentation showcases the importance of making our Data Visualisation not only readable, but accessible as well.





Below I have linked a few websites that helped me understand the topic we just covered. I hope they prove useful to you as well:
