DATA VISUALISATION TIPS: High Data-Ink Ratio, Chart Junk and how to best display your data

Parvina Pasilova

--

Being able to deliver a message in a clear, concise and simple form is an art in itself. As data scientists, we create insightful stories through data. These data stories usually have an audience that wants to comprehend them and make sense of them. We help them do this with visual encodings.

Visual encodings help us to display our data to our audience. An encoding is a mapping from our data to display elements. Display elements include position on the x-axis or y-axis, the size of points or bars, the shape of points, texture, angle and length. So the question is: how do we take our data and best display it to our audience using these elements?

One solution for that problem is to create data visualizations that humans understand best.

Researchers have found that humans are able to best understand data encoded with positional changes (differences in x- and y-position, as we see with scatterplots) and length changes (differences in box heights, as we see with bar charts and histograms).

Scatterplot example

In contrast, humans struggle to understand data encoded with color hue changes (which are, unfortunately, commonly used to encode an additional variable in scatterplots) and area changes (as we see in pie charts, which often makes them a poor plot choice).

Example of a bad data visualisation: a pie chart

So the conclusion is that humans easily comprehend data displayed through scatterplots, bar charts and histograms, but struggle with data encoded in pie charts and color hue changes.
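To make the contrast concrete, here is a minimal matplotlib sketch that draws the same values once as a pie chart and once as a bar chart. The categories and values are hypothetical, invented only for illustration, and the `Agg` backend is an assumption so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
import matplotlib.pyplot as plt

# Hypothetical data, invented for illustration only
categories = ["A", "B", "C", "D"]
values = [23, 21, 19, 17]

fig, (ax_pie, ax_bar) = plt.subplots(1, 2, figsize=(8, 3))

# Angle/area encoding: the slices are close in size and hard to rank by eye
ax_pie.pie(values, labels=categories)
ax_pie.set_title("Pie chart")

# Length encoding: bars share a baseline, so small differences stay visible.
# A single colour is used because hue carries no value here.
bars = ax_bar.bar(categories, values, color="tab:blue")
ax_bar.set_title("Bar chart")
ax_bar.set_ylabel("Value")

fig.tight_layout()
```

Calling `fig.savefig("pie_vs_bar.png")` afterwards writes both panels to a file (the file name is just an example). In the bar panel the ordering of the four categories should be readable at a glance, while in the pie panel it has to be inferred from the labels.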

Data Visualization tips

Before we dive into how to create better data visualisations, let's take a look at some examples of BAD data visualisations.

Below is an example of a messy, overcomplicated data visualization that distracts the audience from the message it is meant to convey. Its problems include an unclear title, disregard for color theory, messy labels and inconsistent data representation. This graph makes it really hard to understand what message or story the data is trying to tell.

And now let's take a look at an example of a good data visualization:

We can observe that this graph is very simple and easy to understand. It follows the color theory (“that says we should not use more than one or two colours in our visualisations unless it’s encoded some kind of value.”), omits patterns of any kind, and has a simple title that tells the audience what they are going to learn from the data. It has simple labels and a HIGH DATA-INK RATIO. The data-ink ratio principle says that the more of the ink in your visual that is related to conveying the message in the data, the better.

The same data that we saw earlier can be displayed in a much simpler way that helps our audience understand our story and does not distract from the main findings.

Good and Bad plots

Let's take a look at another pair of bad and good visuals. The two plots below show exactly the same data about flights of the USA’s Space Shuttle program: whether or not a mechanical failure of O-Ring components occurred, and the temperature at the time of flight.

And here is a small quiz for you: which visual best represents the given data, the first plot or the second?

Chart Junk

As we saw in the example above, “less is more”, or “the simpler, the better”. We briefly discussed how to display our data better and what should be included to do so. It is just as important to note what we should leave out of our visuals. Whatever should be left out is referred to as chart junk.

According to Wikipedia, chart junk refers to all visual elements in charts and graphs that are not necessary to comprehend the information represented on the graph or that distract the viewer from this information.

Examples of chart junk:

  1. Heavy grid lines
  2. Unnecessary text
  3. Pictures surrounding the visual
  4. Shading or 3D components
  5. Ornamented chart axes
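As a rough sketch of how some of the items above could be stripped from a matplotlib chart, the helper below turns off grid lines, hides the top and right axis lines and removes tick marks. The function name `strip_chart_junk` and the plotted values are hypothetical, chosen for illustration; it covers only a few of the listed items and is one possible approach, not a complete recipe.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
import matplotlib.pyplot as plt


def strip_chart_junk(ax):
    """Remove a few common chart-junk elements from a matplotlib Axes."""
    ax.grid(False)                       # no grid lines
    for side in ("top", "right"):        # drop the box around the plot area
        ax.spines[side].set_visible(False)
    ax.set_facecolor("white")            # no shaded background
    ax.tick_params(length=0)             # no tick marks, labels stay
    return ax


# Hypothetical values, invented for illustration only
fig, ax = plt.subplots()
ax.bar(["a", "b", "c", "d", "e"], [1, 2, 4, 8, 16], color="tab:blue")
ax.grid(True, linewidth=2)  # deliberately add heavy grid lines first
strip_chart_junk(ax)        # then remove them again
```

Pictures, 3D effects and unnecessary text are easier to avoid than to remove: the simplest fix is not to add them in the first place.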

Now let's take a look at an example of chart junk:

(Source: Wikipedia)

An example of a chart containing gratuitous chartjunk. This chart uses a large area and much “ink” (many symbols and lines) to show only five hard-to-read numbers, 1, 2, 4, 8, and 16.

It is really painful to look at this visual, and it almost hurts my eyes every time I look at it. Please never, ever make this type of data visual.

The data-ink ratio, credited to Edward Tufte, is directly related to the idea of chart junk. The more of the ink in your visual that is related to conveying the message in the data, the better.

Limiting chart junk increases the data-ink ratio.
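Tufte defines the data-ink ratio as the ink used to present data divided by the total ink used in the graphic. The sketch below applies that definition to a made-up "ink budget". The element names, the ink amounts (in arbitrary units) and the choice of which elements count as data ink are all assumptions for illustration, since real ink cannot be measured this neatly.

```python
def data_ink_ratio(data_ink, total_ink):
    """Data-ink ratio: ink that presents data / total ink in the graphic."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must lie between 0 and total_ink")
    return data_ink / total_ink


# Hypothetical ink budget for one chart, in arbitrary units
ink = {"bars": 150, "heavy grid lines": 60, "3D shading": 90}
data_elements = {"bars"}  # assumption: only the bars carry data


def ratio_for(ink_by_element):
    """Sum the ink per element and compute the ratio for the whole chart."""
    total = sum(ink_by_element.values())
    data = sum(v for k, v in ink_by_element.items() if k in data_elements)
    return data_ink_ratio(data, total)


before = ratio_for(ink)

# Remove the chart junk and measure again
junk = {"heavy grid lines", "3D shading"}
cleaned = {k: v for k, v in ink.items() if k not in junk}
after = ratio_for(cleaned)

print(before, after)  # 0.5 1.0
```

With these made-up numbers, removing the grid lines and the shading raises the ratio from 0.5 to 1.0, which is the point of the sentence above.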

Summary

Overall, this short blog post gave some insights into how we can best display our data and avoid some of the common mistakes made when creating data visuals. I hope it will help you to create better visuals and deliver your stories in a simple and clear manner.
