What Averages Obscure

[Image: 60-Second Data Tip #8]

Nonprofits (and everyone else) are addicted to averages. We like to talk about how participants do on average. We might describe how many visitors we have in an average week. But how much are we missing when we focus solely on averages? Short answer: it depends, but it could be a lot. If I only showed you the average-sized guy in the picture, would you appreciate the full range of sizes?

To figure out what and how much we are missing, we need to calculate—or better yet show—how spread out our data points are. Understanding the spread gives us an idea of how well the average or the median represents the data. When the spread of values in the data set is large, the average obscures the real picture more than when the spread is small.

Spread measures include range, quartiles, absolute deviation, variance and standard deviation. For more on these measures, check this out. 
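To make these measures concrete, here is a minimal Python sketch that computes each of them with the standard library. The function name `spread_summary` and the two lists of weekly visitor counts are invented for illustration, and the variance and standard deviation are the population versions, which is an assumption about how you want to treat your data.

```python
import statistics

def spread_summary(values):
    """Return the mean and median plus several common measures of spread."""
    mean = statistics.mean(values)
    # Quartiles computed with the "inclusive" method (data treated as the full population).
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": mean,
        "median": median,
        "range": max(values) - min(values),
        "iqr": q3 - q1,  # distance between the first and third quartiles
        "mean_abs_deviation": statistics.mean([abs(v - mean) for v in values]),
        "variance": statistics.pvariance(values),  # population variance
        "std_dev": statistics.pstdev(values),      # population standard deviation
    }

# Invented weekly visitor counts: same average, very different spread.
steady_weeks = [48, 49, 50, 51, 52]
uneven_weeks = [10, 30, 50, 70, 90]

for name, weeks in [("steady", steady_weeks), ("uneven", uneven_weeks)]:
    print(name, spread_summary(weeks))
```

Both lists average 50 visitors a week, but the range is 4 in the first and 80 in the second, so the average describes a typical week far better for the first list.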

A great way to quickly grasp the spread of your data is to make a box plot. A box plot (a.k.a. box-and-whisker diagram) shows the distribution of data, including the minimum, first quartile, median, third quartile, and maximum. The box plots below show the affordability of neighborhoods in five cities. Each red circle represents a zip code area. The gray boxes show where the middle 50 percent of the zip code areas fall on the affordability scale. And the median is where the dark gray meets the light gray. You can see that, in general (i.e., according to the median), New York is more affordable than Los Angeles. However, New York has some zip code areas that are much less affordable than the median seems to suggest.
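The five values a box plot draws can also be computed directly. The sketch below is one way to do it in Python. The helper name `five_number_summary` and the affordability scores are made up for illustration (they are not the data behind the chart), and the "inclusive" quartile method is only one of several conventions, so other tools may give slightly different quartiles.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))

# Invented affordability scores for the zip code areas of one hypothetical city.
scores = [20, 35, 40, 45, 50, 55, 90]

minimum, q1, median, q3, maximum = five_number_summary(scores)
print("box spans", q1, "to", q3, "with the median at", median)
print("full range runs from", minimum, "to", maximum)
```

In this made-up example the box would cover only 37.5 to 52.5, while one area sits far away at 90, which is the kind of detail a single average or median hides.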

So when looking at your data, don’t just look at averages; also consider the spread.

[Figure: box plots showing the affordability of zip code areas in five cities]

See other data tips in this series for more information on how to effectively visualize and make good use of your organization's data.

Image created by Moxilla for Noun Project.