Mosaic plots
a part-to-whole visualization

We frequently run into questions that are best answered by visualizations of part-to-whole relationships: total revenue by country or product, total incidents by turbine, genome read variants by chromosome, &c.

The best-known graphic depicting part-to-whole relationships is the pie chart. Unfortunately, the human perceptual system cannot accurately decipher pie charts, so we cannot recommend them to our clients. (For an excellent primer on why to avoid pie charts, see Stephen Few’s article.)

As an alternative to pies, we typically employ a trusty bar chart:

[Figure: bar chart titled “Proportion of total revenue by country”, with a percentage axis running from 0% to 60%]

This chart emphasizes that it’s a part-to-whole comparison via the title (“Proportion of…”) and by the use of percentages on the axis rather than absolute values (e.g., dollars).

However, bar charts do not immediately signal “part-to-whole comparisons ahead!” as loudly as pie charts do — in a quick glance at the chart above, one could easily fail to realize that the percentages sum to 100% and assume that the graph depicts some other proportion (e.g., percentage of citizens who had a beer at lunch today).

This article introduces mosaic plots, a visualization that illustrates part-to-whole relationships using area. While the visual comparisons within mosaic plots (area to area) are not as robust as those within bar charts (length along a common baseline), mosaic plots are useful in situations where part-to-whole relationships should be emphasized or where physical space is limited.

I originally came across formal discussion of mosaic plots in 2011 when my friend Hadley Wickham published a paper describing “product plots”, a framework that encompasses mosaic plots, bar charts, treemaps, and other area-based visualizations.

We’ll first cover the (light) statistical theory behind the operation of mosaic plots, and then cover the particularly straightforward implementation of attractive, responsive mosaic plots on modern browsers via CSS flexbox.

Statistical distributions 101

Mosaic plots excel at depicting part-to-whole relationships. From a whole dataset, there tend to be many dimensions across which to split the data into parts.

Consider surveying your closest 10,000 friends: What’s the highest education level they’ve received? Have they been married? Are they in good health? Are they happy? (This article’s data is the same data Hadley used in his original paper: the General Social Survey, as provided by Hadley on GitHub.)

We can break the whole set of responses into parts, starting with a simple question: What proportion of people are happy?

response         count
not too happy    5,629
pretty happy     25,874
very happy       14,800

Absolute counts aren’t very illuminating here (beyond the fact that the GSS is a big survey), so let’s switch to proportions:

response         proportion
not too happy    0.12
pretty happy     0.56
very happy       0.32

This is the marginal distribution of happiness: f(happy). I’m happy to notice that the majority of people are either “pretty happy” or “very happy”!
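As a minimal sketch of that conversion, the Python below divides each count from the table by the grand total. The variable names are my own and are only for illustration.

```python
# Response counts, copied from the table of absolute counts above.
counts = {
    "not too happy": 5629,
    "pretty happy": 25874,
    "very happy": 14800,
}

total = sum(counts.values())

# Marginal distribution f(happy): each count as a share of the grand total.
f_happy = {level: n / total for level, n in counts.items()}

for level, p in f_happy.items():
    print(f"{level:<17}{p:.2f}")
```

Rounded to two decimals, the results match the proportions in the table.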

Let’s look at these happiness data together with biological sex. In the joint distribution, f(happy, sex), each cell shows the proportion of the entire survey corresponding to those values:

                 male    female
not too happy    0.05    0.07
pretty happy     0.25    0.31
very happy       0.14    0.18

So, for example, 5% of respondents are not too happy males. (Note that all of the cells sum to 1.) This display allows you to make observations like: “There were twice as many very happy males as not too happy females”.

Summing the values along a row or column eliminates the variable on the other dimension. For example, adding all of the values on the first row essentially means, “I don’t care about sex; I just want to know what proportion of people are not too happy”. Doing this on all of the rows and columns yields the marginals:

                 male    female    f(happy)
not too happy                      0.12
pretty happy                       0.56
very happy                         0.32
f(sex)           0.44    0.56
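The row and column sums can be sketched in a few lines of Python. The per-cell counts below are the same numbers that appear as flex values in the nested markup later in this article; the variable names are illustrative only.

```python
from collections import defaultdict

# Respondent counts per (happiness, sex) cell.
cell_counts = {
    ("not too happy", "male"): 2424,
    ("not too happy", "female"): 3205,
    ("pretty happy", "male"): 11555,
    ("pretty happy", "female"): 14319,
    ("very happy", "male"): 6378,
    ("very happy", "female"): 8422,
}

total = sum(cell_counts.values())

# Joint distribution f(happy, sex): each cell as a share of the whole survey.
joint = {cell: n / total for cell, n in cell_counts.items()}

# Marginals: summing over one variable eliminates it.
f_happy = defaultdict(float)
f_sex = defaultdict(float)
for (happy, sex), p in joint.items():
    f_happy[happy] += p  # sum across a row: sex is eliminated
    f_sex[sex] += p      # sum down a column: happiness is eliminated

print({k: round(v, 2) for k, v in f_happy.items()})
print({k: round(v, 2) for k, v in f_sex.items()})
```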

The final kind of distribution we’ll see is the conditional. The conditional distribution f(happy | sex),

          not too happy    pretty happy    very happy
male      0.12             0.57            0.31
female    0.12             0.55            0.32

answers questions like: “What is the chance that I’m very happy, given that I’m female?” Note that each row sums to 1 (up to rounding).

We can also flip the variables and examine f(sex | happy),

                 male    female
not too happy    0.43    0.57
pretty happy     0.45    0.55
very happy       0.43    0.57

which answers questions like: “What is the chance that I’m female, given that I’m pretty happy?”
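Both conditionals come from the same cell counts; the only difference is which total each cell is divided by. Here is a rough Python sketch, using the per-cell counts that appear as flex values in the nested markup later in this article (the names are my own):

```python
# counts[happy][sex]: respondents in each cell.
counts = {
    "not too happy": {"male": 2424, "female": 3205},
    "pretty happy": {"male": 11555, "female": 14319},
    "very happy": {"male": 6378, "female": 8422},
}
sexes = ["male", "female"]

# f(sex | happy): divide each cell by the total for its happiness level.
f_sex_given_happy = {
    happy: {sex: n / sum(row.values()) for sex, n in row.items()}
    for happy, row in counts.items()
}

# f(happy | sex): divide each cell by the total for its sex.
sex_totals = {sex: sum(row[sex] for row in counts.values()) for sex in sexes}
f_happy_given_sex = {
    sex: {happy: row[sex] / sex_totals[sex] for happy, row in counts.items()}
    for sex in sexes
}

# "What is the chance that I'm female, given that I'm pretty happy?"
print(round(f_sex_given_happy["pretty happy"]["female"], 2))
```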

(This question may sound absurd; how is “chance” involved in an adult’s biological sex? A helpful way of thinking about chance is not as a property of the universe but rather as a property of your knowledge base. When we say that a coin has “50/50 chance”, what we really mean is that we don’t have the right measuring equipment — with a high-speed camera and computer on hand, we could reliably predict tosses of the same “50/50 chance” coin.)

These three types of distributions — joint, marginal, and conditional — support different kinds of questions about our data. Thus far, we’ve depicted each distribution as a plain ol’ table. Next we’ll see how to depict these distributions visually.

Mosaic plots

The key idea of mosaic plots is that we can map the proportions of a distribution to the areas of a graphic. For example, take the marginal distribution of happiness, f(happy):

not too happy    0.12
pretty happy     0.56
very happy       0.32

The same approach also works for conditional distributions like f(sex | happy):

                 male    female
not too happy    0.43    0.57
pretty happy     0.45    0.55
very happy       0.43    0.57

where the graphic has been evenly partitioned into three vertical spines, one for each level of the categorical variable “happiness”. Each vertical spine is then divided into two horizontal spines, corresponding to “male” and “female”.

(Note: horizontal and vertical refer to the direction in which the spines expand, not to their long axis, which depends on the aspect ratio of the mosaic plot.)

From these two mosaic plots we can visually illustrate the mathematical fact f(happy) × f(sex | happy) = f(sex, happy):

[Figure: mosaic plot of f(happy) × mosaic plot of f(sex | happy) = mosaic plot of f(sex, happy)]

Each one of the six disjoint segments of the rightmost mosaic plot has area proportional to the corresponding joint probability:

                 male    female
not too happy    0.05    0.07
pretty happy     0.25    0.31
very happy       0.14    0.18
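As a quick arithmetic check of the product rule, this sketch multiplies the rounded values from the f(happy) and f(sex | happy) tables. Because the inputs are already rounded, the products agree with the joint table only after rounding to two decimals.

```python
# Rounded values copied from the tables in this article.
f_happy = {"not too happy": 0.12, "pretty happy": 0.56, "very happy": 0.32}
f_sex_given_happy = {
    "not too happy": {"male": 0.43, "female": 0.57},
    "pretty happy": {"male": 0.45, "female": 0.55},
    "very happy": {"male": 0.43, "female": 0.57},
}

# Product rule: f(sex, happy) = f(happy) * f(sex | happy), cell by cell.
joint = {
    (happy, sex): f_happy[happy] * p
    for happy, by_sex in f_sex_given_happy.items()
    for sex, p in by_sex.items()
}

for (happy, sex), p in joint.items():
    print(f"{happy:<17}{sex:<8}{p:.2f}")
```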

On a mosaic plot the marginals can be quickly estimated by looking at a single row or color. In contrast, these same data would require six bars on a bar chart, and one would need to locate and mentally “stack” bars together to make the same comparisons.

Flexbox-based implementation

Mosaic plots are a disjoint partitioning of a rectangular area. Within a mosaic plot, each rectangular sub-area expands horizontally or vertically according to its weight relative to its siblings. This is the exact problem that CSS3’s flexbox specification solves. (Flexbox is currently a “last call working draft”, but it’s already supported by 88% of browsers.)

Consider the mosaic plot of the marginal distribution f(happy):

This rectangle has been split into three horizontal spines, the width of each corresponding to the relative weight of that level. This graphic was created with this HTML markup:

<div class="mosaic-plot spines">
  <div style="flex:  5629;" data-happy="not too happy"></div>
  <div style="flex: 25874;" data-happy="pretty happy"></div>
  <div style="flex: 14800;" data-happy="very happy"></div>
</div>

The inline style flex value is simply that level’s marginal (e.g., 5629 respondents reported being “not too happy”). The flexbox layout engine takes care of all the scaling for us.
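To make that scaling concrete, here is a simplified Python model of what the layout engine computes. It assumes each spine starts from a zero flex basis (which is what the single-number flex shorthand requests) and ignores borders, padding, and minimum content sizes. The helper name spine_sizes is hypothetical, not part of any library.

```python
def spine_sizes(flex_values, container_px):
    """Hypothetical helper: split container_px among spines in
    proportion to their flex values (simplified flexbox model)."""
    total = sum(flex_values)
    return [container_px * v / total for v in flex_values]


# The three happiness counts laid out across the 200px-wide container.
widths = spine_sizes([5629, 25874, 14800], 200)
print([round(w, 1) for w in widths])  # roughly 24.3, 111.8 and 63.9 pixels
```

Only the ratios between the flex values matter, which is why raw counts work as well as proportions.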

The required styling (in Sass’s indented syntax) is minimal:

.mosaic-plot
  display: flex
  height: 200px
  width: 200px

.spines > div
  display: flex
  position: relative
  align-items: stretch

Switching to vertical spines rather than horizontal ones:

requires a single change in the container class from spines to vspines and the corresponding styles:

.vspines
  flex-direction: column

.vspines > div
  display: flex
  position: relative
  align-items: stretch

Rather than using classes or inline styles to color each rectangle, I’m using data-happy attributes:

<div style="flex: 5629;" data-happy="not too happy"></div>

These have two advantages over class-based selectors:

  1. data attributes admit the exact level name from the original data — no need to convert spaces to underscores or dashes to make a valid class name
  2. data attributes enforce the disjoint semantics of our distribution: an element can have multiple data- attributes corresponding to multiple data dimensions, but there is no way to accidentally mark an element as corresponding to multiple levels of the same dimension.

Attribute selectors on these values assign the background colors:

[data-happy="not too happy"]
  background-color: rgb(255, 131, 73)

[data-happy="pretty happy"]
  background-color: rgb(253, 255, 185)

[data-happy="very happy"]
  background-color: rgb(102, 183, 73)

Because the spines themselves have the display: flex property, they can be nested. The joint distribution f(sex, happy):

consists of vertical spines nested within horizontal spines:

<div class="mosaic-plot spines">
  <div style="flex:5629;" class="vspines" data-happy="not too happy">
    <div style="flex:2424;" data-sex="male"></div>
    <div style="flex:3205;" data-sex="female"></div>
  </div>

  <div style="flex:25874;" class="vspines" data-happy="pretty happy">
    <div style="flex:11555;" data-sex="male"></div>
    <div style="flex:14319;" data-sex="female"></div>
  </div>

  <div style="flex:14800;" class="vspines" data-happy="very happy">
    <div style="flex:6378;" data-sex="male"></div>
    <div style="flex:8422;" data-sex="female"></div>
  </div>
</div>

(Though notice that we have to assign flex on both the inner vertical spines and the parent horizontal spines.)
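Writing this nested markup by hand gets tedious, so one option is to generate it from the cell counts. The sketch below is only one way to do it: the function nested_mosaic and its parameters are hypothetical names of mine. It derives each outer flex value by summing the inner counts.

```python
from html import escape


def nested_mosaic(counts, outer_attr, inner_attr):
    """Hypothetical helper: build nested spine markup from
    counts[outer_level][inner_level]."""
    lines = ['<div class="mosaic-plot spines">']
    for outer_level, inner in counts.items():
        # The outer spine's weight is the sum of its inner cells.
        outer_total = sum(inner.values())
        lines.append(
            f'  <div style="flex:{outer_total};" class="vspines" '
            f'data-{outer_attr}="{escape(outer_level)}">'
        )
        for inner_level, n in inner.items():
            lines.append(
                f'    <div style="flex:{n};" '
                f'data-{inner_attr}="{escape(inner_level)}"></div>'
            )
        lines.append("  </div>")
    lines.append("</div>")
    return "\n".join(lines)


counts = {
    "not too happy": {"male": 2424, "female": 3205},
    "pretty happy": {"male": 11555, "female": 14319},
    "very happy": {"male": 6378, "female": 8422},
}
markup = nested_mosaic(counts, "happy", "sex")
print(markup)
```

Deriving the outer weights from the inner counts keeps the two levels consistent, so a parent spine cannot drift out of sync with its children.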

Finally, note that

.mosaic-plot
  display: flex
  height: 200px
  width: 200px

is the only appearance of a CSS length unit in any of the styles we’ve seen. Explicit lengths are not required by mosaic plots, making them completely responsive: The flexbox engine will automatically scale the widths and heights to fit their container, no JavaScript required!

Labels and other display considerations

As with most statistical graphics, mosaic plots are useless without proper labeling. A color legend:

[Figure: color legend with one swatch each for “not too happy”, “pretty happy”, and “very happy”]

consists of straightforward markup:

<table class="legend">
  <tr><td>
    <span class="color-box" data-happy="not too happy"></span>
    not too happy
  </td></tr>
  <tr><td>
    <span class="color-box" data-happy="pretty happy"></span>
    pretty happy
  </td></tr>
  <tr><td>
    <span class="color-box" data-happy="very happy"></span>
    very happy
  </td></tr>
</table>

and styles:

table.legend
  width: 10em
  margin-left: 4em

span.color-box
  display: inline-block
  width: 1em
  height: 1em
  margin: 0 0.2em
  vertical-align: middle
  border: 1px solid gray

with the colors assigned by the same data-happy selector that colors the mosaic plot.

In this case, the legend itself is gratuitous — we can make a graphic that is both more concise and more readable:

with markup:

<div class="mosaic-plot vspines">
  <div style="flex:5629" data-happy="not too happy">
    <label class="left">not too happy</label>
  </div>

  <div style="flex:25874" data-happy="pretty happy">
    <label class="left">pretty happy</label>
  </div>

  <div style="flex:14800" data-happy="very happy">
    <label class="left">very happy</label>
  </div>
</div>

and with styles:

label
  width: 100%
  height: 100%
  &.left
    text-align: right
    padding-right: 1em
    position: absolute
    transform: translate(-100%, 0)

The transform CSS property (90.6% browser support, with vendor prefixes) allows us to keep the labels vertically aligned with their parent spine, but shift them outside of the mosaic plot.

In this particular graphic, we also have enough vertical space to keep the labels inside the spines:

label.within
  text-align: center
  color: black
  position: absolute

In both cases, the labels are positioned in part according to the data, so care must be taken to make sure the graphic is large enough to prevent the labels from colliding.

The final display consideration is the ordering of categorical variables. A variable’s levels within a mosaic plot can be placed under a partial ordering via flexbox’s order property. This property overrides the markup order, making it particularly easy to customize ordering to call attention to certain facets of the data.

If the variable is truly categorical (there is no natural ordering of the levels), then a good choice is to order by proportion, with either the smallest or largest value coming first. (Alphabetical ordering supports fast lookup, but if that’s the primary use case then you should use an exact table rather than a visualization.)
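For a variable like sex, which has no natural ordering, the order values can be derived from the data rather than typed by hand. This is a rough sketch, not the approach used for this article's graphics: it ranks the levels largest first and prints matching Sass rules, with totals summed from the per-cell counts in the nested markup above.

```python
# Column totals, summed from the per-cell counts in the nested markup.
sex_counts = {
    "male": 2424 + 11555 + 6378,
    "female": 3205 + 14319 + 8422,
}

# Rank levels by descending count, so the largest gets order 1.
ranked = sorted(sex_counts, key=sex_counts.get, reverse=True)

# Emit one rule per level in the indented Sass syntax.
rules = "\n\n".join(
    f'[data-sex="{level}"]\n  order: {rank}'
    for rank, level in enumerate(ranked, start=1)
)
print(rules)
```

Dropping reverse=True puts the smallest level first instead.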

In our survey of happiness, we really have an ordinal variable: The levels are naturally ordered from least to most happy, which we can enforce with these styles:

[data-happy="not too happy"]
  background-color: rgb(255, 131, 73)
  order: 1

[data-happy="pretty happy"]
  background-color: rgb(253, 255, 185)
  order: 2

[data-happy="very happy"]
  background-color: rgb(102, 183, 73)
  order: 3

Conclusion

People have a natural tendency to compare shapes by area, and we can leverage this tendency to depict statistical distributions via mosaic plots. Mosaic plots can be implemented easily on the web via CSS flexbox, which handles the necessary scaling calculations. Components of a mosaic plot can be ordered and colored entirely via CSS, allowing these rich, responsive statistical graphics to be built without JavaScript.

Mosaic plots are an excellent alternative to bar charts in situations where part-to-whole relationships should be emphasized or where physical space is limited. Although mosaic plots can be drawn “recursively” to depict joint distributions, such graphics quickly become incomprehensible — it’s best to use mosaic plots to display simpler marginal and conditional distributions. (If joint distributions must be shown, your best bet is to draw several complementary mosaic plots and bar charts — Stephen Few discusses this topic at length: Are mosaic plots worthwhile?)

For more details on mosaic plots (including their relationship with bar charts), see Hadley’s original paper.

If you’d like help implementing mosaic plots or designing analytics systems for your business, shoot us an email: hello@keminglabs.com.

Thanks

Thanks to Ryan Lucas for suggesting additional context/motivation. Thanks to Dan Luu for suggesting a clearer, parallel sentence structure and transposing one of the conditional tables. Thanks to Nicki Vance for suggesting smoother section transitions and linking tables+plots. Thanks to Hadley Wickham for suggesting exposition on the terms mosaic plots, bar plots, and product plots; also for discovering some CSS issues in Safari.