Infrequent plot types: gauge, mosaic, and word cloud

Gauge charts typically show up in dashboards. We spotted a gauge chart on Berwin Leighton ArbDiversity 2016 [pg. 16]. All it really tells the reader is that 93% of the respondents believe arbitrator expertise is important. One downside of gauge charts is that while they stress an important piece of data, they leave out quite a bit more. Bear in mind also that a bar chart can convey the same information. And, since it takes quite a bit more code to produce this seemingly simple gauge, a practical person would stick with the good ol’ bar chart.
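To make that trade-off concrete, here is a minimal sketch (our assumption: Python with matplotlib, not anything the firm used) that conveys the same 93% finding as a single horizontal bar in a handful of lines; a gauge would require custom wedges, arcs, and a needle.

```python
import matplotlib.pyplot as plt

# The same message as the gauge, as one horizontal bar:
# 93% of respondents believe arbitrator expertise is important.
fig, ax = plt.subplots(figsize=(6, 1.5))
ax.barh(["Arbitrator expertise is important"], [93], color="#4c72b0")
ax.set_xlim(0, 100)
ax.set_xlabel("Percent of respondents")
ax.text(93, 0, " 93%", va="center")
plt.tight_layout()
plt.show()
```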

Pepper Hamilton PrivateFunds 2016 [pg. 7] offers a variant of a mosaic plot. Mosaic charts take data points, convert them into percentages, and map them as a boxy bar chart. Some people describe mosaic charts as variable-width stacked column charts. The chart type goes by many other names: marimekko chart, matrix chart, stacked spinogram, spineplot, olympic or submarine chart, Mondrian diagram, or even just mekko chart. In a true mosaic, the rectangles completely fill the space and their areas are proportional to the values they represent. This particular specimen more resembles an area plot: the areas of the rectangles are proportional to their percentages, but they do not fill the larger rectangle. As a side observation, it seems odd to tilt the label in the upper left box and odder still to drop down the 5% label.
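For readers who want to experiment with a true mosaic, one whose tiles fill the whole rectangle, the statsmodels package offers a mosaic function; the sketch below is only illustrative, and the categories and counts are invented, not taken from the Pepper Hamilton report.

```python
import matplotlib.pyplot as plt
from statsmodels.graphics.mosaicplot import mosaic

# Invented counts: two fund sizes crossed with two answers.
# Tile areas are proportional to the counts and together fill the rectangle.
data = {
    ("Small fund", "Yes"): 30, ("Small fund", "No"): 20,
    ("Large fund", "Yes"): 35, ("Large fund", "No"): 15,
}
mosaic(data, title="Mosaic plot (illustrative data)")
plt.show()
```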

Clifford Chance Crossborder 2012 [pg. 22] nestles a word cloud plot in the lower right corner. A word cloud presents text only. The size of each word corresponds to its relative frequency. The configuration and location of the words carry no meaning, although the largest words — the most frequent — generally sit toward the middle. Nor does the color scheme convey information. Before you can produce a word cloud you have to do a fair amount of massaging of the text, such as dropping unimportant words, lower-casing words, and (often) stemming words.
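A minimal sketch of that massaging-plus-plotting workflow, assuming the third-party wordcloud package and matplotlib (and skipping the stemming step), might look like this; the input file name is hypothetical.

```python
import matplotlib.pyplot as plt
from wordcloud import WordCloud, STOPWORDS

text = open("report.txt").read()          # hypothetical extracted report text

# Massage the text: lower-case it and drop unimportant (stop) words.
stopwords = STOPWORDS | {"per", "cent", "survey"}   # extend the default list as needed
cloud = WordCloud(width=800, height=400,
                  background_color="white",
                  stopwords=stopwords).generate(text.lower())

plt.imshow(cloud, interpolation="bilinear")
plt.axis("off")
plt.show()
```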

Typefaces and styles in research reports based on surveys by law firms

Most readers pay no attention to a report’s typeface. Nevertheless, which typefaces and styles appear depends on someone’s decisions (or defaults take over). To use terminology more precisely, font refers to the physical embodiment of what sets the text (whether a case of metal pieces or a computer file), while typeface refers to the design appearance (the way the set text looks). A font is what you use; a typeface is what you see. A typeface may appear in bold, underlined, italicized, or many other styles.

As a mini-lab for learning more about these two typographical choices, we selected seven surveys: Allen Matkins CommlRE 2018, Allen Overy Innovative 2012, Clifford Chance Crossborder 2012, Dykema Gossett MA 2017, Pepper Hamilton PrivateFunds 2016, HoganLovells MandA 2012, and Norton Rose Lit 2017.

Foxit PhantomPDF software identifies the original fonts used in each of the seven reports as well as their styles, such as italics, bold, and underlined. The tool also lists the font types and encodings.
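Readers without Foxit PhantomPDF can pull a similar font inventory with an open-source library; the sketch below uses PyMuPDF (imported as fitz), which is our assumption rather than the tool behind the tables that follow, and the file name is hypothetical.

```python
import fitz  # PyMuPDF

doc = fitz.open("survey_report.pdf")       # hypothetical report file
fonts = set()
for page in doc:
    # get_fonts() returns tuples of (xref, ext, type, basefont, name, encoding)
    for xref, ext, ftype, basefont, name, encoding in page.get_fonts():
        fonts.add((basefont, ftype, encoding))

for basefont, ftype, encoding in sorted(fonts):
    print(f"{basefont:<35} {ftype:<10} {encoding}")
```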

The plot below gathers the data from the seven reports about the typefaces and styles employed. The tallest bar, for example, tells us that the Pepper Hamilton report uses eight typefaces (the bottom, dark segment) and 16 styles (the top, light segment).
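A stacked bar like the one described takes only a few lines with matplotlib's bottom argument; in the sketch below, the Pepper Hamilton counts (8 typefaces, 16 styles) come from the paragraph above, while the other firms and values are placeholders.

```python
import matplotlib.pyplot as plt

# Counts of typefaces (dark, bottom segment) and styles (light, top segment).
# Pepper Hamilton's 8 and 16 come from the text; the other values are placeholders.
firms     = ["Pepper Hamilton", "Firm B", "Firm C"]
typefaces = [8, 3, 5]
styles    = [16, 6, 9]

fig, ax = plt.subplots()
ax.bar(firms, typefaces, color="#2c3e50", label="Typefaces")
ax.bar(firms, styles, bottom=typefaces, color="#aeb6bf", label="Styles")
ax.set_ylabel("Count")
ax.legend()
plt.show()
```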


The typefaces used by the reports appear in the first table below.  The styles used by the reports number almost 30, although some of them may be the same, such as “Bd” and “Bold,” “It” and “Italic” or “Lt” and “Light.” Here they are in the second table.


All kinds of other characteristics of typefaces and typography could be investigated, including size, color, character spacing, word spacing, stroke width, horizontal scaling, and baseline offset.

Rarely-seen plots in law firm survey reports

Having considered the most common types of plots, we turn to rarely-seen types. In this first set, we will see examples of bump, waffle, and segment plots.

Morrison Foerster GCDisruption 2017 [pg. 7] introduces a bump chart (also known as a slope graph, bipartite graph or Tufte chart). A typical bump chart has two columns of data. Lines extend from points in the left column to their counterparts in the right column. The slope of those lines conveys the degree of change. Here, only two of the five issues changed position, and this unusual chart type emphasizes the change. Any time rankings or positions change year over year, a bump chart might be pressed into service.
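Mechanically, a bump chart is nothing more than line segments drawn between two x positions; the sketch below (again Python with matplotlib, with invented issue names and rankings) shows the idea.

```python
import matplotlib.pyplot as plt

# Hypothetical rankings of issues in two successive years (1 = top rank).
ranks_2016 = {"Cybersecurity": 1, "Regulation": 2, "Talent": 3}
ranks_2017 = {"Cybersecurity": 2, "Regulation": 1, "Talent": 3}

fig, ax = plt.subplots()
for issue in ranks_2016:
    y0, y1 = ranks_2016[issue], ranks_2017[issue]
    ax.plot([0, 1], [y0, y1], marker="o")
    ax.text(-0.05, y0, issue, ha="right", va="center")
    ax.text(1.05, y1, issue, ha="left", va="center")

ax.set_xlim(-0.5, 1.5)
ax.set_xticks([0, 1])
ax.set_xticklabels(["2016", "2017"])
ax.invert_yaxis()               # rank 1 at the top
ax.set_yticks([])
plt.show()
```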

Pepper Hamilton PrivateFunds 2016 [pg. 7] gives an example of a waffle plot. These are essentially square pie charts, but instead of wedges of a circle, groups are represented by sets of squares. Waffle plots shine for showing parts-to-whole contributions, highlighting the individual points that make up the larger whole. They are problematic when exact percentages are vital.

This waffle chart breaks assets under management (AUM) of the participants into six ranges. It colors rectangles in the plot in proportion to how many respondents fall in each range. So, for example, since there is only one block, in the lower right, colored blue and matching the legend of “More than $10bn,” only one percent of the respondents had that amount of assets under management.
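A waffle grid is easy to hand-roll in matplotlib (packages such as pywaffle wrap up the same idea). In the sketch below each cell of a 10 x 10 grid stands for 1% of respondents; apart from the single 1% "More than $10bn" block mentioned above, the ranges and percentages are placeholders, and there are fewer ranges than the report's six, for brevity.

```python
import matplotlib.pyplot as plt
import numpy as np

# Each cell of a 10 x 10 grid stands for 1% of respondents.
# Percentages are placeholders, apart from the 1% "More than $10bn" block noted in the text.
categories = {"Under $500m": 45, "$500m-$1bn": 30, "$1bn-$10bn": 24, "More than $10bn": 1}

grid = np.zeros((10, 10), dtype=int)
flat = grid.ravel()                 # view on the grid, filled category by category
start = 0
for idx, (label, pct) in enumerate(categories.items()):
    flat[start:start + pct] = idx
    start += pct

fig, ax = plt.subplots()
ax.imshow(grid, cmap="tab10", origin="lower")
ax.set_xticks([]); ax.set_yticks([])
plt.show()
```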

A segment plot, such as the one seen in Morrison Foerster GCsup 2017 [pg. 9], conveys ranges of data. In the example, the left end of the segment starts at the percentage who spend “Substantial Time” on an issue and ends on the right at the percentage who regard it as “Very Important.” Thus, at a glance the reader can compare positions and values for all five issues.

Ranges are really bar charts that do not start from a common baseline; they are “floating” bars that sit at different levels on the y axis and run for different lengths along the x axis.
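In matplotlib such floating bars can be mocked up with hlines; the issues and endpoint percentages below are placeholders meant only to show a segment running from the "Substantial Time" value to the "Very Important" value.

```python
import matplotlib.pyplot as plt

# Hypothetical (left, right) percentage pairs for each issue:
# left = % spending "Substantial Time", right = % calling it "Very Important".
issues = {"Data privacy": (35, 70), "Litigation": (25, 55), "Compliance": (40, 80)}

fig, ax = plt.subplots()
for y, (name, (lo, hi)) in enumerate(issues.items()):
    ax.hlines(y, lo, hi, linewidth=6)       # the floating segment
    ax.plot([lo, hi], [y, y], "o")          # markers at both ends
ax.set_yticks(range(len(issues)))
ax.set_yticklabels(list(issues.keys()))
ax.set_xlabel("Percent of respondents")
plt.show()
```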

Readability of survey reports with the Flesch-Kincaid assessment

We previously looked at the Flesch reading-ease test. A cousin assessment, the “Flesch-Kincaid Grade Level Formula” (Flesch-Kincaid), also calculates a readability score, but expresses it as a U.S. school grade level. It can be thought of as the number of years of education generally required to understand the text. The sentence “The Australian platypus is seemingly a hybrid of a mammal and reptilian creature” yields an 11.3 grade level, as it has 24 syllables and 13 words.

The grade level is calculated with the following formula:

0.39 × (total words / total sentences) + 11.8 × (total syllables / total words) − 15.59
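A few lines of Python make the arithmetic concrete; plugging in the platypus sentence’s counts from above (13 words, one sentence, 24 syllables) reproduces the 11.3 grade level.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Flesch-Kincaid grade level from raw counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# The platypus sentence: 13 words, 1 sentence, 24 syllables.
print(round(flesch_kincaid_grade(13, 1, 24), 1))  # -> 11.3
```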

According to Wikipedia, the different weighting factors for words per sentence and syllables per word in the Flesch reading-ease test and the Flesch-Kincaid Grade Level Formula mean that the two assessment tools are not directly comparable and cannot be converted. The grade level formula of Flesch-Kincaid emphasizes sentence length over word length. Due to the formula’s construction, the score does not have an upper bound.

To the three British reports considered previously we added two US reports: Eversheds 21stCentury 2008 and Morrison Foerster GCsup 2016. The topic of the two additions is also the legal industry. Basic statistics about the five reports can be seen in the next table.

The Flesch-Kincaid grade level stands at the college-sophomore level for the CMS and KL Gates reports, climbs to the postgraduate level for Morrison Foerster, and reaches the level of a PhD program for Allen Overy.

Readability of reports (Flesch reading-ease test)

All survey reports share a characteristic: how understandable their prose is. Many measures exist for assessing readability, including the long-established Flesch reading-ease test (FRET). With FRET, higher scores indicate a survey report that is easier to read; lower scores indicate reports that are more difficult to read. The formula for the Flesch reading-ease score is:

206.835 − 1.015 × (total words / total sentences) − 84.6 × (total syllables / total words)

According to Wikipedia, “Reader’s Digest magazine has a readability index of about 65, Time magazine scores about 52, … and the Harvard Law Review has a general readability score in the low 30s.” FRET scores correspond to the reader’s school level shown in the first table below.

To investigate FRET scores, we selected three survey reports that each share two important characteristics. First, the topic of the report — the legal industry itself — and second, that three different British law firms produced the reports. The three reports are CMS GCs 2017, Allen Overy Innovative 2012, and Eversheds 21stCentury 2008. The objective was to analyze text about a similar topic written by firms of a similar national and linguistic background.

Because the cover has sparse text and the final page often has mostly contact information, we removed those two pages and then extracted all the text. The table below provides basic information about each report’s text. Thus, from the third table, all of the reports are written at the level of a well-educated reader.

The graph that follows shows another aspect of these three reports: the number of words per report page (less the cover and back page). The results might be called a measure of cognitive density: how much text information is crammed into each page. Note that the length of the report does not correspond to how many words are on each page on average.
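A sketch of that pipeline, assuming PyMuPDF for the text extraction and the textstat package for the readability scores (neither is necessarily what produced the tables above), and a hypothetical file name, might look like this.

```python
import fitz       # PyMuPDF
import textstat

doc = fitz.open("survey_report.pdf")                          # hypothetical report file
pages = [doc[i].get_text() for i in range(1, len(doc) - 1)]   # drop cover and back page

text = " ".join(pages)
words_per_page = [len(p.split()) for p in pages]

print("Average words per page:", sum(words_per_page) / len(words_per_page))
print("Flesch reading ease:   ", textstat.flesch_reading_ease(text))
print("Flesch-Kincaid grade:  ", textstat.flesch_kincaid_grade(text))
```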

Variations on table design in law firm survey reports

Carlton Fields CA 2013 [pg. 21] turned to an unusual table design (aside from the shadowed box). It is a simple 2 x 2 that uses a couple of colors and very large numbers. The snippet includes a summary statement at the top and two bullets on the left which complement the table’s data.

The table from Carlton Fields CA 2013 [pg. 12] offers two variations. First, the firm outlined the most important data in red and added “> than 50%” in the right margin. Think of this as a technique to highlight the most significant finding in a table. Additionally, the firm gave the left column a green background for the seven rows below the header. Other than that, there are no cell borders or outline to the table.

The unassuming table from Bryan Cave Collective 2007 [pg. 5] was selected for its black background of the header row, the choice of not identifying the leftmost column with a header label, the replay of the question from the survey above the table, and the horizontal divider lines above and below the table.  To give a sense of the table on the page, the snippet includes part of the text in the second column.

CMS Poland 2016 [pg. 22] chose to outline the cells with partial lines and aligns the data with the top of each partial line. Also, aside from the left column and its longer text, the most common table layout makes the remaining columns of equal width; this table contravenes that convention.

Tables and some of their design characteristics

Among other characteristics, tables vary by the numbers of rows and columns, background shading, borders of cells, and column justification. The sample discussed here offers some of the variations.

Presenting its respondent profile, Fulbright Jaworski Lit 2013 [pg. 4] set up a straightforward four-row-by-four-column table. Unlike most tables, this one has a topic statement directly over the header row. The header row applies a crimson background and uses all capital letters. The three rows below have zebra striping with a gradient of color lightest in the middle. The data columns are centered and no cells have border lines.

The next table, from Dykema Gossett MA 2016 [pg. 13], has one fewer column than the Fulbright table but one more row. The top row is not a header as it restates the question asked on the survey. Both the question and the header row show a light blue shading. Another difference is that the data columns are left justified, not centered. Lastly, the table’s unsightly outline and cell borders stand out.

In the following table from King Spalding ClaimsProfs 2016 [pg. 13], the six rows (including the header) are followed by a gap and then a row for averages. Tables can always add marginal figures.

Note that there are no boundary lines around this table, the cell borders are faint, and the numbers are right justified under the headers. To emphasize it, the average row has a darker shaded background.

This table illustrates how hard it is to detect changes in complex data as compared to presenting the data in a chart. The table has the advantage of giving precise numbers, but it takes work to spot the highs and lows in the columns or to perceive trends over the five-year period.

The table from Allen Overy Innovative 2012 [pg. 53] presents only text, not data. The firm chose an unusual color scheme of green against the entire page’s background of gray. Like the King Spalding table just above, this one dispenses with table outlines but unlike it inserts white cell borders.

A final example, from Berwin Leighton ArbDiversity 2016 [pg. 7], has ten rows (six are omitted here) and six columns, the largest array of this sample. As is typical, the table colors the header differently than the rest of the rows, which are zebra-striped green. The data are left justified and no border outlines clutter the table.

Good practices for table design we suggest from this set include:

  1. Don’t outline the table, since borders add nothing but ink.
  2. Center the data in the column, to avoid crowding it against a border.
  3. Use zebra striping with light shading, to distinguish rows.
  4. Have no borders around cells, or only unobtrusive ones, to focus readers on the content, not the aesthetics.

Unconventional treatments of area plots

Area plots are unusual in research survey reports, but when they do appear they often seduce graphic designers into creating unconventional varieties. Consider this plot from DLA Piper Compliance 2017 [pg. 6]. Its most prominent irregularity is that the circles are not arranged in order of size. This conjures up depictions of the planets of our solar system! Further, the colors chosen are not meaningful, but they are distracting.

The area of each circle is proportional to the percentage of one of the seven job titles. A more customary layout would present the circles in declining size from the left or in ascending size to the right. Shall we call this an imaginative array?

KL Gates GCDisruption 2018 [pg. 8] also makes poor use of the area technique: the three percentages are too similar for the eye to pick up differences from the areas of the circles. Worse, the circles are not aligned at the bottom, which would have helped readers detect differences in their areas. The firm could have opted to present this simple data in prose or with a small table.

It is hard enough for most people to discern differences in the area of similar circles, let alone when the area is represented by an object as unfamiliar as proportional bottles. Nevertheless, Reed Smith Lifesciences 2015 [pg. 10] chose a visualization technique that did exactly that. Furthermore, the percentages at the top are washed out.

Morrison Foerster GCsup 2017 [pg. 10] also chose an area plot even though there is not much difference between the areas of these circles. Plus the dual levels are very complex to understand. Compounding both effects is a very elaborate explanation below the plot.

Area plots and conventional treatments

Those who design the graphics in research survey reports like to mix in some plots that convey their data by the relative size of an object. Size here actually means the area of the object, and the object is typically a circle. Clifford Chance AsiaMA 2017 [pg. 6] offers a plain vanilla example: just proportional circles, no colors, visible data, arranged from largest to smallest. The area of the left circle almost doubles the area of the circle to its right — just as 48.7% is almost twice 26.4%.
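The key implementation detail for such plots is that the radius must scale with the square root of the value so that the area, not the diameter, carries the comparison; a small matplotlib sketch using the two percentages above makes the point.

```python
import math
import matplotlib.pyplot as plt

values = {"48.7%": 48.7, "26.4%": 26.4}

fig, ax = plt.subplots()
x = 0.0
for label, v in values.items():
    r = math.sqrt(v)            # radius ~ sqrt(value) so that area ~ value
    x += r
    ax.add_patch(plt.Circle((x, r), r, color="grey"))
    ax.text(x, r, label, ha="center", va="center", color="white")
    x += r + 2                  # gap before the next circle

ax.set_aspect("equal")
ax.set_xlim(0, x)
ax.set_ylim(0, 2 * math.sqrt(max(values.values())) + 1)
ax.axis("off")
plt.show()
```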

Pinsent Masons TMT 2016 [pg. 20] also arranges the circles in declining order of area. The labels underneath, with extender lines, show one technique for including additional information. Unlike Clifford Chance, which left the circles uncolored, Pinsent Masons shaded them grey and switched the font color to white. The curvy design feature on the left, shaped like a reverse “L,” appears in the snippet to show the relative size of the area plot and its location at the bottom right of the page (evident from the position of the page number).

The final example, from DLA Piper RE 2017 [pg. 10], has four less-than-conventional features. First, fanciful cityscapes show through in the larger circles like a watermark. Second, the plot unnecessarily outlines the larger circles; you want readers to focus on the relative sizes of the inner, darker circles. Third, most readers will fix their eyes first on the prominent “1”, “2”, and “3” rather than on the meaningful percentages below them in parentheses — which do not even have a percentage symbol. Finally, the graphic truncates the circles at the bottom, which seems particularly odd since the area of the circles is meant to convey the differences in percentages between the cities.

Co-contributors to surveys

Quite often, law firms team with another entity that has an interest in the research or they retain consultants who have experience in conducting surveys. We will refer to either type of contributor as a co-contributor. Without having exhaustively checked which of the surveys collected so far have a co-contributor, it is useful to list the ones we have identified. A total of 18 firms used co-contributors.

  • Acuris (Mergermarket) — Pinsent Masons Energy 2017
  •  ALM Intelligence — Fish Richardson Cyberbreaches 2015, Goulston Storrs Multifamily 2017, Morrison Foerster GCsup 2017
  • Association of Claims Professionals and Bickmore — King Spalding ClaimsProfs 2016
  • Association of Corporate Counsel (ACC) — Blake Cassels CanadaCLOs 2013
  • Canadian Corporate Counsel Association (CCCA) — Davies Ward Barometer 2011
  • Canadian Corporate Counsel Association (CCCA) and IPSOS — Davies Ward Barometer 2006
  • Deloitte — Norton Rose ESOP 2014
  • EEF — Squire Sanders Manufacturing 2014
  • ELD International and Right Hat — Winston Strawn Risk 2013
  • FinanceAsia — Clifford Chance AsiaMA 2011
  • Forbes Insights — KL Gates GCDisruption 2017
  • FT Remark — Ropes Gray Risk 2017
  • PEF Services and WithumSmith+Brown — Pepper Hamilton PrivateFunds 2016
  • PricewaterhouseCoopers — King Spalding MedDevices 2016
  • Queen Mary University of London, School of International Arbitration — White Case Arbitration 2010
  • RSG Consulting — Allen Overy Innovative 2012
  • Rothschild, debtwire, Mergermarket — Clifford Chance Debt 2007
  • The Merger Market Group (Remark) — Paul Hastings China 2013
  • UCLA Anderson Forecast — Allen Matkins CommlRE 2018

Six of the surveys cited above have more than one co-contributor.  Also, two of the firms, Clifford Chance and King Spalding, worked with different co-contributors on different surveys.