Attributes of infographics, standardized, in surveys by law firms

To inquire further into what law firms include in their infographs, we converted four of the elements — numbers, plots, words and concepts — into their respective counts divided by the percentage of the page the infograph occupies. Without that standardization, larger infographs would have larger counts, but not necessarily more cognitive density per page.

The next four plots array the six infographs we have been working with from the lowest measure on the left to the highest on the right, with the average of that data shown in a different color.

A first impression from these plots might be that the infographs do not vary all that much on these four elements. However, the range from the lowest to the highest is around 1-to-2 for words and concepts, whereas it is 1-to-5 for numbers and 1-to-12 for plots.

The quartet of plots relies on numbers of very different magnitudes; words, for example, are much more numerous than concepts. If we standardize all the values after they have been divided by the page percent (to standardize values, you divide them by their mean), then the absolute values — as adjusted for the amount of the page the infograph occupies — are transformed to the same scale. The result is the next plot, where each survey’s standardized value for the element is in a separate segment.
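This blogger works in R, but the two-step adjustment is easy to sketch in any language. The Python snippet below (all counts and page percentages are invented for illustration) divides each raw element count by the fraction of the page the infograph occupies, then divides the adjusted values by their mean so every element lands on the same scale:

```python
def standardize(counts, page_pct):
    """counts: raw element counts per infograph; page_pct: fraction of page occupied."""
    # Step 1: adjust each count for how much of the page the infograph fills.
    adjusted = [c / p for c, p in zip(counts, page_pct)]
    # Step 2: divide the page-adjusted values by their mean.
    mean = sum(adjusted) / len(adjusted)
    return [round(a / mean, 2) for a in adjusted]

# Hypothetical word counts for three infographs occupying 100%, 50%, and 25% of a page.
print(standardize([120, 80, 40], [1.0, 0.5, 0.25]))  # → [0.82, 1.09, 1.09]
```

An infograph that packs the same count into a quarter of a page thus scores higher than one that spreads it over a full page, which is the point of the adjustment.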

What we can conclude from this different perspective is that with words and concepts, all of the infographs have similar profiles (close to 1). On plots, however, two of the surveys are very skinny (Baker McKenzie and McDonald Hopkins). Likewise, on the use of numbers, two of them have an abundance, relatively speaking, of numbers (McDonald Hopkins and HoganLovells).

Infographs and quantifying their components in law firm survey reports

Here are the final two infographs in the set that produces our analytics. On the left below, Squire Sanders Retail 2013 [pg. 24] includes a modest infograph, but at least the firm identifies it as such.

Early in its report, HoganLovells CrossBorder 2014 has two pages of infographs. Below is the relevant portion of the first of those pages [pg. 8].

The data from all the above counting or estimating appears in the table below.

Infographs in survey reports and their components

Two more infographs appear below. Berwin Leighton Risk 2014 [pg. 4], on the left immediately below, displays a wonderful infograph.

Below, from a press release about its 2017 survey regarding GDPR, Paul Hastings linked to an infograph.

Let us start our more analytic look by specifying some infograph components. We will compare this set of infographs on those components.

  1. Words. The number of words in the infograph — whether in plots, summaries, headers or otherwise. The R program, used by this blogger, has several packages that can count the number of words on a PDF page, but those packages cannot isolate the infograph portion of a page nor count words in .png files, which are what this book uses.
  2. Numbers. Any numerical figures in the infograph [counted as whole figures, not individual digits].
  3. Plots. The number of plots in the infograph, such as bar and column plots or various forms of pie charts or donut charts.
  4. Decorations. Anything that is neither text, number, nor plot. Considered differently, if all the decorations were erased, the infograph would convey the same information. Perhaps to some eyes its attractiveness would diminish but not its information efficiency. The decorations in the HoganLovells infograph (above left) include the three (dark and light) vertical bars, the three light horizontal bars, the six borders above and below the three headers, and the three shadings of the headers.
  5. Concepts. A concept is a single, simple idea. We tallied the number of concepts addressed in each infograph’s substantive portions (excluding headers and introductory material). Here is the top left concept from each of the six, to give readers a sense of concepts: HoganLovells more deals; McDonald Hopkins business conditions improving; Berwin Leighton biggest concern is regulatory issues; Paul Hastings GDPR fines; Squire Sanders value of multi-channel customers; and Baker McKenzie UK leavers.
  6. Page Percentage. Three of the infographs each filled an entire page, but the other three filled only a portion of a page. We visually estimated the percentage of the page taken up, so these are approximations.
  7. Rows and Columns. The unit for rows was the smallest vertical segment of the infograph, which we deemed a row; we then estimated the number of rows from the top of the infograph to the bottom. For instance, Berwin Leighton has four rows while HoganLovells has eight, because eight of the smallest segments measured vertically in the left column (not counting the header) would fill the page. Columns are easier to count.

Infographs in survey reports: advantages and disadvantages

Infographs push law-firm survey reports as far as they currently go in terms of data visualization. Only a handful of them have been located, but they are enough to start an analysis.

Baker McKenzie Brexit 2017 [pg. 1] put its entire report into a single page of an infograph, as shown above.

McDonald Hopkins Business 2017 summarized its survey in early 2017 on business confidence. A snippet of the infograph the law firm produced — but did not include in the report itself — appears immediately above.

The list that follows pulls together a number of the reasons a law firm might want to invest in an infograph, and a number of reasons why it might take a pass.


  1. Links pieces of data collected by a survey.
  2. Tells a story.
  3. Helps readers understand a more complicated message than stand-alone plots.
  4. Attractive and eye-catching.
  5. Trendy and exploits sophisticated software.
  6. Produces a new asset for the firm to use in multiple ways.


  1. Complicated and mentally fatiguing for readers.
  2. Requires specialists in layout, design and communication, possibly commercial software.
  3. Different than just assembling several plots on a page.
  4. Requires more sophisticated thinking and planning of the message.
  5. May not justify the investment of time and resources.


Density of questions in survey reports by law firms

For the most part, survey reports present approximately one question asked per page of report. The plot that follows shows data on how many questions were asked per report page. The data comes from a random set of reports that made reasonably clear how many questions their surveys asked.

Stated differently, for every question asked, reports devote a bit less than a page to present the plot of findings (or table or list) and discuss the findings. Often pages also have quotations and other material. As you consider the plot, bear in mind that every report has at least a cover page and a back page that do not address a question.

The next plot shows how many pages are in each of the reports covered by the preceding plot and how many questions are explicitly discussed. The sweet spot appears to be about 20 questions in 25 pages, which again works out to roughly one question per page after you subtract the questionless cover, table of contents and introductory pages, and back page.
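The arithmetic behind that sweet spot is simple. As a quick Python check (the figure of roughly five questionless pages is an assumption, since the exact count varies by report):

```python
# A 25-page report with about 5 questionless pages (cover, table of contents,
# introductory pages, back page) leaves roughly 20 content pages for 20 questions.
total_pages = 25
questionless_pages = 5   # assumed: cover, ToC, introduction, back page
questions = 20

content_pages = total_pages - questionless_pages
print(questions / content_pages)  # → 1.0 question per content page
```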


Survey data collected by a law firm at a conference

When a conference brings together a significant number of people who share a common interest, a law firm might have an opportunity to distribute hard-copy questionnaires, collect them, and analyze the results. Alternatively, the law firm could set up a computer with the questionnaire running on it and collect data by that means.

One instance of this method appeared in an article in 2016. At a financial technology conference in London, Linklaters seized the opportunity to collect data from the attendees. We deduce this from a quote in the article: “While a Linklaters survey at the event found that 48% of respondents thought Brexit was negative for the UK’s fintech sector, only 12% said they were as a result considering relocating to another EU jurisdiction.”

Norton Rose Infrastructure 2016 [pg. 1] likewise resulted from its survey at a conference on Australian infrastructure.

Inviting conference attendees to become survey respondents becomes easier if the law firm sponsors the conference or plays a major role. Dentons Mining 2015 [pg. 3], for example, came about from a conference it sponsored. It appears that the law firm asked 10 questions at the conference. We believe this survey was conducted at the conference in part because of the simplicity of the graphics: the report looks to have been produced quickly, with the plots created by the survey software and inserted into a PowerPoint deck. Below is one of the plots.

Focus groups and advisors to research surveys by law firms

A number of law firms prepared for their survey projects by discussing the topic, questions and selections for multiple-choice questions with various people. Most commonly, firms held focus groups. As one example, White Case Arbitration 2010 [pg. 34] explains that “An external focus group comprised of senior corporate counsel, external counsel and academics provided comments on the draft questionnaire.” As a second example, Davies Ward Barometer 2010 [pg. 5] and the Canadian Corporate Counsel Association (CCCA) drew guidance from several preparatory focus groups.

Proskauer Rose Empl 2016 [pg. 3] formed an advisory committee of seven distinguished in-house lawyers to weigh in on its survey initiative. The firm’s co-contributor, Acritas, also tapped into lawyers and in-house alumni of the firm.

Many other people contribute to the success of a research project, but rarely are those who toil in the trenches acknowledged by law firms in their published reports.

Surveys by law firms during recent years, and evidence thereof

How many surveys have been conducted so far? No one knows the complete tally, but the plot below shows, as of the midpoint of 2018 and for the preceding four years, the 205 survey reports or announcements we have located. The number in each column tells how many surveys we found for that period. We refer to “announcements” because, out of the 358 surveys identified to date, about three quarters resulted in a published report in PDF; the remainder we know about only from press releases, articles, or other references.

As of this writing, we know of some 77 surveys only because a press release or an article refers to them; we do not have PDF reports for these, although it is possible that they exist. We know of the remaining 47 “missing” surveys because a survey in a series refers to other incarnations. Eventually, these “missing” surveys may become evidenced by a press release or other document, or we may locate a PDF report.

At this point, firms based in the United States account for 141 of all the surveys found during the period, while UK-based firms account for 111. The third most frequent surveyors are VereinCLG firms, with 65 surveys, followed by Canadian firms with 17.

Readability measures for law firm surveys

Let’s consider a few more readability measures.

  1. The Bormuth Readability Index (BRI) calculates the reading grade level required to read a text based on (1) character count (average word length in characters) rather than syllable count and (2) the average number of familiar words in the sample text. The BRI uses the Dale-Chall word list to count familiar words in samples of text. The BRI translates to a U.S. grade level. For example, a result of 10.6 means students in 10th grade and above can read and comprehend the text.
  2. The Danielson and Bryan formula is most concerned with the characters themselves. It uses the average number of characters per blank (space) and per sentence. From the \textsf{R} package koRpus: DB_1 = \left( 1.0364 \times \frac{C}{Bl} \right) + \left( 0.0194 \times \frac{C}{St} \right) - 0.6059 and DB_2 = 131.059 - \left( 10.364 \times \frac{C}{Bl} \right) - \left( 0.194 \times \frac{C}{St} \right), where Bl means blanks between words (not literally counted in this implementation, but estimated as words - 1), C is all characters, and St is the number of sentences.
  3. The Degrees of Reading Power (DRP) test purportedly measures reading ability in terms of the “hardest text that can be read with comprehension”. Grades 6-8 can read and comprehend text with a DRP of 57-67, grades 9-10 can handle DRPs of 62-72, grades 11-12 can handle 67-74, and college graduates and above can handle a DRP above 70. It uses the Bormuth Mean Cloze score (B_{MC}): DRP = (1 - B_{MC}) \times 100. This formula itself has no parameters.
  4. Fang’s Easy Listening Formula (ELF) focuses on the proportion of polysyllabic words in a text. ELF is calculated by counting the number of syllables in a sentence and the number of words. ELF = S – W, where S and W are the number of syllables and words in a sentence respectively. This formula punishes every extra syllable.
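The author works with R’s koRpus package, but for concreteness here are Python sketches of three of the formulas above, following the definitions quoted in the text; all the sample counts are invented:

```python
# C = characters, Bl = blanks (estimated as words - 1), St = sentences.

def danielson_bryan(chars, words, sentences):
    """DB_1 and DB_2; blanks are estimated as words - 1, as in koRpus."""
    blanks = words - 1
    db1 = (1.0364 * chars / blanks) + (0.0194 * chars / sentences) - 0.6059
    db2 = 131.059 - (10.364 * chars / blanks) - (0.194 * chars / sentences)
    return db1, db2

def drp(mean_cloze):
    """Degrees of Reading Power from the Bormuth Mean Cloze score."""
    return (1 - mean_cloze) * 100

def elf(syllables, words):
    """Fang's Easy Listening Formula for one sentence: S - W."""
    return syllables - words

print(round(drp(0.43), 2))  # → 57.0 (a mean cloze score of 0.43)
print(elf(28, 17))          # → 11 (a 17-word, 28-syllable sentence)
```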

Readability measures and surveys by law firms

Many other readability measurements have been devised. The plot below shows the Automatic Readability Index (ARI), Coleman-Liau Index, and the Simple Measure of Gobbledygook (SMOG) applied to six reports by U.S. law firms. First, we will briefly explain the three measures.

The Automated Readability Index (ARI) assesses the grade level needed to comprehend the text. For example, if the ARI outputs the number 10, this equates to an assessment that a high school student in the tenth grade of schooling, ages 15-16 years, should be able to comprehend the text. The formula to calculate the Automated Readability Index is 4.71(characters/words) + 0.5(words/sentences) – 21.43.
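That formula is straightforward to sketch in Python (the character, word and sentence counts below are invented for illustration):

```python
def ari(characters, words, sentences):
    """Automated Readability Index, per the formula quoted above."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43

# e.g. a 100-word sample with 470 characters spread over 5 sentences
print(round(ari(470, 100, 5), 1))  # → 10.7, i.e. roughly tenth grade
```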

The Coleman-Liau Index looks at the average number of letters per 100 words (L) and the average number of sentences per 100 words (S). The formula to calculate the Coleman-Liau Index is 0.0588L – 0.296S – 15.8. This translates to a grade, so that, for example, a 10.6 means roughly appropriate for a 10th-11th grade high school student.
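A quick Python sketch of the Coleman-Liau formula (the sample values are invented):

```python
def coleman_liau(letters_per_100_words, sentences_per_100_words):
    """Coleman-Liau Index: 0.0588L - 0.296S - 15.8, per the formula quoted above."""
    return 0.0588 * letters_per_100_words - 0.296 * sentences_per_100_words - 15.8

# e.g. 530 letters and 5.5 sentences per 100 words
print(round(coleman_liau(530, 5.5), 1))  # → 13.7, i.e. roughly college level
```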

The Simple Measure of Gobbledygook (SMOG) is based on word length and sentence length being multiplied rather than added, as in other readability formulas. The SMOG formula correlates 0.88 with comprehension as measured by reading tests. The SMOG formula is SMOG grading = 3 + the square root of the polysyllable count, where polysyllable count = number of words of more than two syllables in a sample of 30 sentences. This next table translates the higher levels of SMOG to an approximate grade level.
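The SMOG grading is the simplest of the three to compute; a minimal Python sketch (the polysyllable count below is invented) is:

```python
import math

def smog(polysyllable_count):
    """SMOG grading: 3 plus the square root of the number of words of
    more than two syllables in a 30-sentence sample, per the formula above."""
    return 3 + math.sqrt(polysyllable_count)

print(smog(49))  # → 10.0 (49 polysyllabic words in 30 sentences)
```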

The Foley Lardner report has a much higher total score than the other five reports as its estimated grade level is the highest on all three measures. The Dykema Gossett report, by contrast, aims for a less sophisticated audience.