Survey data collected by a law firm at a conference

When a conference brings together a significant number of people who share a common interest, a law firm might have an opportunity to distribute hard-copy questionnaires, collect them, and analyze the results. Alternatively, the law firm could set up a computer with the questionnaire running on it and collect data by that means.

One instance of this method appeared in an article in 2016. At a financial technology conference in London, Linklaters seized the opportunity to collect data from the attendees. We deduce this from a quote in the article: “While a Linklaters survey at the event found that 48% of respondents thought Brexit was negative for the UK’s fintech sector, only 12% said they were as a result considering relocating to another EU jurisdiction.”

Norton Rose Infrastructure 2016 [pg. 1] likewise resulted from a survey the firm conducted at a conference on Australian infrastructure.

Inviting conference attendees to become survey respondents becomes easier if the law firm sponsors the conference or plays a major role. Dentons Mining 2015 [pg. 3], for example, came about from a conference it sponsored. It appears that the law firm asked 10 questions at the conference. We believe this survey was conducted at the conference in part because of the simplicity of the graphics: the report looks to have been produced quickly, with the plots created by the survey software and inserted into a PowerPoint deck. Below is one of the plots.

Focus groups and advisors to research surveys by law firms

A number of law firms prepared for their survey projects by discussing the topic, questions and selections for multiple-choice questions with various people. Most commonly, firms held focus groups. As one example, White Case Arbitration 2010 [pg. 34] explains that “An external focus group comprised of senior corporate counsel, external counsel and academics provided comments on the draft questionnaire.” As a second example, Davies Ward Barometer 2010 [pg. 5] and the Canadian Corporate Counsel Association (CCCA) drew guidance from several preparatory focus groups.

Proskauer Rose Empl 2016 [pg. 3] formed an advisory committee of seven distinguished in-house lawyers to weigh in on its survey initiative. The firm’s co-contributor, Acritas, also tapped into lawyers and in-house alumni of the firm.

Many other people contribute to the success of a research project, but rarely are those who toil in the trenches acknowledged by law firms in their published reports.

Surveys by law firms during recent years, and evidence thereof

How many surveys have been conducted so far? No one knows the complete tally, but the plot below shows, as of the midpoint of 2018 and for the preceding four years, the 205 survey reports or announcements we have located. The number in each column tells how many surveys have been found for that period. We refer to “announcements” because out of the 358 surveys identified to date, about three quarters resulted in a published report in PDF; the remainder we know about only from press releases, articles, or other references.

As of this writing, we know of some 77 surveys only because a press release or an article refers to them. We do not have PDF reports for them, although it is possible that such reports exist. We know of the remaining 47 “missing” surveys because a survey in a series refers to other incarnations. Eventually, these “missing” surveys may become evidenced by a press release or other document, or we may locate a PDF report.

At this point, firms based in the United States account for 141 of all the surveys found during the period while UK-based firms account for 111. The third most frequent surveyors are VereinCLG firms, with 65 surveys, followed by Canadian firms with 17.

Readability measures for law firm surveys

Let’s consider a few more readability measures.

  1. The Bormuth Readability Index (BRI) calculates the reading grade level required to read a text based on 1) character count (average word length in characters) rather than syllable count and 2) average number of familiar words in the sample text. The BRI uses the Dale-Chall word list to count familiar words in samples of text. The BRI translates to a U.S. grade level. For example, a result of 10.6 means students in 10th grade and above can read and comprehend the text.
  2. The Danielson and Bryan formula is most concerned with the variables of the letters themselves. It uses the number of characters per blank (the space between words) and the number of characters per sentence. From the \textsf{R} package koRpus:
     DB_1 = \left( 1.0364 \times \frac{C}{Bl} \right) + \left( 0.0194 \times \frac{C}{St} \right) - 0.6059
     DB_2 = 131.059 - \left( 10.364 \times \frac{C}{Bl} \right) - \left( 0.194 \times \frac{C}{St} \right)
     Here C is interpreted as literally all characters, St is the number of sentences, and Bl means blanks between words, which this implementation does not really count but estimates as words - 1.
  3. The Degrees of Reading Power (DRP) test purportedly measures reading ability in terms of the “hardest text that can be read with comprehension”. Readers in grades 6-8 can read and comprehend text with a DRP of 57-67, grades 9-10 can handle DRPs of 62-72, grades 11-12 can handle 67-74, and college graduates and above can handle a DRP above 70. The measure uses the Bormuth Mean Cloze Score (MC): DRP = (1 - B_{MC}) \times 100. This formula itself has no parameters.
  4. Fang’s Easy Listening Formula (ELF) focuses on the proportion of polysyllabic words in a text. ELF is calculated from the number of syllables and the number of words in a sentence: ELF = S - W, where S and W are the number of syllables and words in the sentence, respectively. Because a one-syllable word contributes nothing, this formula punishes every extra syllable.
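To make the arithmetic of two of these measures concrete, here is a minimal Python sketch of the Danielson and Bryan formulas and Fang's ELF as stated in the list above. The function names are our own, hypothetical choices; the sketch takes raw counts as input, and how characters, sentences, and syllables get counted is left to the caller as an assumption. It is not the koRpus implementation.

```python
def danielson_bryan(chars: int, words: int, sentences: int) -> tuple[float, float]:
    """Return (DB_1, DB_2) from raw counts.

    Blanks between words are estimated as words - 1, following the
    note quoted from the koRpus documentation.
    """
    if words < 2 or sentences < 1:
        raise ValueError("need at least two words and one sentence")
    blanks = words - 1
    chars_per_blank = chars / blanks
    chars_per_sentence = chars / sentences
    db1 = 1.0364 * chars_per_blank + 0.0194 * chars_per_sentence - 0.6059
    db2 = 131.059 - 10.364 * chars_per_blank - 0.194 * chars_per_sentence
    return db1, db2


def easy_listening(syllables: int, words: int) -> int:
    """Fang's ELF for a single sentence: syllables minus words."""
    return syllables - words
```

For instance, a sentence of 8 words and 12 syllables gets an ELF of 4, one point for each syllable beyond the first in a word.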

Readability measures and surveys by law firms

Many other readability measurements have been devised. The plot below shows the Automated Readability Index (ARI), Coleman-Liau Index, and the Simple Measure of Gobbledygook (SMOG) applied to six reports by U.S. law firms. First, we will briefly explain the three measures.

The Automated Readability Index (ARI) assesses the grade level needed to comprehend the text. For example, if the ARI outputs the number 10, this equates to an assessment that a high school student in the tenth grade of schooling, ages 15-16 years, should be able to comprehend the text. The formula to calculate the Automated Readability Index is 4.71(characters/words) + 0.5(words/sentences) – 21.43.
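As a rough illustration, the ARI formula can be sketched in a few lines of Python. The counting helper is deliberately naive and rests on our own assumptions (characters are letters and digits, words are whitespace-separated, sentences end at a run of '.', '!' or '?'); real readability tools tokenize more carefully, so their scores may differ.

```python
import re


def count_units(text: str) -> tuple[int, int, int]:
    """Naively count (characters, words, sentences) in a text.

    Assumptions for illustration only: characters are letters and digits,
    words are whitespace-separated tokens, and a sentence ends at a run
    of '.', '!' or '?'.
    """
    characters = sum(ch.isalnum() for ch in text)
    words = len(text.split())
    sentences = len(re.findall(r"[.!?]+", text))
    return characters, words, sentences


def automated_readability_index(characters: int, words: int, sentences: int) -> float:
    """ARI = 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43."""
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43
```

A passage with 500 characters, 100 words, and 5 sentences works out to 4.71(5) + 0.5(20) - 21.43, or roughly grade 12.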

The Coleman-Liau Index looks at the average number of letters per 100 words (L), and the average number of sentences per 100 words (S). The formula to calculate the Coleman-Liau Index is 0.0588L – 0.296S – 15.8. The result translates to a grade level, so that, for example, a score of 10.6 means the text is roughly appropriate for a high school student in 10th or 11th grade.
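The Coleman-Liau calculation can be sketched the same way. This hypothetical function takes raw counts of letters, words, and sentences (how they are counted is left to the caller) and scales them to the per-100-words figures L and S that the formula expects.

```python
def coleman_liau_index(letters: int, words: int, sentences: int) -> float:
    """Coleman-Liau Index = 0.0588 * L - 0.296 * S - 15.8.

    L is letters per 100 words and S is sentences per 100 words.
    """
    letters_per_100 = letters / words * 100
    sentences_per_100 = sentences / words * 100
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8
```

With 450 letters and 5 sentences in a 100-word sample, the index is 0.0588(450) - 0.296(5) - 15.8, or about 9.2.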

The Simple Measure of Gobbledygook (SMOG) is based on word length and sentence length being multiplied rather than added, as in other readability formulas. The SMOG formula correlates 0.88 with comprehension as measured by reading tests. The SMOG formula is SMOG grading = 3 + the square root of the polysyllable count, where polysyllable count = number of words of more than two syllables in a sample of 30 sentences. This next table translates the higher levels of SMOG to an approximate grade level.
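The SMOG grade as given above reduces to a single line of arithmetic. The sketch below assumes the polysyllable count has already been taken from a 30-sentence sample; counting syllables reliably is a separate problem that this hypothetical function does not attempt.

```python
import math


def smog_grade(polysyllable_count: int) -> float:
    """SMOG grading = 3 + square root of the polysyllable count.

    The count is the number of words of more than two syllables
    found in a sample of 30 sentences.
    """
    if polysyllable_count < 0:
        raise ValueError("polysyllable count cannot be negative")
    return 3 + math.sqrt(polysyllable_count)
```

A sample with 49 polysyllabic words, for example, yields 3 + 7, a SMOG grade of 10.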

The Foley Lardner report has a much higher total score than the other five reports as its estimated grade level is the highest on all three measures. The Dykema Gossett report, by contrast, aims for a less sophisticated audience.


Word cloud plots to summarize key terms in survey responses

We can see in Clifford Chance Crossborder 2012 [pg. 22] an example of a word cloud plot. A word cloud presents text only. The size of each word corresponds to its relative frequency. The configuration and location of the words have no meaning, but generally the largest words, the most frequent, sit toward the middle. Nor does the color scheme convey information, except that terms of the same frequency have the same color. Howes Percival Social 2018 [pg. 9] provides another example. Before you can produce a word cloud you have to do a fair amount of massaging the text, such as dropping unimportant words, lower-casing words, and (often) stemming words.
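The text massaging described above can be sketched in Python as follows. This is only an illustration under our own assumptions: the stop-word list is a tiny hypothetical sample, and the suffix-stripping function is a crude stand-in for a real stemmer such as Porter's. The resulting frequencies are what a word cloud tool would translate into word sizes.

```python
import re
from collections import Counter

# Tiny, hypothetical stop-word list; real lists run to hundreds of words.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "was", "were", "by"}


def crude_stem(word: str) -> str:
    """Strip a few common suffixes; a rough stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def term_frequencies(text: str) -> Counter:
    """Lower-case the text, drop stop words, stem, and count the terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    kept = [crude_stem(token) for token in tokens if token not in STOPWORDS]
    return Counter(kept)
```

Calling `term_frequencies(...).most_common(20)` on a set of responses would give the twenty terms that would appear largest in the cloud.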

Finally, in part because word clouds rarely appear in survey reports, we created one from all the methodology descriptions we extracted from various surveys.

We also prepared a plot to show how many characters are in the various methodology descriptions.


Interviews create or supplement surveys by law firms

Most people who follow surveys by law firms assume that the firms collect their data with an online questionnaire. Interviews by telephone, it turns out, play a significant role in survey data collection. In fact, a small number of reports indicate that the law firm, or an organization it commissioned, only conducted interviews to collect data. They did not use an online survey tool.

For example, Allen Overy Innovative 2012 [pg. 5] only used interviews: “Interviews tended to last for about an hour and followed a structured questionnaire.” During a structured interview, the person doing the interview follows a careful script of questions. The script assures that they stick to the same order and wording, and that they collect the same information even if multiple people carry out the interviews. In a way, an online survey questionnaire is a structured interview — but silent.

Ashurst GreekNPL 2017 [pg. 2] also collected its information solely through personal interviews (50 of them) instead of an online survey.

Proskauer Rose Empl 2016 [pg. 3] combined an online questionnaire and phone interviews of 100 people.

A variation on the previous method appears in Allen Overy Models 2014 [pg. 2]. That firm deployed two levels of interviews: “The views of 185 individuals were captured through 20-minute structured telephone conversations. A further 13 individuals participated in a longer in-depth interview.”

Sometimes firms compile their data from an online questionnaire, but then turn to selected interviews to gain depth and color. One example is Herbert Smith CorpDebt 2016 [pg. 2], which followed up with some participants to discuss the survey results. The same one-two punch methodology was employed in Reed Smith LondonWomen 2018 [pg. 22], except that the firm went back to several participants who opted in. Pinsent Masons Energy 2017 [pg. 5] explains that “The survey included a combination of qualitative and quantitative questions, and all interviews were conducted over the telephone by appointment.”

CMS GC 2017 [pg. 25] pulled off a three-step information gathering, as explained in the snippet below. That firm conducted two surveys plus a series of interviews.

The ratio of interviews to online-survey participants varies widely and cannot always be determined from the survey report. White Case Arbitration 2010 [pg. 3] explains that its data comes from 136 questionnaires and 67 interviews, approximately a two-to-one ratio.

Text alignment on survey reports of law firms

By far the most common justification of type is flush left. One example will suffice, from Ashurst GreekNPL 2017 [pg. 4]. The alignment on the left is straight, flush left; the alignment on the right is jagged and therefore so is the margin.

ENSafrica Bribery 2016 [pg. 20] justifies the type on both sides.

Type justified on both sides is not the same as centered type. An example of the latter, centered type, comes from DWF Food 2018 on its cover.

For a change of pace in text alignment, Carlton Fields CA 2015 [pg. 2] arrayed its left-side text on a diagonal. Here is a snippet of the Table of Contents; later blocks of text [e.g., pg. 5] show the same alignment.

Published reports from law-firm surveys (compared to unpublished)

Most law firms that go through the effort to collect online survey data proceed to publish their results in a report. Almost always those reports are available on the firm’s website in PDF format. Out of the 349 surveys currently collected, 65% (227 surveys) are available online in PDF format. In a few instances, however, this blogger obtained the reports directly from the law firm, and it is possible that they are not available to the public on the firm’s website.

Another 22% of the surveys (78) are evidenced by a Word file created by the author that captures a press release or some other reference to a survey (in computer-speak, PDF = FALSE). Finally, 13% are deemed “Missing”: this author knows about the survey, perhaps from a statement in an extant survey, but not even a Word file memorializes it. The survey report is missing in action.
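The three shares can be checked with a few lines of Python. The counts of 349, 227, and 78 come from the text; the count of “Missing” surveys is not stated and is derived here by subtraction, which is our assumption about how the categories partition the data set.

```python
total_surveys = 349
pdf_reports = 227
word_files = 78
# Derived by subtraction; the text gives only the percentage for this group.
missing = total_surveys - pdf_reports - word_files


def share(count: int, total: int) -> int:
    """Return the share of the total as a whole-number percentage."""
    return round(100 * count / total)


shares = {
    "PDF": share(pdf_reports, total_surveys),
    "Word file": share(word_files, total_surveys),
    "Missing": share(missing, total_surveys),
}
```

On these figures the derived “Missing” count is 44, and the rounded shares come to 65%, 22%, and 13%.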

For each column year in the plot, the light, yellow segment at the bottom conveys the number of surveys obtainable in PDF format (PDF = TRUE). The tiny green slivers in the middle represent the number of “Missing” surveys, and the remaining dark, purple segment at the top of each column represents the number of Word files.

It is likely that some of the missing and non-PDF surveys are in fact available in PDF format, but the arduous task of tracking them down and confirming that likelihood has not been completed. Also, we should note, in the last few years we have seen some survey reports published other than in PDF format. Firms have used new graphical-presentation software to create their research reports.

Average number of participants in surveys by law firms

Of the data set so far, we know the number of participants in 114 of the surveys. Three challenges have prevented knowing the participant numbers of the other 235 identified surveys. First, when no report has been published (or located by us), we often can’t know participant numbers from a press release or a reference to the survey in an article. Second, even when we have located a PDF of a published report, sometimes it does not provide that crucial fact of methodology. Third, we have not taken the time to extract from all the existing PDF reports their numbers of participants.

Extraordinarily, Osborne Clark Consumer 2018 obtained 16,000 participants. For this analysis and the associated plot, we have dropped that survey from the data set because otherwise its incredibly large response would skew the results.
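A small Python sketch shows why a single very large survey distorts an average. The five participant counts are hypothetical, invented only for illustration; the figure of 16,000 is the one reported above.

```python
from statistics import mean, median

# Hypothetical participant counts, not taken from the data set.
typical_surveys = [40, 110, 200, 320, 480]
with_outlier = typical_surveys + [16000]

mean_without = mean(typical_surveys)
mean_with = mean(with_outlier)
median_without = median(typical_surveys)
median_with = median(with_outlier)
```

In this made-up sample the mean jumps from 230 to more than 2,800 once the outlier is included, while the median moves only from 200 to 260. That sensitivity is the reason for setting such a survey aside before averaging.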

The number of surveys for which participant numbers are available is low in the early years, but in the last decade the average number of participants hovers around the 250 mark. Over this entire set, the median number of participants is 210.

The plot with goldenrod columns divides all of the surveys for which we have participant data into 10 ranges. The ranges are roughly equal in that each contains approximately the same number of surveys, but the widths of the ranges themselves vary. To pick one for explanatory purposes, in the range on the far left a dozen surveys each collected data from at least 20 but fewer than 69 participants.
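Ranges of this kind, each holding about the same number of surveys, are often called deciles or equal-frequency bins. The following is a minimal sketch of one way to compute them with Python's standard library; it is not necessarily how the plot was produced, and the function name is our own.

```python
from bisect import bisect_right
from collections import Counter
from statistics import quantiles


def decile_bins(values: list[float]) -> tuple[list[float], list[int]]:
    """Split values into 10 equal-frequency ranges.

    Returns the nine cut points between ranges and the number of
    values that fall in each of the ten ranges.
    """
    cuts = quantiles(values, n=10)
    counts = Counter(bisect_right(cuts, value) for value in values)
    return cuts, [counts.get(i, 0) for i in range(10)]
```

Applied to 120 hypothetical, right-skewed participant counts such as `[20 + i * i for i in range(120)]`, each range holds a dozen surveys, but the ranges grow much wider toward the high end.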