Reporting actual data is better than categories

Let’s give some thought to answer choices that ask respondents to check a range when the surveyor wants to report averages or medians. The double-plot image below illustrates the point. It comes from Davies Ward Phillips & Vineberg, “2010 In-House Counsel Barometer” at 9 (2010).

Considering the plot on the left, a reader should assume that the survey question was as stated at the bottom of the graphic (“Question: How long have you been a lawyer?”) and that four selections were available for respondents to check. They were, one assumes, “<5 years”, “5-9 years”, “10-19 years” and “20+ years”. Hence the pie chart has a slice for each of those four bins.

If those assumptions are right, however, the firm could not have stated above the plot that “On average, in-house counsel have practiced law for 16.3 years …”. When a survey collects information in ranges, no analyst can calculate averages from that form of aggregated data. If 17% of the respondents have practiced law between five and nine years, it is not possible to calculate an average even for that single range, let alone for all four categories. So Davies Ward must have asked for a numeric answer on the questionnaire and created the four bins afterwards.
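To make the arithmetic concrete, here is a minimal Python sketch with made-up respondent counts (the actual counts per bin are not in the report, and the upper bound of the open-ended “20+ years” bin is itself a guess). It shows that binned answers only bracket the possible average; they cannot produce a single figure such as 16.3 years.

```python
# A minimal sketch with hypothetical counts: each bin only bounds its
# members' values, so the overall mean can be bracketed, never computed.
bins = {                  # (low, high) years of practice -> hypothetical count
    (0, 4): 20,           # "<5 years"
    (5, 9): 17,           # "5-9 years"
    (10, 19): 30,         # "10-19 years"
    (20, 40): 33,         # "20+ years" (upper bound assumed for illustration)
}

total = sum(bins.values())
lowest = sum(low * n for (low, high), n in bins.items()) / total
highest = sum(high * n for (low, high), n in bins.items()) / total

# The true average lies somewhere between lowest and highest; the bins
# alone cannot pin it down to one number.
print(f"the average must lie between {lowest:.1f} and {highest:.1f} years")
```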

Why didn’t the firm share the more detailed information? After all, when analysts bin data, they jettison information. Furthermore, subjectivity creeps in when someone allocates data to categories, whether on the questionnaire or after the fact.

It would have been better to create a scatter plot and thereby show all the data points. That way, readers can draw their own conclusions about the pattern of the distribution.

Sometimes surveyors have concerns that individual points on a scatter plot could be tied to a specific respondent (such as the longest-practicing or highest-paid lawyer). But analysts can sidestep such concerns with a box plot, which still tells readers more than percentages in bins do.
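As a sketch of that alternative, the snippet below draws a box plot from invented years-in-practice data. It conveys the median, quartiles, and overall spread without plotting any point that could be traced to one respondent (suppressing outliers helps further).

```python
# Box-plot sketch with hypothetical data: summary statistics, no
# individually identifiable points.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
years_in_practice = rng.normal(16, 7, size=120).clip(1, 40)  # invented responses

fig, ax = plt.subplots(figsize=(6, 2))
ax.boxplot(years_in_practice, vert=False, showfliers=False)  # hide outliers so no one is singled out
ax.set_xlabel("Years practicing law")
ax.set_yticks([])
fig.tight_layout()
plt.show()
```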

Research surveys by the ten largest law firms

My initial data set of law-firm research surveys developed serendipitously. As I gathered legal industry surveys over the past couple of years, I found several that were sponsored by law firms. Once I started to analyze that set, it occurred to me to look at the largest firms in the world.

According to the American Lawyer and in order of declining numbers of lawyers, the ten most gargantuan firms are Baker McKenzie, DLA Piper, Norton Rose Fulbright, Hogan Lovells, Jones Day, Latham & Watkins, White & Case, Greenberg Traurig, Morgan Lewis, and Sidley Austin. I searched on Google for research surveys sponsored by each of them using the simple term of the first two names of the firm plus the word “survey,” e.g., “Baker McKenzie survey”. I then read over the first five or six pages returned by Google and did my best to spot research surveys.

One could certainly shoot holes in this methodology. I should also point out that I treated a series of surveys hosted by a firm over several years as a single survey. Nor did I include compilations of laws or regulations by any of the firms, which some law firms call surveys. It might also be that terms like “poll” or “straw vote” or “questionnaire” would have uncovered other examples.

For several of the firms I already had at least one survey, and I combined what I had with what I found online. The plot below shows the results of my poking around online plus the preexisting surveys. It plots the number of research surveys found per thousand lawyers of the firm. The per-thousand-lawyers standardization accounts for the likelihood that firms with more lawyers produce more research surveys. With this standardization, a 2,000-lawyer firm with two surveys has outproduced a 4,000-lawyer firm with three surveys on a surveys-per-lawyer basis.
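A tiny Python sketch of that standardization, using the two hypothetical firms from the example above (the names and figures are illustrative, not from the data set):

```python
# Surveys per 1,000 lawyers for two hypothetical firms.
firms = {
    "Firm A": {"lawyers": 2000, "surveys": 2},
    "Firm B": {"lawyers": 4000, "surveys": 3},
}

for name, f in firms.items():
    rate = f["surveys"] / (f["lawyers"] / 1000)  # surveys per 1,000 lawyers
    print(f"{name}: {rate:.2f} surveys per 1,000 lawyers")

# Firm A comes out at 1.00 and Firm B at 0.75, so the smaller firm
# out-produces the larger one on a per-lawyer basis.
```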

My searches on the law firms at the lower right (Latham & Watkins, Jones Day, and Greenberg Traurig) turned up no research surveys. If any reader knows of research surveys by the ten largest firms, or by any other law firm, I would appreciate hearing from you about them.

Ranking law firms on disclosure of four demographic attributes

The reports at hand each deal in their own way with the four basic demographic attributes (position of respondent, industry, geography, and revenue). We can visualize the relative levels of disclosure by applying a simple technique.

The technique starts with the category assigned to each law firm for a given demographic attribute. For example, we categorized the firms on how they disclosed the positions of respondents with four shorthand descriptions: “No position information”, “Text no numbers”, “Some numbers”, and “Breakdown and percents”. It’s simple to convert each description to a number, in this example with one standing for “No position information” up to four standing for “Breakdown and percents.” The same conversion of text description to an integer was done for the other three demographic attributes, where each time a higher number indicates fuller disclosure of that attribute in the report.

Adding the four integers creates a total score for each firm. The plot below shows the distribution of those total scores by firm.

The firm that did the best on this method of assessment totaled 15 points out of a maximum possible of 15 (four points on each of three attributes plus three points on the one attribute that had only three levels). At the other end, one firm earned the lowest score possible on each of the four attributes, and thus a total score of four. [Another plot could break up the bar of each firm into the four segments that correspond to the four demographic attributes.]
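A short sketch of the scoring technique follows. The position labels come from the description above; the scores for the other three attributes and the example firm are invented purely for illustration.

```python
# Map each disclosure description to an integer, then sum across the
# four demographic attributes to get a firm's total score.
position_scale = {
    "No position information": 1,
    "Text no numbers": 2,
    "Some numbers": 3,
    "Breakdown and percents": 4,
}

# One hypothetical firm's ratings, already converted to integers
# (three attributes scored out of 4, one out of 3).
firm = {
    "position": position_scale["Some numbers"],  # 3
    "industry": 4,
    "geography": 2,
    "revenue": 1,
}

total_score = sum(firm.values())  # here 10, out of a maximum of 15
print(total_score)
```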

Our hope is that someday every law-firm research survey will disclose in its report breakdowns by these fundamental attributes together with the percentage of respondents in each. By then, perhaps another level of demographic disclosure will raise the bar yet again.

Disclosure of respondents’ revenue through multiple choice questions

In comparison to the demographic attributes reviewed so far (i.e., the disclosure and explanation of respondents’ positions, geographies, and industries), respondent revenue turns out to be not only the least elaborated but also the least standardized. This relatively poor showing may have happened because respondents didn’t know or didn’t want to disclose their organization’s revenue, so the surveying law firm felt the data it collected was too fragmented. It might also be that the firms did not think corporate revenue would make a systematic difference in the answers given or aid in the analysis of the data. On the darker side of interpreting the poor showing of revenue categories and percentages, it might be that the firms sensed that their mix of participants displayed unimpressive revenue.

In any event, my examination of 16 survey reports found that three categories cover the variability of disclosure.

Clear and full breakdown: A trio of law firm reports helps readers gauge the annual turnover of the survey respondents’ organizations and its distribution by breaking revenue out into three to six ranges. Across the three firms, the ranges started at less than $500,000 and went up to more than $20 billion. Of the fifteen different ranges used, only one of them — $5 billion to $10 billion — appeared more than once. For each range, these three firms included the percentage of respondents whose revenue fell within it.

Some facts but incomplete breakdown: Six firms stated something about revenue in their report, but unlike the three firms described above they did not provide a full breakout with ranges or percentages. For example, one firm wrote ‘Almost half of the survey respondents work for businesses with annual revenues of $1 billion or more’ and in a footnote added ‘The average respondent in this data set has revenue of $750 million.’ Plots in the report show the firm recognized five revenue categories: Less than $50M, $50M-$500M, $500M-$1BN, $1BN-$6BN, and Over $6BN. Another firm offered, unhelpfully, that the companies represented ‘were of a variety of sizes’ and then broke them out by market capitalization (Large cap at 23% [more than $4 billion in market capitalization], mid cap at 21% [$1 billion to $4 billion] and small cap [less than $1 billion]). Two more instances: ‘Survey participants’ companies had an average annual revenue of $18.2 billion and median annual revenue of $4.7 billion’ and ‘A majority of companies (82) had revenues of Euro 1 billion or more.’

No facts about revenue: Disappointingly, the seven remaining reports provided no information whatsoever about the annual revenue of their respondents’ organizations. It is possible, to be sure, that corporate revenue has no bearing on the findings produced by the survey and summarized in the report. But that seems to me unlikely to be true.

The pie chart below visualizes the three categories described above.

Disclosure of participant industries varies widely

As we did with positions of respondents and geographic locations of respondents, we pored over 18 law-firm reports that came from their research surveys. 1

For this review, we focused on the demographics disclosed about participants’ industries (sometimes referred to as ‘sectors’).

Four classes of disclosure describe the data set.

  1. Several surveys chose not to share industry data, perhaps because the questionnaire did not collect the information; in one case the survey collected the data but the report did not include a breakdown.
  2. Other surveys described the industries covered in the study with text only. For example, one wrote that its report covers “industries including consumer discretionary, consumer staples, energy, financials, health care, industrials, information technology, materials, real estate, telecommunication services, and utilities.”
  3. A handful of firms categorized their participants’ industries more clearly, with text and some sense of the numbers.
  4. The best reports told readers what industries their participants operated in and what percentage of them were in each of those industries. Leading this group of fullest disclosers were Berwin Leighton Paisner, with 10 industries plus percentages, and Hogan Lovells, with 18 industries!

Here is a line plot that shows how many reports fall into each of the four classifications.

Firms choose a mix of standardized industry names — albeit with tweaks of spelling, abbreviations and punctuation — and idiosyncratic industry names. Industries named specifically (verbatim from the reports) include Automotive, Banking, Business services, Construction, Energy, Energy, Energy/Utilities, Engineering, Financial institutions, Financial institutions, Financial Services, Financial Services (twice), Food/farming/fish, Health Care, Healthcare, Independent Producers, Infrastructure, mining and commodities, Insurance, Investment, IT, Media & Telecomms, Large financials, Legal services, Life sciences and healthcare, Manufacturing (four times), Other (twice), Pharma/bioscience, Pharma/Life Sciences, Private equity, Professional services, Public services, Real estate, Resources/mining, Retail and Leisure, Retail/wholesale, Technology and innovation, Trans/logst/dist, Transport (twice), and Transport and Logistics.

Surely there could be a standard industry breakdown that would accommodate most research surveys by law firms! With a common naming convention, readers could better draw on data from across multiple surveys.
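To suggest what such a convention could look like in practice, here is a small Python sketch that maps a few of the verbatim labels above to a canonical category before combining data across surveys. The mapping is illustrative only, not an actual standard.

```python
# Illustrative normalization of verbatim industry labels to canonical names.
CANONICAL = {
    "Health Care": "Healthcare",
    "Life sciences and healthcare": "Healthcare",
    "Financial institutions": "Financial services",
    "Financial Services": "Financial services",
    "Energy/Utilities": "Energy",
    "Transport and Logistics": "Transport and logistics",
    "Transport": "Transport and logistics",
}

def normalize(label: str) -> str:
    """Return the canonical industry name, or the label unchanged if unmapped."""
    return CANONICAL.get(label.strip(), label.strip())

print(normalize("Health Care"))   # -> Healthcare
print(normalize("Automotive"))    # unmapped labels pass through unchanged
```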

Notes:

  1. Reports by Allen & Overy, Berwin Leighton Paisner, Carlton Fields, Davies Ward Phillips, Eversheds, Fish & Richardson, Goulston & Storrs, Haynes and Boone, Hogan Lovells, K&L Gates, Littler Mendelson, Norton Rose Fulbright, Proskauer Rose, Ropes & Gray, Seyfarth Shaw, White & Case, and Winston & Strawn.

Pages of research-survey report devoted to marketing

Once a law firm goes through the effort to design and conduct a survey, then analyze the data and prepare a report, management certainly hopes for a return on that investment. At the top of the list would be calls from prospective clients asking about the firm’s services related to the survey’s topic. Furthermore, the firm would like potential clients to think more favorably of the firm and its contribution to knowledge (the oft-used term, “thought leadership”). Other benefits of surveys come to mind, but this post is about an aspect of marketing: how much space the survey report devotes to touting the firm.

All the reports have a portion that is “About the Firm.” I estimated how much space those sections occupied using a notion of full-page equivalent (FPE). Usually, the description of the firm and its services takes a full page or two, which made it easy to count the FPE. Other firms devoted only part of a page to self-promotion, so I estimated the percentage of a full page that the section took up. I did not include forewords or quotes from partners, and I only counted pages that had some text about the firm (i.e., not cover pages or back covers that have the firm’s name).

The resulting data is in the plot below, which converts each of the 16 firms’ FPEs into a percentage of all the pages in the firm’s report.
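The calculation itself is simple division. The sketch below uses invented figures (the firm names, FPEs, and page counts are not from the data set) to show how an FPE becomes a percentage of report pages.

```python
# Full-page-equivalent (FPE) as a share of a report's pages, with
# hypothetical inputs.
reports = [
    {"firm": "Firm A", "about_firm_fpe": 1.5, "total_pages": 24},  # hypothetical
    {"firm": "Firm B", "about_firm_fpe": 0.5, "total_pages": 40},  # hypothetical
]

for r in reports:
    pct = 100 * r["about_firm_fpe"] / r["total_pages"]
    print(f'{r["firm"]}: {pct:.1f}% of pages devoted to the firm')
```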

 

With the exception of the firm at the top, most of the firms were relatively reticent in their self-descriptions. After all, they can at least be expected to include some contact information. If you assume some bare minimum of firm information is justified, then the length of the report largely determines the resulting percentage: shorter reports tend to have a higher percentage of pages devoted to the firm.

Standardize and quantify participants by region

Continuing in the same vein regarding multiple-choice questions and the standardization of some demographic categories, we looked at how the law firm research surveys 1 identified their participants by geographic region. As with positions (and as will be seen with industry sectors), both the completeness of disclosure and the categories used to describe regions were all over the map.

One laggard proffered no information at all about the geographic dispersion of its respondents. Six of the law firms stated (or the reader could infer) that they gathered responses from a single country and they identified that country. Four other firms made general statements in the text of their report about geographic coverage (e.g., Allen & Overy stated that they surveyed companies “around the world” that were “in 27 different countries”) but provided no breakdown in terms of absolute numbers or percentages.

In line with good survey practice, however, five firms broke their participants down by region and gave percentages. One firm’s report had two regions, two reports had four regions, one had five, and one had six.

Below is the information in the preceding paragraphs in graphical form.

As to the regions used to categorize participants, the 16 research-survey reports we examined produced a grand total of 16 different region names (coincidentally, the same as the number of reports), with very little standardization. The reports used these descriptions: “Africa”, “Americas”, “Asia”, “Asia Pacific”, “Canada”, “Continental Europe”, “EMEA”, “Europe”, “Latin America”, “Middle East and Africa”, “Non-US”, “North America”, “Oceana”, “Other”, “United Kingdom” (or “U.K.”), and “United States”.

The take-away from this follows the lessons previously learned. First, the legal industry and its data analytics would be stronger if there were a more standard way of naming the regions from which participants come. Second, law firms that conduct research surveys should identify the countries or regions, ideally using standard terminology, from which their participants came, as well as the percentage breakdown.

Notes:

  1. Reports by Allen & Overy, Berwin Leighton Paisner, Carlton Fields, Davies Ward Phillips, Eversheds, Goulston & Storrs, Haynes and Boone, Hogan Lovells, Littler Mendelson, Norton Rose Fulbright, Proskauer Rose, Ropes & Gray, Seyfarth Shaw, White & Case, and Winston & Strawn.

Need for standardized position descriptions

The laxity in describing the respondent sample by position and percentage in each position should trouble those who want to rely on law-firm research surveys. How much credibility does data have if you don’t know who provided that data? But even if a firm classifies its respondents by level, the review I carried out raises another troubling point: the position descriptions vary wildly.

More specifically, in my set of law firm research surveys 1, the firms used a grand total of 16 different descriptions for titles in law departments. The position “General Counsel” was most common but after it the terminology was all over the place.

These are the actual terms used for the legal positions, listed roughly in descending order of corporate level: Chief Legal Officer, General Counsel, Chief Legal Officer/General Counsel (or other head of in-house legal function), Region or Division General Counsel (or equivalent), Chief Legal or Associate/Deputy General Counsel, Deputy General Counsel, Assistant General Counsel, Associate/Deputy/Assistant General Counsel, direct reports to general counsel, senior lawyers, Head of litigation, Senior Counsel, in-house counsel, in-house attorneys/corporate counsel, in-house lawyer, and Counsel. Several of the reports also included an “Other” category for respondents in the law department whose position was not among the selections.

The methodology sections of these reports do not say whether the online questionnaire explained what was meant by the selections in the multiple-choice questions for positions, such as years of experience or scope of responsibility. Probably they simply listed some number of selections and asked respondents to pick one of them (possibly with “Other”) to answer a seemingly simple question: “What is your level, title or position?” The firm that gave a common title and then added “or its equivalent” showed thoughtfulness. It recognized that titles have proliferated in law departments, so the firm was using a label to try to convey the level of responsibility of the respondent.

As for titles of respondents outside the law department, those also vary widely. Quoting from the reports, they include CEO/Director, C-suite executives, senior-level executives, Senior management, Executives, Human Resources Professionals, Risk/compliance, other professionals, and other business contacts.

As with the law department positions, the firms had no standard set of positions to draw from, so they came up with their own categories. Going forward, it would help the legal industry and its movement toward more data analytics to have at least a core of standardized position terms for the law department respondents and client respondents to choose from.

Notes:

  1. Reports by Allen & Overy, Berwin Leighton Paisner, Carlton Fields, Davies Ward Phillips, Eversheds, Goulston & Storrs, Haynes and Boone, Hogan Lovells, Littler Mendelson, Norton Rose Fulbright, Proskauer Rose, Ropes & Gray, Seyfarth Shaw, White & Case, and Winston & Strawn.

Provide respondent numbers and positional breakdown

Readers of a law firm’s research-survey report 1 want from the start to assess the credibility of the findings. As that credibility rests in large measure on the quality of the survey’s respondents, readers would like to know how many people responded and what their positions were (aka levels, roles, or titles).

A review of the survey reports I have collected suggests that we can capture the methodological disclosures of respondent positions with five classifications. Here they are in order of increasing praiseworthiness.

  1. No information. Woefully, two of the reports lack data on both the number of respondents and their positions. How much stock can anyone put in findings from a black box?
  2. Total respondents. One report, by a very eminent law firm, disclosed only how many respondents its survey had collected, but said nothing about their positions.
  3. Total respondents and a broad position. As far as I could glean, five firms told how many people had participated (the number of respondents) but gave only the most general description of their positions. How useful is it to know they were “senior management”?
  4. Total respondents and some position breakdowns. Three firms went a step further and gave some breakdown of the respondents’ positions. One of those firms gave the percentage of total respondents in one broad category, but oddly provided no other quantitative data regarding positions.
  5. Total respondents and full percentage breakdown. A half dozen firms did well: they laid out how many respondents their survey had, broke them down by three to five positions, and provided the percentage of respondents in each position. These six firms win the coveted Golden Data Analyst Award: Berwin Leighton Paisner, Davies Ward Phillips, Littler Mendelson, Norton Rose Fulbright, White & Case, and Winston & Strawn.

This level of disclosure should be the minimum standard for all research surveys conducted by law firms. Tell your readers who provided data to you and give a clear, quantitative breakdown of the percentage at each position 2.

Notes:

  1. My set includes reports by Allen & Overy, Berwin Leighton Paisner, Carlton Fields, Davies Ward Phillips, Eversheds, Goulston & Storrs, Haynes and Boone, Hogan Lovells, Littler Mendelson, Norton Rose Fulbright, Proskauer Rose, Ropes & Gray, Seyfarth Shaw, White & Case, and Winston & Strawn.
  2. It is for another time to consider weighting the responses of people by their level of seniority.

Number of selections in multiple-choice questions

I wanted to know how many selections are typical in multiple-choice questions asked in law-firm research surveys 1. To start figuring that out, I picked the Davies Ward report from 2011.

Since I do not have the actual online survey distributed by any of the firms, and therefore can’t see which questions were multiple choice, I had to reverse engineer the Davies Ward report to make the best determination possible. To the firm’s huge credit, however, its report helps in that determination because it states the question asked on the survey at the bottom of the plot that displays the results for that question. Studying those plots and the questions asked, I identified 16 questions that were probably multiple-choice.

Seeing the findings visualized helps in understanding them.


Notes:

  1. My set includes reports by Allen & Overy, Berwin Leighton Paisner, Carlton Fields, Davies Ward Phillips, Eversheds, Goulston & Storrs, Haynes and Boone, Hogan Lovells, Littler Mendelson, Norton Rose Fulbright, Proskauer Rose, Ropes & Gray, Seyfarth Shaw, White & Case, and Winston & Strawn.