Advisable to use “Don’t know” or “NA” in multiple-choice questions

Well-crafted multiple-choice questions give respondents a way to say that they don’t know the answer or that no selection applies to their situation. The two non-answers differ: a respondent who does not know the answer (or, possibly, declines to give a known answer) could in principle remedy that, whereas a respondent cannot supplement an incomplete set of selections. Firms should not want people they have invited to take a survey to have to pick the least bad answer because their preferred answer is missing. As we have written before, firms should add an “Other” choice with a text box for elaboration.

From HoganLovells Cross-Border 2014 [pg. 19] comes an example of how a multiple-choice question accommodates respondents who don’t know the answer. Also, it shows how data from such a question might be reported in a polar graphic. Seven percent of the respondents did not know whether their company’s international contracts include arbitration procedures.

In the jargon of data analysts, a “Don’t know” is called item non-response: no answer is given to a particular survey item when at least one valid answer was given to some item by the same respondent, e.g., leaving an item on a questionnaire blank, or responding to some questions by saying, “I don’t know,” while providing a valid response to other questions.

Another survey, DLA Piper Compliance 2017 [pg. 15], used a “Does not apply” option. Almost one-third of the respondents checked it. It is conceivable that some respondents did not know the answer and resorted to “Does not apply” as the best of the three choices, although far from optimal.

One more example, this time from Fulbright Jaworski Lit 2009 [pg. 61]: one-fifth of those who took the survey indicated that they didn’t know the answer to the question reproduced at the top of the plot.

It is easy to include variations of the non-substantive selections described above. In fact, extrapolating from these three instances, firms probably should do so since significant numbers of respondents might pick them — on average almost one out of five in the above surveys.

Multiple-choice questions dominate the formats of questions asked

Having examined more than 100 reports published by law firms based on the surveys they sponsored, I suspected that more than three out of four of the questions asked fell into the category of multiple choice. Reluctant to confirm that hunch by laboriously categorizing every question in every survey, I asked my trusty R software to select five of the surveys at random.
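
For readers who want to reproduce that kind of draw, a minimal R sketch follows; the report list and the seed are placeholders, not the actual ones used.

```r
# Hypothetical stand-in for the full list of roughly 100 report titles.
survey_reports <- paste("Report", 1:100)

set.seed(42)                      # any seed; fixing it makes the draw reproducible
sample(survey_reports, size = 5)  # draw five reports at random, without replacement
```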

Sure enough, all five not only cleared the suspected threshold of 75% multiple-choice questions; in fact, every single question that could be identified from the five reports fell into that format! Bear in mind that we can’t be certain about all the questions asked on the surveys, but we can glean most of them from the reports. Confirming the counts would require the actual questionnaires.

Specifically, Seyfarth Shaw Future 2017 went eight for eight, Morrison Foerster MA 2014 was five out of five, and Berwin Leighton Arbvenue 2014 used multiple-choice questions for all of its at least 14 questions (it is difficult to figure out from the Berwin report exactly how many questions were on the survey). In Foley Lardner Telemedicine 2014, all 12 questions (including three demographic questions) were multiple choice; in Foley Lardner Cars 2017, all 16 questions (including two demographic questions) were multiple choice.

Of those 55 multiple-choice questions, a few presented binary choices, but most presented a list of four to seven selections to pick from. Likert scales appeared rarely, as illustrated in the plot below from Foley Lardner Cars 2017 [pg. 5]. The scale ranges from “Strongly Agree” to “Strongly Disagree.”

Morrison Foerster MA 2014 [pg. 4] also used a Likert scale in a question.

Multiple-choice questions that ask for a ranking can yield deeper insights

If you want to capture more information than simple multiple-choice questions allow, a ranking question might be best. For one of its questions, Berwin Leighton Risks (2014) [pg. 17] presented respondents with seven legal risks. The instructions told the respondents to rank the risks from 1 to 8 (where 1 was the most serious risk and 8 the least serious). [Right, 8 ranking choices for only 7 items!] Presumably no ties were allowed (which the survey software might have enforced).

The report’s plot extracted the distribution of rankings only for the two most serious rankings, 1 or 2. It appears that the plot tells us, for example, that 48 respondents ranked “Legislation/regulation” as a 1 or 2 legal risk (most serious). Other plots displayed the distributions of the 3 and 4 rankings and of the less serious rankings.
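
To make that kind of tally concrete, here is a small R sketch with invented rankings; the risk names, respondent count, and a clean 1-to-7 scale are assumptions for illustration, not the Berwin Leighton data.

```r
# Invented ranking data: each row is a respondent, each column a legal risk,
# and each cell holds that respondent's rank (1 = most serious, 7 = least).
set.seed(1)
risks <- c("Legislation/regulation", "Contract risk", "Cyber", "IP",
           "Fraud", "Employment", "Tax")
rankings <- t(replicate(50, sample(1:7)))   # 50 respondents, no ties
colnames(rankings) <- risks

# For each risk, count how many respondents ranked it 1 or 2.
top_two <- colSums(rankings <= 2)
sort(top_two, decreasing = TRUE)
```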

A ranking question, especially one with as many as seven elements to be compared to each other, burdens participants, because to answer it conscientiously they need to consider each element relative to all the others. As a surveyor, you can never completely rely on this degree of respondent carefulness.

But ranking questions can yield fruitful analytics. Rankings are far more insightful than “pick the most serious [or whatever criterion],” which tosses away nearly all comparative measures. Rankings are more precise than “pick all that are serious,” which surrenders most insights into relative seriousness. Yet the infrequency of ranking questions in the law-firm research survey world is striking. Findings would be much more robust if there were more ranking questions.

Some people believe that rankings are difficult to analyze and interpret. Berwin Leighton’s visualization technique, which presents different views of the aggregate rankings, belies that belief. Many other techniques exist to analyze and picture ranking responses.

A ranking question gives a sense of whether a respondent likes one answer choice more than another, but it doesn’t tell how much more. A question that asks respondents to allocate 100 percent among their choices not only ranks the choices but differentiates between them much more precisely than simple ranking. Proportional distribution questions, however, appear in law firm surveys even less often than ranking questions. In fact, we could not find one among the hundreds of plots we have examined. Perhaps the reason is that these questions are even more complicated to explain to survey participants.
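
A toy example shows why the 100-point format carries more information; the option names and allocations below are invented.

```r
# One respondent's hypothetical 100-point allocation across five options.
allocation <- c(Legislation = 40, Cyber = 30, Contract = 15, Fraud = 10, Tax = 5)

sum(allocation)      # constant-sum check: the points must total 100
rank(-allocation)    # the ordering, which is all a ranking question would capture
allocation["Legislation"] - allocation["Cyber"]   # the extra insight: how much more
```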

Challenges in choosing categories, e.g., for revenue demographics

When a law firm invites its contacts to take a survey, the people who accept probably form an irregular group in terms of the distribution of their corporate revenue. By the happenstance of self-selection and how the invitation list was compiled, their revenue can range from negligible to many billions of dollars. When the firm’s report describes the revenue characteristics of the group, the firm must decide what ranges of revenue to use.

The firm might slice the revenue categories so that each holds roughly the same number of participants. Doing this usually means that the largest category spans a wide range of revenue, such as “$3 billion and up,” whereas the smallest category tends to be narrow, such as “$0 to $100 million.” Such an imbalance of ranges results from the real-world distribution of companies by revenue: lots and lots of smaller companies and a scattering of huge ones (the distribution has a long right tail). Stated differently, the corporate revenue pyramid has a very, very broad base.

Alternatively, a law firm might set the revenue ranges at specific round break points, perhaps “$0-1 billion, $1-2 billion, $2-3 billion” and so on. The categories may make sense a priori, but binning revenue this way can put very uneven numbers of participants into one or more of the categories, depending on which categories are chosen, how narrow they are, and the vagaries of who responds.
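
The trade-off between the two approaches is easy to see in a short R sketch with simulated, long-tailed revenue figures; the distribution parameters and break points are illustrative assumptions, not survey data.

```r
# Simulated long-tailed revenue (in $ millions): many small companies, a few giants.
set.seed(7)
revenue <- rlnorm(200, meanlog = 5, sdlog = 2)

# Approach 1: equal-count bins (quartiles) -- balanced counts, very unequal widths.
table(cut(revenue, breaks = quantile(revenue, probs = seq(0, 1, 0.25)),
          include.lowest = TRUE))

# Approach 2: fixed, round break points -- tidy labels, very uneven counts.
table(cut(revenue, breaks = c(0, 1000, 2000, 3000, Inf),
          labels = c("$0-1B", "$1-2B", "$2-3B", "$3B+")))
```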

Davies Ward Barometer (2010) [pg. 10] explained the corporate revenue ranges of its respondents in words. These are unusual ranges, and the distribution skews toward remarkably small companies. Note from the last bullet that almost one out of three survey respondents “are not sure of their organization’s annual revenue.” Perhaps they do not want to disclose that revenue because they work for a privately held company. Or perhaps the organization has no “revenue” at all, only a budget allocation as a government agency.

With a third approach, a firm fits its revenue categories to its available data set so that the plots look attractive. You can guess when a firm has done this. Consider the plot below from DLA Piper’s compliance survey (2017) [pg. 26]. The largest companies in the first category reported less than $10 million in revenue; the next category included companies with up to ten times more revenue but about the same percentage of respondents; the third category again spanned companies with up to ten times more revenue, topping out at $1 billion, and again a similar percentage. Then comes a small category spanning a narrow $400 million range, followed by the two on the right with half the percentages of the left three. It appears that someone tried various revenue categories to find a combination that looks reasonably good in a graphic.

The fullest way to describe participants’ revenue is a scatter plot. From such a plot, which shows every data point, readers can draw their own conclusions about the distribution of revenue.
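
A minimal base-R sketch of such a plot follows, reusing the simulated revenue figures from the binning example above; real respondent data would replace them.

```r
# Simulated revenue (in $ millions), standing in for actual respondent data.
set.seed(7)
revenue <- rlnorm(200, meanlog = 5, sdlog = 2)

# Show every data point; jitter separates overlapping points and a log scale
# keeps the long right tail from squashing the smaller companies.
stripchart(revenue, method = "jitter", pch = 16, log = "x",
           xlab = "Annual revenue ($ millions, log scale)")
```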

Multiple-choice questions put a premium on simplicity and clarity

The best surveys present questions that participants understand immediately. Short, clear, and built from familiar words: that is the secret to reliable answers and to participants continuing on with the survey. Much more than fill-in-the-blank or give-your-own-answer questions, multiple-choice questions need to be simple and direct, because participants have to absorb the question first and then slog through some number of selections.

Some questions demand quite a bit from the participant. Consider the complexity of the question shown at the top of the image below, taken from Pinsent Masons TMT 2016 [pg. 20]. The person tackling that question had to juggle the broadness of “considerations,” bring to mind comprehensive knowledge of the company’s dispute resolution policy (recalling the meaning of “DR”), and apply both in the context of arbitrations. Even though this question handles a complex topic quite succinctly, the cognitive load on the participant piles up.

For question designers, a cardinal sin is the “and” or the “or.” When a conjunction joins two ideas, the result is a “double-barreled question,” in the evocative term from the textbook Empirical Methods in Law by Robert M. Lawless, Jennifer K. Robbennolt, and Thomas S. Ulen (Wolters Kluwer, 2nd ed. 2016) at 67: it asks whether X and Y are both true. What if X is true but not Y, or Y but not X? How does a respondent answer half of a conjunction?

Feel the cognitive schism of a conjunction from the question asked in Gowling WLG Protectionism (2017) [pg. 13]. Some participants might believe that their sector is aware of the risks of protectionist policies but hasn’t prepared how to respond to them (i.e., the sector is on notice but not ready to act). What is the right answer for those participants?

Alternatively (or disjunctively), when a question asks whether X or Y is true, the firm cannot disentangle X from Y at the analysis stage, since the two have been annealed together. X could be true and Y false, or the reverse.

We will close with one more example of both complexity and conjunction. The question at [pg. 14] confronted respondents with seven selections, several of which were complex and one of which included a conjunction [the fourth from the top, “Breakdown … and the rise …”]. As with the Gowling question, this selection might leave a participant in a bind if one part of the selection holds true but not the other.

Order of selections in multiple-choice questions

Since participants are expected to read all the selections of a multiple-choice question, the order in which you list them may seem of little moment. But the consequences of order can be momentous. Respondents might interpret the order as suggesting a priority or “correctness.” For example, if the choice that the firm expects to be picked most often stands first, that decision can influence the data in a self-fulfilling pattern. The firm thinks the selection is important (or, worse, would prefer to see it picked more often) and therefore puts it first, while respondents, supposing that privileged position to be meaningful, choose it.

Or participants may simply tire of evaluating a long list of selections and deciding which one or more to choose. They may unknowingly favor earlier choices so that they can declare victory and move on to the next question.

Let’s look at a question from the King & Spalding survey on claims professionals (2016) [pg. 15], not in any way to criticize the question but to illustrate the possibility of the skews described above.

We don’t know enough about claims professionals or lines of insurance to detect whether this selection order nudges respondents, but clearly the selections are not in alphabetical order. When selections do appear in alphabetical order, the reasonable assumption is that the firm chose a neutral ordering to avoid guiding respondents.

Another option for a firm is to prepare multiple versions of the survey. Each version changes the order of selections of the key multiple-choice question or questions. The firm sends those variants randomly to the people invited to take the survey. So long as the text of the selections remains the same, the software that compiles results will not care about variations in selection order.

A more sophisticated technique to eliminate the risk of framing relies on the survey software to present the selections in random order for each survey taker. In other words, the order in which person A sees the selections is randomly different from the order in which person B sees them.
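
Most survey platforms offer this randomization as a setting; the idea itself is nothing more than an independent shuffle per respondent, as in this R sketch (the selection text is invented).

```r
# The same selections, shuffled independently for each survey taker.
selections <- c("Property", "Casualty", "Professional liability", "Marine", "Other")

set.seed(3)
order_for_A <- sample(selections)   # order shown to person A
order_for_B <- sample(selections)   # order shown to person B (almost surely different)
order_for_A
order_for_B
```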

Published reports infrequently restate the exact question asked and never the arrangement of selections. All the reader has to go by is the data as reported in the text, table or graphic. Because the summary of the data usually starts with the most common selection and discusses the remaining results in declining order, the original arrangement of selections is not available.

For example, here is one multiple-choice question from Davies Ward Barometer (2010) [pg. 58]. At the top, the snippet provides the text of the report, which gives a clue to the question asked of respondents. Nothing gives a clue about the order of the selections on the survey itself.

As an aside, consider that this survey followed several prior surveys on the same topic. It is possible that the order of the selections reflects prior responses to a similar question. That would be a natural thing to do, but it would be a mistake for the reasons described above.

Priority of demographic attributes (the four most common)

Having studied more than 70 survey reports by law firms, I sensed that the demographic attributes recognized by the firms exhibit a fairly consistent priority. First, and thus most importantly, firms focus on respondent position, then the respondent company’s industry, business size, and location. That priority order for the four demographic characteristics makes sense.

The rank of the person completing the survey suggests their depth of knowledge of the topic. You want general counsel giving their views more than junior lawyers who have just joined the company. You seek C-suite executives, not assistant directors. The level or position also signals the ability of the firm to reach decision makers and persuade them that their time is well spent taking the survey. Implicitly, a high proportion of busy leaders says “This topic has significance.”

Industry (sector) comes next on the priority list because legal issues impinge on each industry differently. Also, readers of a survey report not only want to know that it speaks for companies in their industry; they also would like to see how the results differ industry by industry.

“Business size” is my term for the third-most-common demographic. The typical measure relies on the annual revenue of the company. Most surveys proudly state that they have a good number of large companies, as those companies are more prestigious (and are probably the targets of business development efforts by the firm). A less common measure of business size is the number of employees. For non-profits and government agencies revenue has less relevance (budget may be the better metric), but all organizations have employees. Still, that measure often gives less insight for profit-seeking organizations, as it can vary enormously across industries and indeed within them.

The fourth-most-common demographic regards the geography of respondent organizations: their country, region, or continent. Quite a few surveys, however, draw participants from only a single country and therefore ignore this demographic attribute. [We did spot one survey that broke out respondent data by states in Australia.]

We chose three surveys to spot test the relative importance they attach to their demographics.

  • The Dykema Gossett survey of merger and acquisition specialists (2017) gathered data on the position of its respondents, the sector in which their company operates, and their company’s revenue. The firm’s report did not attach numbers of participants or percentages to any of the demographic attributes but it described them in that order.
  • The Carlton Fields survey of class actions (2014) likewise summarized its participants by position, sector and revenue, in that order, but disclosed nothing further.
  • Of the spot-tested reports, by far the best handling of demographics comes from the Baker McKenzie cloud survey (2017). That report precisely states breakdowns by geography, position, industry sector, and number of employees. Even better, the report includes plots that visualize these attributes. Baker McKenzie described the position of individual respondents with seven choices of function (IT, Sales, Legal, etc.) but did not provide revenue data. In other respects, however, it commendably shared the demographics of its survey population. The order of presentation was geography, position, and business model. Interestingly, for geography the report uses a map to convey where its “top respondents” came from.

If we had full data on the treatment of demographic attributes by all the surveys available to us, our inductive sense of these priorities would be confirmed or overturned. Perhaps in another post. Meanwhile, note two points. First, which demographics are important depends on the purpose of the research. Second, the report ought to take advantage of the demographic data; to create analytic value, somewhere the report should break out the findings by demographic segments.

Techniques to reduce mistakes by respondents

What can a firm do to improve the likelihood that respondents answer multiple-choice questions correctly? The substance of their answer is known only to them, but some methodological trip-ups have solutions. To address the question, we can revisit the failure points that we presented above.

Reverse the scale. One step to identify a misreading asks a second question to confirm the first answer. So, if the first question asks for a “1” to indicate “wholly ineffective” on up to a “10” to indicate “highly effective,” a later question might present the choices and ask the respondent to pick the most effective one. If that choice did not get a high number (8, 9 or 10, probably) on the first question, you have spotted a potential scale reversal. If you decide to correct it, you can manually revise the ratings on the first question. A second remedy uses different terms for the poles, which might improve accuracy, although at a cost of some consistency and clarity. Thus, the scale might run from a “1” to indicate “wholly ineffective” on up to a “10” to indicate “highly productive.” Respondents are more likely to notice the change in wording and get the scale right.
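
A sketch of that consistency check in R follows; the ratings, the option names, and the cutoff of 8 are all invented for illustration.

```r
# Hypothetical responses: a 1-10 effectiveness rating for "Mediation" on the first
# question, and the single option each respondent later picked as most effective.
responses <- data.frame(
  rating_mediation = c(9, 2, 8, 10, 3),
  most_effective   = c("Mediation", "Mediation", "Arbitration", "Mediation", "Mediation")
)

# Flag a possible reversed scale: the respondent called mediation most effective
# yet rated it below 8 on the earlier 1-10 question.
responses$possible_reversal <- responses$most_effective == "Mediation" &
                               responses$rating_mediation < 8
responses
```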

Misread the question. Sometimes, next to the answer choices you can repeat the key word. Seeing the key word, such as “most inexpensive”, a respondent will catch his or her own misreading. As with scale reversals, here too a second question might confirm or call out an error. Alternatively, a firm might include a text box and ask the respondent to “briefly explain your reasoning.” That text might serve as a proof of proper reading of the question.

Misread selections. In addition to the remedies already discussed, another step available to a firm is to write the selections briefly, clearly, and phrased as positives. “Negotiate fixed fees”, therefore, improves on “Don’t enter into billing arrangements based on standard hourly rates.” Furthermore, don’t repeat phrases, which can make selections look similar to a participant who is moving fast. “Negotiate fixed fees” might cause a stumble if it is followed by “Negotiate fixed service.”

Misread instructions. The best solution relies on survey software that rejects everything except numbers; that validation screens out the undesirable additions. The downside is that participants can grow frustrated at error messages that do not tell them clearly the cause of their mistake: “Please enter numbers only, not anything else, such as letters or symbols like $.”
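
Survey platforms normally offer numeric validation as a built-in option; purely as a sketch of the rule itself (the function name and regular expression below are mine, not any platform’s API), it amounts to something like this.

```r
# Accept digits with an optional decimal point and nothing else: no "$",
# commas, letters, or other symbols.
validate_number <- function(x) {
  ok <- grepl("^[0-9]+(\\.[0-9]+)?$", trimws(x))
  if (!ok) message("Please enter numbers only, not anything else, such as letters or symbols like $.")
  ok
}

validate_number("250")       # TRUE
validate_number("$250,000")  # FALSE, with the explanatory message
```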

Fill in nonsense when answers are required. As mentioned, sophisticated software might detect anomalous selections, but that leads to dicey decisions about what to do. An easier solution is to keep the survey focused, restrict selections to likely choices (and thus fewer of them), and make them interesting. Sometimes surveys can put in a question or step that reminds participants to pay attention.

Give contradictory answers. Again, in hopes of trapping contradictions law firms can structure the question set to include confirmatory questions on key points. The drawback? A longer survey. Alternatively, some firms might email respondents and confirm that they meant to give answers that conflict with each other. Likewise, interviews after the survey comes back may smoke out corrections.

Become lazy. Keep the survey short, well-crafted, and as interesting as possible for the participant. Perhaps two-thirds of the way through, a firm could ‘bury’ an incentive button: “Click here to get a $15 gift certificate.” Or a progress bar displayed by the survey software can boost flagging attention (“I’m close, let’s do a good job to the end”).

Too quickly resort to “Other”. Despite the aspiration to achieve MECE (mutually exclusive, collectively exhaustive) selections, keep them short, few, and clear. Pretesting the question might suggest another selection or two. Additionally, a text box might reduce the adverse effects of promiscuous reliance on “Other”.

Irregular disclosure of demographics from one survey to the next in a series

How consistently do law firms track and disclose demographic data? Not very consistently, we found, based on three pairs of surveys conducted by different firms: Foley Lardner on Telemedicine in 2014 and the follow-up in 2017, Seyfarth Shaw on real estate in 2016 and the follow-up the year after, and Proskauer Rose on employment in 2016 and 2017. Before studying those survey pairs, I had thought that firms would stick pretty closely to the way they treated demographics in their first survey, perhaps modifying and improving them a little bit for the follow-on survey. Not true, not at all!

The plot below attempts to summarize how the second survey of each pair compares to the first with respect to demographics. Each bar has a segment for each of the five demographic attributes in the legend. A segment can be a 0 if the report does not include that attribute, a 1 if the report’s disclosure is minimal, and so on up to a 5 for a very good disclosure.

If a segment in the second column is higher than its counterpart in the first column, then the second survey improved on the first, perhaps going from a 3 to a 4. Typically, that would mean the second report had a breakdown with more categories or more information on percentages. For example, in 2014 Foley Lardner (“Foley First” on the bottom axis) reported five levels, with percentages, for each of respondent size (employees or revenue), organization type, and position of the respondent. In the second report three years later, even with 50 more participants than in the first year, the firm combined two of those levels (and gave the percentage) but provided no other information. Thus, the segment for level in its first column (the second from the top, in light blue) starts at a 5 but drops to a 1 in the firm’s second column (“Foley Second”). The first report did well on number of employees or revenue (red at the bottom), but that demographic information disappeared in the second survey report.

Taking another example, in its inaugural survey Proskauer Rose did not provide details about the locations of its respondents (as indicated by the absence of the light-yellow segment), but provided some information about that attribute in its subsequent survey report (the second segment from the top of the column labelled “Proskauer Second”).

Oddly, Foley & Lardner broke out three kinds of hospitals in its first year but combined them all in its second year. Two other categories of organization type matched, but two new ones appeared the second year.

The number at the bottom of each column tells how many participants that survey had. Note that all three firms saw significant increases in the number of respondents year over year, yet that growth did not translate into fuller or more consistent demographic reporting.

In other words, given the irregular disclosure of data about respondents on these five important attributes, it is difficult to know how well the two sets of respondents resemble each other.

Demographic data tailored to the survey

We have written extensively about demographic data and how law firms report it from their research. Some kinds of demographic data figure prominently and consistently in reports, such as what we have termed the Big 4 demographics. But some surveys explore topics that justify other, one-off demographics. We show six of them below as displayed in reports issued by six different law firms.


HoganLovells, researching foreign direct investment [pg. 74], shows in the plot on the left the 15 roles the firm asked about. On the right, White & Case’s research into arbitration collected demographic data about respondents’ legal background [pg. 52].

Moving from the two plots above to the two below, Davies Ward, interested in Canadian lawyers, sought answers on the years respondents had been practicing law [pg. 9]. Norton Rose Transport [pg. 2] turned to a donut plot to display 11 roles within four industries.

In the final two instances, in the left plot Winston Strawn looked at risk [pg. 31]. Its questionnaire asked not just where the respondent’s company was based but also where the respondent individually was based.

And Foley Lardner studied telemedicine [pg. 10] in the first of a series, starting in 2014, and drilled down on types of healthcare organizations.

Law firm research surveys might ask for all kinds of background and profile data that illuminate their findings. It is easy to think of examples, such as patents outstanding for research into intellectual property practices or the age of respondents for research into demographics.