Advisable to use “Don’t know” or “NA” in multiple-choice questions

Well-crafted multiple-choice questions give respondents a way to say that they don’t know the answer or that no selection applies to their situation. The two non-answers differ: ignorance of the answer (or, possibly, refusal to give a known answer) could in principle be remedied by the respondent, whereas an incomplete set of selections is something the respondent cannot supplement. Firms should not want people they have invited to take a survey to have to pick the least bad answer when their preferred answer is missing. As we have written before, firms should add an “Other” choice with a text box for elaboration.

From HoganLovells Cross-Border 2014 [pg. 19] comes an example of how a multiple-choice question accommodates respondents who don’t know the answer. Also, it shows how data from such a question might be reported in a polar graphic. Seven percent of the respondents did not know whether their company’s international contracts include arbitration procedures.

In the jargon of data analysts, a “Don’t know” is a form of item non-response: the respondent gives no valid answer to a particular survey item while giving at least one valid answer elsewhere on the questionnaire, for example by leaving an item blank or by responding “I don’t know” to some questions while answering others.

Another survey, DLA Piper Compliance 2017 [pg. 15], used a “Does not apply” option. Almost one-third of the respondents checked it. It is conceivable that some respondents did not know the answer and resorted to denying its applicability as the least bad of the three choices, although far from optimal.

One more example, this time from Fulbright Jaworski Lit 2009 [pg. 61]. Here, one-fifth of those who took the survey indicated that they didn’t know the answer to the question reproduced on top of the plot.

It is easy to include variations of the non-substantive selections described above. In fact, extrapolating from these three instances, firms probably should do so, since significant numbers of respondents might pick them: on average, almost one out of five respondents in the surveys above did.
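As a rough check on that average, a one-line calculation in R; the 32 percent stands in for DLA Piper’s “almost one-third” and is only an approximation.

```r
# Approximate shares of non-substantive answers reported in the three surveys:
# Hogan Lovells (7%), DLA Piper (~32%, "almost one-third"), Fulbright (20%)
rates <- c(hogan_lovells = 0.07, dla_piper = 0.32, fulbright_jaworski = 0.20)
mean(rates)  # roughly 0.20, i.e., almost one respondent in five
```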

Multiple-choice questions dominate the formats of questions asked

Having examined more than 100 reports published by law firms based on the surveys they sponsored, I suspected that more than three out of four questions asked on the surveys fell into the category of multiple choice. Reluctant to confirm that sense by laboriously trying to categorize all the questions in all those surveys, I invited my trusty R software to select five of the surveys at random.
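For the curious, the random draw needs nothing more than base R’s sample() function. A minimal sketch, in which the short vector of report names merely stands in for the full list:

```r
# A stand-in vector for the full list of 100+ survey reports
surveys <- c("Seyfarth Shaw Future 2017", "Morrison Foerster MA 2014",
             "Berwin Leighton Arbvenue 2014", "Foley Lardner Telemedicine 2014",
             "Foley Lardner Cars 2017", "Pinsent Masons TMT 2016")

set.seed(42)               # makes the draw reproducible
sample(surveys, size = 5)  # five reports chosen at random, without replacement
```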

Sure enough, all five not only exceeded my estimate of at least 75% of questions being multiple choice; every single question that could be identified from the five reports fell into that format! Bear in mind that we can’t be certain about all the questions asked on the surveys, but we can glean most of them from the reports. It would be necessary to count from the actual questionnaires to confirm these figures.

Specifically, Seyfarth Shaw Future 2017 went eight for eight, Morrison Foerster MA 2014 was five out of five, and Berwin Leighton Arbvenue 2014 used multiple-choice questions for all of its at least 14 questions (it is difficult to figure out from the Berwin report exactly how many questions were on the survey). In Foley Lardner Telemedicine 2014, all twelve questions (including three demographic questions) were multiple choice; with Foley Lardner Cars 2017, all 16 questions were multiple choice (including two demographic questions).

Of those 55 multiple choice questions, a few presented binary choices but most of them presented a list of 4-to-7 selections to pick from. Likert scales appeared rarely, as illustrated in the plot below from Foley Lardner Cars 2017 [pg. 5]. The scale ranges from “Strongly Agree” to “Strongly Disagree.”

Morrison Foerster MA 2014 [pg. 4] also used a Likert scale in a question.

Multiple-choice questions that ask for a ranking can yield deeper insights

If you want to capture more information than you can from simple multiple-choice questions, then a ranking question might be best for you. For one of its questions, Berwin Leighton Risks (2014) [pg. 17] presented respondents with seven legal risks. The instructions told the respondents to rank the risks from 1 to 8 (where 1 was the most serious risk and 8 the least serious). [Right, 8 ranking choices for only 7 items!] Presumably no ties were allowed (which the survey software might have enforced).

The report’s plot extracted the distribution of rankings only for the two most serious, 1 or 2. It appears that the plot tells us, for example, that 48 respondents ranked “Legislation/regulation” as a 1 or 2 legal risk (most serious). Other plots displayed the distribution of 3 and 4 rankings and less serious rankings.

A ranking question, especially one with as many as seven elements to be compared to each other, burdens participants, because to answer it conscientiously they need to consider each element relative to all the others. As a surveyor, you can never completely rely on this degree of respondent carefulness.

But ranking questions can yield fruitful analytics. Rankings are far more insightful than “pick the most serious [or whatever criterion],” which tosses away nearly all comparative measures. Rankings are more precise than “pick all that are serious,” which surrenders most insights into relative seriousness. Yet the infrequency of ranking questions in the law-firm research survey world is striking. Findings would be much more robust if there were more ranking questions.

Some people believe that rankings are difficult to analyze and interpret. The visualization technique of Berwin Leighton, which presents different views of the aggregate rankings, belies that belief. Many other techniques exist to analyze and picture ranking responses.
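To make that concrete, here is a minimal sketch in R of tallying aggregate rankings in the style of the Berwin Leighton plot. The risk labels and the rankings themselves are invented for illustration.

```r
# Invented data: each row is a respondent, each column a legal risk,
# each cell the rank given (1 = most serious ... 7 = least serious)
set.seed(1)
risks <- c("Legislation/regulation", "Contract risk", "Cyber", "IP",
           "Employment", "Fraud", "Tax")
ranks <- t(replicate(50, sample(1:7)))  # 50 respondents, each ranking all seven
colnames(ranks) <- risks

# How many respondents placed each risk in their top two (rank 1 or 2)?
top2 <- colSums(ranks <= 2)
sort(top2, decreasing = TRUE)
```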

A ranking question gives a sense of whether a respondent likes one answer choice more than another, but it doesn’t tell how much more. A question that asks respondents to allocate 100 percent among their choices not only ranks the choices but differentiates between them much more precisely than simple ranking. Proportional distribution questions, however, appear in law firm surveys even less often than ranking questions do. In fact, we could not find one among the hundreds of plots we have examined. Perhaps the reason is that these questions are even more complicated to explain to survey participants.

Multiple-choice questions put a premium on simplicity and clarity

The best surveys present questions that participants understand immediately. Short, clear, and built from familiar words: that’s the secret to reliable answers and to participants continuing on with the survey. Much more than fill-in-the-blank or give-your-answer questions, multiple-choice questions need to be simple and direct because participants have to absorb the question first and then slog through some number of selections.

Some questions demand quite a bit from the participant. What is the complexity level of the question shown at the top of the image below, taken from Pinsent Masons TMT 2016 [pg. 20]? The person tackling that question had to juggle the broadness of “considerations,” bring to mind comprehensive knowledge of the company’s dispute resolution policy (recalling the meaning of “DR”), and apply both in the context of arbitrations. Even though this question handles a complex topic quite succinctly, the cognitive load on the participant piles up.

For question designers, a cardinal sin is to include “ands” or “ors.” When a conjunction joins two ideas, the result is a “double-barreled question,” in the evocative term of the textbook Empirical Methods in Law by Robert M. Lawless, Jennifer K. Robbennolt, and Thomas S. Ulen (Wolters Kluwer, 2nd ed. 2016) at 67: the question asks whether X and Y are both true. What if X is true but not Y, or Y but not X? How does a respondent answer half of a conjunction?

Feel the cognitive schism of a conjunction from the question asked in Gowling WLG Protectionism (2017) [pg. 13]. Some participants might believe that their sector is aware of the risks of protectionist policies but hasn’t prepared how to respond to them (i.e., the sector is on notice but not ready to act). What is the right answer for those participants?

Alternatively (or disjunctively), when a question asks whether X or Y is true, the firm can’t disentangle X from Y once the analysis step arrives, since the two have been annealed. X could be true and Y could be false, or the reverse.

We will close with one more example of both complexity and conjunction. The question below [pg. 14] confronted respondents with seven selections, several of which were complex and one of which included a conjunction [the fourth from the top, “Breakdown … and the rise …”]. As with the Gowling question, this selection might leave a participant in a bind if one part of the selection holds true but not both parts.

Order of selections in multiple-choice questions

Since participants are expected to read all the selections of a multiple-choice question, the order in which you list them may seem of little moment. But the consequences of order can be momentous. Respondents might interpret the order as suggesting a priority or “correctness.” For example, if the choice that the firm thinks will be chosen most commonly stands first, that decision will influence the data in a self-fulfilling pattern. The firm thinks it’s important (or, worse, would prefer to see more of that selection picked) and therefore puts it first, while respondents suppose that privileging to be true and choose it.

Or participants may simply tire of evaluating a long list of selections and deciding which one or more to choose. They may unknowingly favor earlier choices so that they can declare victory and move on to the next question.

Let’s look at a question from the King & Spalding survey on claims professionals (2016) [pg. 15], not in any way to criticize the question but to illustrate the possibility of the skews described above.

We don’t know enough about claims professionals or lines of insurance to detect whether this selection order nudges respondents, but clearly the selections are not in alphabetical order. When selections appear in alphabetical order, the assumption is that the firm chose a neutral ordering and thereby avoided guiding respondents.

Another option for a firm is to prepare multiple versions of the survey. Each version changes the order of selections of the key multiple-choice question or questions. The firm sends those variants randomly to the people invited to take the survey. So long as the text of the selections remains the same, the software that compiles results will not care about variations in selection order.

A more sophisticated technique to eliminate the risk of framing relies on the survey software to present the selections in random order for each survey taker. In other words, the order in which person A sees the selections is randomly different than the order in which person B sees the selections.
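Survey platforms typically offer per-respondent randomization as a built-in setting, but the underlying idea is just an independent shuffle for each survey taker. A minimal sketch in R, with invented selection labels:

```r
selections <- c("Negotiate fixed fees", "Use blended rates", "Cap total fees",
                "Tie fees to outcomes", "Other")

# Shuffle the selections independently for each (hypothetical) respondent
show_order <- function(choices) sample(choices)

set.seed(7)
show_order(selections)  # the order person A sees
show_order(selections)  # the order person B sees, shuffled independently
```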

Published reports infrequently restate the exact question asked and never the arrangement of selections. All the reader has to go by is the data as reported in the text, table or graphic. Because the summary of the data usually starts with the most common selection and discusses the remaining results in declining order, the original arrangement of selections is not available.

For example, here is one multiple-choice question from Davies Ward Barometer (2010) [pg. 58]. At the top, the snippet provides the text of the report which gives a clue to the question asked of respondents. Nothing gives a clue about the order of the selections on the survey itself.

As an aside, consider that this survey followed several prior surveys on the same topic. It is possible that the order of the selections reflects prior responses to a similar question. That would be a natural thing to do, but it would be a mistake for the reasons described above.

Priority of demographic attributes (the four most common)

Having studied more than 70 survey reports by law firms, I sensed that the demographic attributes recognized by the firms exhibit a fairly consistent priority. First, and thus most importantly, firms focus on respondent position, then the respondent company’s industry, business size, and location. That priority order for the four demographic characteristics makes sense.

The rank of the person completing the survey suggests their depth of knowledge of the topic. You want general counsel giving their views more than junior lawyers who have just joined the company. You seek C-suite executives, not assistant directors. The level or position also signals the ability of the firm to reach decision makers and persuade them that their time is well spent taking the survey. Implicitly, a high proportion of busy leaders says “This topic has significance.”

Industry (sector) comes next on the priority list because legal issues impinge on each industry differently. Also, readers of a survey report not only want to know that it speaks for companies in their industry but would also like to see how the results differ industry by industry.

“Business size” is my term for the third-most-common demographic. The typical measure relies on the annual revenue of the company. Most surveys proudly state that they have a good number of large companies, as those companies are more prestigious (and are probably the targets of business development efforts by the firm). A less common measure of business size is the number of employees. For non-profits and government agencies revenue has less relevance (budget may be the better metric), but all organizations have employees. Still, that measure often gives less insight for profit-seeking organizations, as it can vary enormously across industries and indeed within industries.

The fourth-most-common demographic regards the geography of respondent organizations, whether country, region, or continent. Quite a few surveys, however, draw participants from only a single country and therefore ignore this demographic attribute. [We did spot one survey that broke out respondent data by states in Australia.]

We chose three surveys to spot test the relative importance they attach to their demographics.

  • The Dykema Gossett survey of merger and acquisition specialists (2017) gathered data on the position of its respondents, the sector in which their company operates, and their company’s revenue. The firm’s report did not attach numbers of participants or percentages to any of the demographic attributes but it described them in that order.
  • The Carlton Fields survey of class actions (2014) likewise summarized its participants by position, sector and revenue, in that order, but disclosed nothing further.
  • Of the spot-tested reports, by far the best handling of demographics comes from the Baker McKenzie cloud survey (2017). That report precisely states breakdowns by geography, position, industry sector, and number of employees. Even better, the report includes plots that visualize these attribute details. Baker McKenzie described the position of individual respondents with seven choices of functions (IT, Sales, Legal, etc.), but the firm did not provide revenue data. In other respects, however, it commendably shared the demographics of its survey population. The order of presentation was geography, position, and business model. Interestingly, for geography the report used a map to convey where its “top respondents” came from.

If we had full data on the treatment of demographic attributes by all the surveys available to us, our inductive sense of these priorities would be confirmed or overturned. Perhaps in another post. Meanwhile, note two points. First, which demographics are important depends on the purpose of the research. Second, the report ought to take advantage of the demographic data; to create analytic value, somewhere the report should break out the findings by demographic segments.
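Breaking out findings by segment is straightforward once the responses sit in a tidy table. A hedged sketch using dplyr, with an invented industry column and an invented yes/no finding:

```r
library(dplyr)

# Invented response data: one row per respondent
set.seed(3)
responses <- data.frame(
  industry         = sample(c("Energy", "Financial", "Healthcare"), 200, replace = TRUE),
  uses_arbitration = sample(c(TRUE, FALSE), 200, replace = TRUE)
)

# Break out the finding by the demographic segment
responses |>
  group_by(industry) |>
  summarise(pct_using_arbitration = mean(uses_arbitration), respondents = n())
```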

Techniques to reduce mistakes by respondents

What can a firm do to improve the likelihood that respondents answer multiple-choice questions correctly? The substance of their answer is known only to them, but some methodological trip-ups have solutions. To address the question, we can revisit the failure points catalogued in our list of ten respondent pitfalls.

Reverse the scale. One step to identify a misreading asks a second question to confirm the first answer. So, if the first question asks for a “1” to indicate “wholly ineffective” on up to a “10” to indicate “highly effective,” a later question might present the choices and ask the respondent to pick the most effective one. If that choice did not get a high number (8, 9 or 10, probably) on the first question, you have spotted a potential scale reversal. If you decide to correct it, you can manually revise the ratings on the first question. Second, using different terms for the poles might improve accuracy, although at a cost of some consistency and clarity. Thus, the scale might be a “1” to indicate “wholly ineffective” on up to “10” to indicate “highly productive.” Respondents are more likely to notice the word or phrase variability and get the scale right.
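Such a cross-check can be automated once responses are in hand. A sketch in R, assuming (hypothetically) that one column holds a 1-to-10 effectiveness rating for fixed fees and another holds the technique later picked as most effective:

```r
# Invented responses: a 1-10 rating of fixed fees plus a later question
# asking which technique is the most effective
d <- data.frame(
  rating_fixed_fees = c(9, 2, 8, 10, 1),
  most_effective    = c("Fixed fees", "Fixed fees", "Budgets",
                        "Fixed fees", "Fixed fees")
)

# Respondents who call fixed fees most effective yet rated it 3 or lower:
# candidates for a scale reversal and a manual review
d$possible_reversal <- d$most_effective == "Fixed fees" & d$rating_fixed_fees <= 3
subset(d, possible_reversal)
```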

Misread the question. Sometimes, next to the answer choices you can repeat the key word. Seeing the key word, such as “most inexpensive”, a respondent will catch his or her own misreading. As with scale reversals, here too a second question might confirm or call out an error. Alternatively, a firm might include a text box and ask the respondent to “briefly explain your reasoning.” That text might serve as a proof of proper reading of the question.

Misread selections. In addition to the remedies already discussed, another step available to a firm is to write the selections briefly, clearly, and with positives. “Negotiate fixed fees”, therefore, improves on “Don’t enter into billing arrangements based on standard hourly rates.” Furthermore, don’t repeat phrases, which can make selections look similar to a participant who is moving fast. “Negotiate fixed fees” might cause a stumble if it is followed by “Negotiate fixed service.”

Misread instructions. The best solution relies on survey software that rejects everything except numbers. That function should screen out the undesirable additions. The downside is that participants can grow frustrated at error messages if they do not tell them clearly the cause of their mistake: “Please enter numbers only, not anything else, such as letters or symbols like $.”
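Where the software cannot enforce a numbers-only format, the same screening can happen after the fact. A sketch that flags suspect entries and scrubs the easy cases (the example entries are invented):

```r
raw <- c("12", "3K", "$2,500", "approx. 40", "3-5", "17%")

# Which entries contain anything other than digits and a decimal point?
suspect <- grepl("[^0-9.]", raw)
raw[suspect]

# A light scrub: strip symbols, commas, and "approx.", then convert.
# Anything that still will not parse (ranges, "3K") comes back as NA
# for manual review.
cleaned <- suppressWarnings(as.numeric(gsub("[$,%]|approx\\.? *", "", raw)))
cleaned
```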

Fill in nonsense when answers are required. As mentioned, sophisticated software might detect anomalous selections, but that leads to dicey decisions about what to do. An easier solution is to keep the survey focused, restrict selections to likely choices (and thus fewer of them), and make them interesting. Sometimes surveys can put in a question or step that reminds participants to pay attention.

Give contradictory answers. Again, in hopes of trapping contradictions law firms can structure the question set to include confirmatory questions on key points. The drawback? A longer survey. Alternatively, some firms might email respondents and confirm that they meant to give answers that conflict with each other. Likewise, interviews after the survey comes back may smoke out corrections.
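The cross-check itself is easy to run once the survey closes. A sketch with invented columns: one records whether the respondent said they tried fixed fees, the other whether they later rated that technique:

```r
# Invented responses
d <- data.frame(
  tried_fixed_fees  = c(TRUE, FALSE, TRUE, FALSE),
  rating_fixed_fees = c(7, 5, NA, NA)   # NA means the rating was left blank
)

# Contradictions: rated a technique never tried, or tried one never rated
d$rated_not_tried <- !d$tried_fixed_fees & !is.na(d$rating_fixed_fees)
d$tried_not_rated <-  d$tried_fixed_fees &  is.na(d$rating_fixed_fees)
subset(d, rated_not_tried | tried_not_rated)
```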

Become lazy. Keep the survey short, well-crafted, and as interesting as possible for the participant. Perhaps two-thirds of the way through, a firm could ‘bury’ an incentive button: “Click here to get a $15 gift certificate.” Or a progress bar displayed by the survey software can boost flagging attention (“I’m close, let’s do a good job to the end…”).

Too quickly resort to “Other”. Despite the aspiration to achieve MECE (mutually exclusive, collectively exhaustive), keep selections short, few, and clear. Pretesting the question might suggest another selection or two. Additionally, a text box might reduce the adverse effects of promiscuous reliance on “Other”.

Irregular disclosure of demographics from one survey to the next in a series

How consistently do law firms track and disclose demographic data? Not very consistently, we found, based on three pairs of surveys conducted by different firms: Foley Lardner on Telemedicine in 2014 and the follow-up in 2017, Seyfarth Shaw on real estate in 2016 and the follow-up the year after, and Proskauer Rose on employment in 2016 and 2017. Before studying those survey pairs, I had thought that firms would stick pretty closely to the way they treated demographics in their first survey, perhaps modifying and improving them a little bit for the follow-on survey. Not true, not at all!

The plot below attempts to summarize how the second survey of each pair compares to the first survey with respect to demographics. Each bar has a segment for each of the five demographic attributes in the legend. A segment scores 0 if the report does not include that attribute, 1 if the report’s disclosure is minimal, and on up to 5 for a very good disclosure.

If a segment in the second column is higher than its counterpart in the first column, then the second survey improved on the first one. Perhaps it went from a 3 to a 4. Typically, that would mean the second report had a breakdown with more categories or more information on percentages. For example, in 2014 Foley Lardner (“Foley First” on the bottom axis) reported on five levels, with percentages for each, of respondent size (employees or revenue), organization type, and position of the respondent. In the second report three years later, even with 50 more participants than in the first year, the firm combined two of those levels (and gave the percentage) but gave no other information. Thus, the segment for level in its first column (the second from the top, in light blue) starts as a five but drops to a one in the firm’s second column (“Foley Second”). The first report did well on number of employees or revenue (red at the bottom), but that demographic information disappeared in the second survey report.
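For readers who want to build this kind of comparison themselves, a short ggplot2 sketch; the 0-to-5 disclosure scores below are invented placeholders, not the values in our plot.

```r
library(ggplot2)

# Invented disclosure scores (0 = attribute absent ... 5 = very good disclosure)
scores <- data.frame(
  survey    = rep(c("Foley First", "Foley Second"), each = 3),
  attribute = rep(c("Size (employees/revenue)", "Organization type", "Level"), 2),
  score     = c(5, 4, 5,
                0, 3, 1)
)

# One stacked bar per survey, one segment per demographic attribute
ggplot(scores, aes(x = survey, y = score, fill = attribute)) +
  geom_col() +
  labs(x = NULL, y = "Disclosure score (0-5)", fill = "Demographic attribute")
```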

Taking another example, in its inaugural survey Proskauer Rose did not provide details about the locations of its respondents (as indicated by the absence of the light-yellow segment), but provided some information about that attribute in its subsequent survey report (the second segment from the top of the column labelled “Proskauer Second”).

Oddly, Foley & Lardner broke out three kinds of hospitals in its first year but combined them all in its second year. Two other categories of organization type matched, but two new ones appeared the second year.

The number at the bottom of each column tells how many participants that survey had. It is also odd that, even though the three firms saw significant increases in the number of respondents year-over-year, they did not choose to elaborate on their demographic reporting.

In other words, given the irregular disclosure of data about respondents on these five important attributes, it is difficult to know how well the two sets of respondents resemble each other.

Demographic data tailored to the survey topic

We have written extensively about demographic data and how law firms report it from their research. Some kinds of demographic data figure prominently and consistently in reports, such as what we have termed the Big 4 demographics. But some surveys explore topics that justify other, one-off demographics. We show six of them below as displayed in reports issued by six different law firms.


HoganLovells, researching foreign direct investment [pg. 74], shows in the plot on the left 15 roles the firm asked about. On the right, White & Case’s research into arbitration collected demographic data about respondents’ legal background [pg. 52].

Moving from the two plots above to the two below, Davies Ward, interested in Canadian lawyers, sought answers on the years respondents had been practicing law [pg. 9]. Norton Rose Transport [pg. 2] turned to a donut plot to display 11 roles within four industries.

In the final two instances, in the left plot Winston Strawn looked at risk [pg. 31]. Its questionnaire asked not just about where the respondent’s company was based but also where the respondent individually was based.

And Foley Lardner studied telemedicine [pg. 10] in the first of a series, starting in 2014, and drilled down on types of healthcare organizations.

Law firm research surveys might ask for all kinds of background or profile data that illuminate their findings. It is easy to think of examples, such as patent records outstanding for research into intellectual property practices or the age of respondents for research into demographics.

Ten pitfalls of respondents on multiple-choice questions

Before plunging into the bog of blunders, let’s define respondent as someone who presses submit at the end of an online questionnaire. An alternative term would be participant. Potential respondents who stop before the end of the questionnaire are partial participants. Typically, survey software logs the responses of partial participants. Now, enter the bog, if ye dare!

We have listed below several things that can go wrong when people tackle multiple choice questions. The pictorial summarizes the points.

  1. Reverse the scale. With a question that asks for a numeric value, as in a table of actions to be evaluated on their effectiveness, a “1” checked might indicate “wholly ineffective” while a “10” might indicate “highly effective.” Some people may confuse the low-to-high direction of the scale and check a “1” when they mean “highly effective”.
  2. Misread the question. Hardly unique to multiple-choice questions, simple misunderstanding of the inquiry dogs all survey questions. If the question addresses “effective actions” and someone reads it as inquiring about “ineffective actions”, all is lost.
  3. Misread selections. This pitfall mirrors misreading questions, but applies to the multiple selections. Negative constructions especially bedevil people, as in “Doesn’t apply without exception.”
  4. Misread instructions. This mistake commonly appears when questions ask for a number. Careful survey designers can plead with respondents to enter only numerals, not percent signs or the word “percent”. The guidance can clearly state “do not write ranges such as ‘3-5’ or ‘4 to 6’” and “do not add ‘approx.’ or ‘~’”. For naught. Or people sprinkle in dollar signs or write “2 thousand” or “3K”. Humans have no trouble understanding such entries, but computers give up. If an entry is not in the right format for a number, a computer will treat the entry as a text string. Computers can’t calculate with text strings. Fortunately, computers can be instructed to scrub the answers into a standard format. And sometimes the survey software can check the format of what’s entered and flash a warning message.
  5. Fill in nonsense when answers are required. Some participants can’t be bothered to waste their time on irrelevant questions, so they slap in the first selection (or some random selection). Unless the analyst takes time to think about the likelihood of a given answer in light of other answers or facts, this mistake eludes detection.
  6. Give contradictory answers. Sometimes a survey has two questions that address a similar topic. For example, the survey might ask respondents to check the cost management techniques they have tried while a later question asks them to rate those techniques on effectiveness. What if they rate a technique they didn’t say they had tried, or they fail to rate a technique that they had tried? This could be a form of contradiction.
  7. Become lazy. When there are too many questions, when the selections for questions go on and on, or when reasonable answers require digging, respondents can throw in the towel and make sloppy selections. Here the fault lies more with the survey designer than with the survey taker.
  8. Too quickly resort to “Other”. A form of laziness, if the selections are many or complex, some people just click on “Other” rather than take the time to interpret the morass. If they write a bit about what “Other” means, that text will reduce the adverse effects of the lack of discipline.
  9. Mis-click on drop-downs. If you find a “United Emirates” in your corporate headquarters data and nearly everyone else is “United States”, you can suspect that one person made a mistake on the drop-down list. A simple frequency count of the drop-down answers, as in the sketch after this list, surfaces such outliers.
  10. Pick too many or too few. If they pick too many selections, the software might give a warning. Otherwise, if “select no more than three” governs, the software might simply take the first three even if four or more were checked. The survey software should be able to give a warning if this mistake happens.
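On the drop-down point (item 9), a frequency count is enough to surface likely mis-clicks. A minimal sketch with invented headquarters data:

```r
# Invented headquarters-country answers from a drop-down question
hq <- c(rep("United States", 120), "United Emirates", rep("Canada", 15))

# Tabulate every drop-down value; a singleton sitting next to a dominant
# value is a candidate for a mis-click worth confirming with the respondent
counts <- sort(table(hq))
counts[counts == 1]
```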