Broad selections challenge designers of multiple-choice questions

When you write non-numeric selections of a multiple-choice question, you want them to be different from each other and cover the likely choices as completely as possible. Yet at the same time you don’t want too many selections. You also would like them to be close to the same length. We have compiled other good practices.

The selections also need to be quickly and decisively understood by respondents. Respondents don’t want to puzzle over meanings and coverage of terms. Partly that means curing ambiguities, and partly it means choosing the terms in selections carefully so that nearly everyone interprets them the same way at first reading.

We found an instructive example in one of the law-firm research surveys. Did the questions in the plot below achieve quick clarity?  We do not know if the left-hand-side labels mirror the selections on the questionnaire. Some surveys have more detail, and even explanations, but the report gives an abbreviation of the selection.

I wonder whether most of the general counsel understand “Partner ecosystem”, let alone in the same way. Should two notions be joined, as in “New sources of revenue and new business models”? Some companies might pursue revenue or a new business model, but not both. Likewise, why pair “Clean energy and environmental regulation”? They could be seen as two separate trends. The selection “Geopolitical shifts” feels so broad that it invites all kinds of interpretations by respondents.

This question challenged the survey designers with an impossible task. First, they had to pick the important trends (and what happened to “Demographic changes”, “Big data”, “Urbanization” and “Taxes”, to pick a few others that could have been included?). Second, they had to describe those multifaceted, complex trends in only a few words. Third, those few words needed to fix a clear picture in lots of minds, or else the resulting data represents a blurred and varied clump of subjective impressions.

Four reasons why demographic questions usually lead off a survey

By convention, the first few pieces of information asked of respondents on a questionnaire typically concern demographic facts (title, industry, location, revenue). The reasons for this typical order might be termed psychological, motivational, practical, and instrumental.

Psychologically, law firms want to know about the person who is providing them data. Is this person higher or lower in the corporate hierarchy? Does this person work in an industry that matters to the firm or matters to the survey results? They want to know that the person is credible, knowledgeable, and falls into categories that are appropriate for the survey. To satisfy that felt need, designers of questionnaires put demographic questions first.

When a questionnaire starts with questions that are easy to answer, such as regarding the respondent’s position, the industry of their company, and its headquarters location, it motivates the respondent to breeze through them and charge on. They sense that the survey is going to be doable and quick. Putting the demographic questions first, therefore, can boost participation rates and reduce attrition rates.

A practical reason to place the demographic questions at the start is that doing so allows the survey software to filter out or redirect certain respondents. If an early question concerns the level of the respondent, and if their choice falls below the firm’s desired level of authority, the survey can either thank the respondent and close at that point or move their subsequent questions to a different path. Vendors who conduct surveys often cull out inappropriate participants, but law firms rarely take this step. Rather, they usually want as much data as they can get from as many people as will take part.

Fourth, if the demographic questions are at the start of the questionnaire, then even if the participant fails to complete the survey or submit it, the survey software may still capture valuable information. This could be thought of as an instrumental reason for kicking off a questionnaire with demographic questions. These days, the law firm particularly wants to know the email address of the participant and their title. That information probably flows into a customer relationship management (CRM) database.

Four techniques to make selections more clear

When someone creates a multiple-choice question, they should give thought to where and how to explain the question’s selections. People spend time wordsmithing the question, which is valuable time, but not the end of the matter. Even the invitation to survey participants may explain some background and key terms that shed light on selections. But at least four other options present themselves in the service of selections that can be answered without interpretative complexity.

First, a firm’s survey software should allow the designer to place an explanatory section before a question or series of related questions. That section can elaborate on what follows and guide readers in choosing among the selections. This technique has been overlooked in many of the questionnaires done for law firm research surveys.

Second, the question itself can be written carefully so that participants more easily understand the selections that follow. [This is not referring to directions such as “check all that apply” or “pick the top 3.” The point here pertains to interpretation and meaning of the multiple choices.] For example, the question might make clear that answers should cover the previous five years. Or the question might define “international arbitration” in a certain way to distinguish it from “domestic arbitration.” The overarching definitions and parameters laid out in the question shape and inform each of the selections that follow.

Third, as a supplement to the main question, some survey software enables the designer to add instructions. Using NoviSurvey, for instance, the instructions appear below the question in a box, and offer additional explanatory text. Instructions commonly urge participants not to put in dollar signs or text in a numeric field or to enter dates in a specific format, but they can also explain the selections. For example, the instructions might note that the first four selections pertain to one general topic and the next four selections pertain to a second topic. Or the instructions might differentiate between two of the selections that would otherwise perhaps be confused or misconstrued.

Finally, even if there is no explanatory section, guidelines from the question itself, or illumination in instructions, the selections themselves can embed explanatory text. Any time a selection has an “i.e.,” or an “e.g.,” the person picking from the selections should be able to understand them better. Sometimes a question will say “… (excluding a selection shown above)” to delineate two choices.

As a by-product, the more you expand on the selection choices, the more you can abbreviate them. The interplay between these four techniques to disambiguate selections, to present them more directly and clearly, allows careful designers of questions to craft selections more precisely and usefully.

Advisable to use “Don’t know” or “NA” in multiple-choice questions

Well-crafted multiple-choice questions give respondents a way to say that they don’t know the answer or that no selection applies to their situation. The two non-answers differ in that ignorance of the answer (or, possibly, refusal to give a known answer) can be remedied by the respondent, whereas respondents can’t supplement an incomplete set of selections. Firms should not want people they have invited to take a survey to have to pick the least bad answer when their preferred answer is missing. As we have written before, firms should add an “Other” choice with a text box for elaboration.

From HoganLovells Cross-Border 2014 [pg. 19] comes an example of how a multiple-choice question accommodates respondents who don’t know the answer. Also, it shows how data from such a question might be reported in a polar graphic. Seven percent of the respondents did not know whether their company’s international contracts include arbitration procedures.

In the jargon of data analysts, a “Don’t know” is called item non-response: no answer is given to a particular survey item when at least one valid answer was given to some item by the same respondent, e.g., leaving an item on a questionnaire blank, or responding to some questions by saying, “I don’t know,” while providing a valid response to other questions.

Another survey, DLA Piper Compliance 2017 [pg. 15], used a “Does not apply” option. Almost one-third of the respondents checked it. It is conceivable that some respondents did not know the answer and resorted to denying its applicability to them as the best of the three choices, although far from optimal.

One more example, this time from Fulbright Jaworski Lit 2009 [pg. 61]. Here, one-fifth of those who took the survey indicated that they didn’t know the answer to the question reproduced on top of the plot.

It is easy to include variations of the non-substantive selections described above. In fact, extrapolating from these three instances, firms probably should do so since significant numbers of respondents might pick them — on average almost one out of five in the above surveys.

Multiple-choice questions dominate the formats of questions asked

Having examined more than 100 reports published by law firms based on the surveys they sponsored, I suspected that more than three out of four questions asked on the surveys fell into the category of multiple choice. Reluctant to confirm that sense by laboriously trying to categorize all the questions in all those surveys, I invited my trusty R software to select five of the surveys at random.
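A random draw of that kind takes only a few lines. The sketch below uses Python rather than the R mentioned above, and the catalog of report names is a hypothetical placeholder, not the actual list of reports; the seed value is likewise arbitrary and only serves to make the draw repeatable.

```python
import random

# Hypothetical catalog: placeholder names standing in for the 100+ reports.
reports = [f"Report {i:03d}" for i in range(1, 101)]

def pick_reports(catalog, k=5, seed=None):
    """Return k distinct reports chosen uniformly at random."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(catalog, k)

# Assumed seed, chosen only for illustration.
sample = pick_reports(reports, k=5, seed=2017)
print(sample)
```

Sampling without replacement (rather than picking one report five times) guarantees five different surveys.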

Sure enough, all five not only exceeded the perception of at least 75% of the questions being multiple choice, but in fact every single question that could be identified from the five reports fell into that format! Bear in mind that we can’t be certain about all the questions asked on the surveys, but we can glean from the reports most of them. It would be necessary to count from the actual questionnaire to confirm this data.

Specifically, Seyfarth Shaw Future 2017 went eight for eight, Morrison Foerster MA 2014 was five out of five, and Berwin Leighton Arbvenue 2014 used multiple-choice questions for all of its at least 14 questions (it is difficult to figure out from the Berwin report exactly how many questions were on the survey). In Foley Lardner Telemedicine 2014, all twelve questions (including three demographic questions) were multiple choice; with Foley Lardner Cars 2017, all 16 questions were multiple choice (including two demographic questions).
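The tally from those five reports can be checked with a short sketch. The counts come from the paragraph above; the one assumption is that the Berwin Leighton figure of "at least 14" is treated as exactly 14.

```python
# (multiple-choice questions, identifiable questions) per report
counts = {
    "Seyfarth Shaw Future 2017": (8, 8),
    "Morrison Foerster MA 2014": (5, 5),
    "Berwin Leighton Arbvenue 2014": (14, 14),  # "at least 14"; assumed to be 14 here
    "Foley Lardner Telemedicine 2014": (12, 12),
    "Foley Lardner Cars 2017": (16, 16),
}

total_mc = sum(mc for mc, _ in counts.values())
total_questions = sum(q for _, q in counts.values())
share = total_mc / total_questions

print(total_mc, total_questions, f"{share:.0%}")  # prints: 55 55 100%
```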

Of those 55 multiple choice questions, a few presented binary choices but most of them presented a list of 4-to-7 selections to pick from. Likert scales appeared rarely, as illustrated in the plot below from Foley Lardner Cars 2017 [pg. 5]. The scale ranges from “Strongly Agree” to “Strongly Disagree.”

Morrison Foerster MA 2014 [pg. 4] also used a Likert scale in a question.

Multiple-choice questions that ask for a ranking can yield deeper insights

If you want to capture more information than you can from simple multiple-choice questions, then a ranking question might be best for you. For one of its questions, Berwin Leighton Risks (2014) [pg. 17] presented respondents with seven legal risks. The instructions told the respondents to rank the risks from 1 to 8 (where 1 was the most serious risk and 8 the least serious). [Right, 8 ranking choices for only 7 items!] Presumably no ties were allowed (which the survey software might have enforced).

The report’s plot extracted the distribution of rankings only for the two most serious, 1 or 2. It appears that the plot tells us, for example, that 48 respondents ranked “Legislation/regulation” as a 1 or 2 legal risk (most serious). Other plots displayed the distribution of 3 and 4 rankings and less serious rankings.

A ranking question, especially one with as many as seven elements to be compared to each other, burdens participants, because to answer it conscientiously they need to consider each element relative to all the others. As a surveyor, you can never completely rely on this degree of respondent carefulness.

But ranking questions can yield fruitful analytics. Rankings are far more insightful than “pick the most serious [or whatever criterion],” which tosses away nearly all comparative measures. Rankings are more precise than “pick all that are serious,” which surrenders most insights into relative seriousness. Yet the infrequency of ranking questions in the law-firm research survey world is striking. Findings would be much more robust if there were more ranking questions.

Some people believe that rankings are difficult to analyze and interpret. The visualization technique of Berwin Leighton that presents different views of the aggregate rankings belies that belief. Many other techniques exist to analyze and picture ranking responses.
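As a minimal sketch of two such techniques, the code below counts how often each item was ranked in the top two (the view described for the Berwin Leighton plot) and computes each item's mean rank. The responses are invented for illustration, and "Risk B" and "Risk C" are placeholder names; only "Legislation/regulation" appears in the report.

```python
from collections import Counter
from statistics import mean

# Hypothetical responses: each dict maps a risk to the rank one respondent gave it.
responses = [
    {"Legislation/regulation": 1, "Risk B": 2, "Risk C": 3},
    {"Legislation/regulation": 2, "Risk B": 3, "Risk C": 1},
    {"Legislation/regulation": 1, "Risk B": 3, "Risk C": 2},
]

def top_k_counts(responses, k=2):
    """Count, per item, how many respondents ranked it k or better."""
    counts = Counter()
    for ranking in responses:
        for item, rank in ranking.items():
            if rank <= k:
                counts[item] += 1
    return counts

def mean_ranks(responses):
    """Average rank per item; a lower mean indicates a more serious risk."""
    items = responses[0].keys()
    return {item: mean(r[item] for r in responses) for item in items}

print(top_k_counts(responses))
print(mean_ranks(responses))
```

The top-two count shows how many respondents treat an item as a leading concern, while the mean rank summarizes the whole distribution in one number.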

A ranking question gives a sense of whether a respondent likes one answer choice more than another, but it doesn’t tell how much more. A question that asks respondents to allocate 100 percent among their choices not only ranks the choices but differentiates between them much more precisely than simple ranking. Proportional distribution questions, however, appear in law firm surveys even less than ranking questions. In fact, we could not find one among the hundreds of plots we have examined. Perhaps the reason is that these questions are even more complicated to explain to survey participants.
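To make the difference concrete, here is a small sketch of how an allocation answer might be checked and read. The choice names and point values are hypothetical, as are the function names.

```python
def validate_allocation(allocation, total=100):
    """Return True if no share is negative and the shares sum to the total."""
    values = allocation.values()
    return all(v >= 0 for v in values) and sum(values) == total

def implied_ranking(allocation):
    """Order the choices from the largest share to the smallest."""
    return sorted(allocation, key=allocation.get, reverse=True)

# Hypothetical answer from one respondent.
answer = {"Choice A": 60, "Choice B": 30, "Choice C": 10}

print(validate_allocation(answer))               # True
print(implied_ranking(answer))                   # ['Choice A', 'Choice B', 'Choice C']
print(answer["Choice A"] / answer["Choice B"])   # 2.0
```

The ranking falls out of the allocation for free, and the ratio shows what a plain ranking cannot: this respondent weights the first choice twice as heavily as the second.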

Multiple-choice questions put a premium on simplicity and clarity

The best surveys present questions that participants understand immediately. Short, clear and with familiar words: that’s the secret to reliable answers and to participants continuing on with the survey. Much more than fill-in-the-blank questions or give-your-answer questions, multiple-choice questions especially need to be simple and direct because participants have to absorb the question first and then slog through some number of selections.

Some questions demand quite a bit from the participant. What is the complexity level of the question shown at the top of the image below, taken from Pinsent Masons TMT 2016 [pg. 20]? The person tackling that question had to juggle the broadness of “considerations,” bring to mind comprehensive knowledge of the company’s dispute resolution policy (recalling the meaning of “DR”), and apply both sensibilities in the context of arbitrations. Even though this question handles a complex topic quite succinctly, the cognitive load on the participant piled up.

For question designers, a cardinal sin is to include “ands” or “ors.” When a conjunction joins two ideas (a “double barrel question,” in the evocative term from the textbook Empirical Methods in Law, by Lawless, Robert M., Robbennolt, Jennifer K. and Ulen, Thomas S., Wolters Kluwer, 2nd Ed. 2016 at 67), the question asks whether X and Y both are true. What if X is true but not Y, or Y but not X? How does a respondent answer half of a conjunction?

Feel the cognitive schism of a conjunction from the question asked in Gowling WLG Protectionism (2017) [pg. 13]. Some participants might believe that their sector is aware of the risks of protectionist policies but hasn’t prepared how to respond to them (i.e., the sector is on notice but not ready to act). What is the right answer for those participants?

Alternatively (or disjunctively), when a question asks whether X or Y is true, at the analysis step a firm can’t disentangle X from Y since they have been annealed. X could be true and Y could be false, or the reverse.

We will close with one more example of both complexity and conjunction. [pg. 14] confronted respondents with seven selections, several of which were complex and one of which included a conjunction [the fourth from the top, “Breakdown … and the rise …”]. As with the Gowling question, this selection might leave a participant in a bind if one part of the selection holds true but not both parts.

Order of selections in multiple-choice questions

Since participants are expected to read all the selections of a multiple-choice question, the order in which you list them may seem of little moment. But the consequences of order can be momentous. Respondents might interpret the order as suggesting a priority or “correctness.” For example, if the choice that the firm thinks will be chosen most commonly stands first, that decision will influence the data in a self-fulfilling pattern. The firm thinks it’s important (or, worse, would prefer to see more of that selection picked) and therefore puts it first, while respondents are influenced by supposing that privileging to be true and choose it.

Or participants may simply tire of evaluating a long list of selections and deciding which one or more to choose. They may unknowingly favor earlier choices so that they can declare victory and move on to the next question.

Let’s look at a question from the King & Spalding survey on claims professionals (2016) [pg. 15], not in any way to criticize the question but to illustrate the possibility of the skews described above.

We don’t know enough about claims professionals or lines of insurance to detect whether this selection order nudges respondents, but clearly the selections are not in alphabetical order. When selections appear in alphabetical order, the assumption is that the firm tried to randomize the order and thereby avoid guiding respondents.

Another option for a firm is to prepare multiple versions of the survey. Each version changes the order of selections of the key multiple-choice question or questions. The firm sends those variants randomly to the people invited to take the survey. So long as the text of the selections remains the same, the software that compiles results will not care about variations in selection order.

A more sophisticated technique to eliminate the risk of framing relies on the survey software to present the selections in random order for each survey taker. In other words, the order in which person A sees the selections is randomly different than the order in which person B sees the selections.
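A minimal sketch of per-respondent randomization follows. It does not reflect any particular survey product; the selection labels and function names are hypothetical. Keeping "Other" and "Don't know" anchored at the bottom is an assumption added here for illustration, not something described above.

```python
import random
from collections import Counter

# Hypothetical selections for one question.
SELECTIONS = ["Selection A", "Selection B", "Selection C", "Selection D"]
ANCHORED = ["Other", "Don't know"]  # assumption: these stay last, in fixed order

def order_for_respondent(respondent_id, selections=SELECTIONS, anchored=ANCHORED):
    """Shuffle the substantive selections independently for each respondent."""
    rng = random.Random(respondent_id)  # seeded by ID so the order can be reproduced
    shuffled = list(selections)
    rng.shuffle(shuffled)
    return shuffled + list(anchored)

def tally(answers):
    """Count answers by their text, so the displayed order does not matter."""
    return Counter(answers)

print(order_for_respondent("person-A"))
print(order_for_respondent("person-B"))
print(tally(["Selection B", "Selection A", "Selection B"]))
```

Because results are compiled by the text of the selection rather than its position, the tally is the same whatever order each person saw.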

Published reports infrequently restate the exact question asked and never the arrangement of selections. All the reader has to go by is the data as reported in the text, table or graphic. Because the summary of the data usually starts with the most common selection and discusses the remaining results in declining order, the original arrangement of selections is not available.

For example, here is one multiple-choice question from Davies Ward Barometer (2010) [pg. 58]. At the top, the snippet provides the text of the report which gives a clue to the question asked of respondents. Nothing gives a clue about the order of the selections on the survey itself.

As an aside, consider that this survey followed several prior surveys on the same topic. It is possible that the order of the selections reflects prior responses to a similar question. That would be a natural thing to do, but it would be a mistake for the reasons described above.

Priority of demographic attributes (the four most common)

Having studied more than 70 survey reports by law firms, I sensed that the demographic attributes recognized by the firms exhibit a fairly consistent priority. First, and thus most importantly, firms focus on respondent position, then respondent company’s industry, business size and location. That priority order for the four demographic characteristics makes sense.

The rank of the person completing the survey suggests their depth of knowledge of the topic. You want general counsel giving their views more than junior lawyers who have just joined the company. You seek C-suite executives, not assistant directors. The level or position also signals the ability of the firm to reach decision makers and persuade them that their time is well spent taking the survey. Implicitly, a high proportion of busy leaders says “This topic has significance.”

Industry (sector) comes next on the priority list because legal issues impinge on each industry differently. Also, readers of a survey report not only want to know that it speaks for companies in their industry but also would like to see how the results differ industry by industry.

“Business size” is my term for the third-most-common demographic. The typical measure relies on the annual revenue of the company. Most surveys proudly state that they have a good number of large companies as those companies are more prestigious (and are probably the targets of business development efforts by the firm). A less common measure of business size is number of employees. For non-profits and government agencies revenue has less relevance (budget may be the better metric), but all organizations have employees. Still, that measure often gives less insight for profit-seeking organizations as it can vary enormously across industries and indeed within industries.

The fourth-most-common demographic regards the geography of respondent organizations, whether their country, region or continent. Quite a few surveys, however, collect only participants from a single country and therefore ignore this demographic attribute. [We did spot one survey that broke out respondent data by states in Australia.]

We chose three surveys to spot test the relative importance they attach to their demographics.

  • The Dykema Gossett survey of merger and acquisition specialists (2017) gathered data on the position of its respondents, the sector in which their company operates, and their company’s revenue. The firm’s report did not attach numbers of participants or percentages to any of the demographic attributes but it described them in that order.
  • The Carlton Fields survey of class actions (2014) likewise summarized its participants by position, sector and revenue, in that order, but disclosed nothing further.
  • Of the spot-tested reports, by far the best handling of demographics comes from the Baker McKenzie cloud survey (2017). That report precisely states breakdowns by geography, position, industry sector, and number of employees. Even better, the report includes plots that visualize these attribute details. Baker McKenzie described the position of individual respondents with seven choices of functions (IT, Sales, Legal, etc.) but the firm did not provide revenue data. In other respects, however, it commendably shared the demographics of its survey population. The order of presentation was geography, position, and business model. Interestingly, for geography the report uses a map to convey where their “top respondents” came from.

If we had full data on the treatment of demographic attributes by all the surveys available to us, our inductive sense of these priorities would be confirmed or overturned. Perhaps in another post. Meanwhile, note two points. First, which demographics are important depends on the purpose of the research. Second, the report ought to take advantage of the demographic data; to create analytic value, somewhere the report should break out the findings by demographic segments.
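Breaking out findings by a demographic segment amounts to a simple cross-tabulation. The sketch below uses invented records, with respondent position as the segment; the data and function names are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical records: (respondent position, answer to one question).
records = [
    ("General counsel", "Yes"),
    ("General counsel", "No"),
    ("General counsel", "Yes"),
    ("Associate counsel", "No"),
    ("Associate counsel", "No"),
]

def break_out(records):
    """Count answers separately within each demographic segment."""
    table = defaultdict(Counter)
    for segment, answer in records:
        table[segment][answer] += 1
    return table

def shares(counter):
    """Convert one segment's counts into proportions of that segment."""
    total = sum(counter.values())
    return {answer: n / total for answer, n in counter.items()}

table = break_out(records)
for segment, answers in table.items():
    print(segment, dict(answers), shares(answers))
```

Reporting proportions within each segment, rather than raw counts, keeps a large segment from drowning out a small one.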

Techniques to reduce mistakes by respondents

What can a firm do to improve the likelihood that respondents answer multiple-choice questions correctly? The substance of their answer is known only to them, but some methodological trip-ups have solutions. To address the question, we can revisit the failure points that we presented above.

Reverse the scale. One step to identify a misreading asks a second question to confirm the first answer. So, if the first question asks for a “1” to indicate “wholly ineffective” on up to a “10” to indicate “highly effective,” a later question might present the choices and ask the respondent to pick the most effective one. If that choice did not get a high number (8, 9 or 10, probably) on the first question, you have spotted a potential scale reversal. If you decide to correct it, you can manually revise the ratings on the first question. Second, using different terms for the poles might improve accuracy, although at a cost of some consistency and clarity. Thus, the scale might be a “1” to indicate “wholly ineffective” on up to “10” to indicate “highly productive.” Respondents are more likely to notice the word or phrase variability and get the scale right.
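The confirmation check can be sketched as follows, assuming a 1-to-10 scale and the threshold of 8 mentioned above. The option names, ratings, and function names are hypothetical.

```python
def possible_scale_reversal(ratings, most_effective, threshold=8):
    """Flag a respondent whose 'most effective' pick was not rated highly."""
    return ratings[most_effective] < threshold

def reverse_scale(ratings, low=1, high=10):
    """Flip every rating on the scale (1 becomes 10, 2 becomes 9, and so on)."""
    return {item: low + high - score for item, score in ratings.items()}

# Hypothetical respondent: rated Option A a 2, then named it the most effective.
ratings = {"Option A": 2, "Option B": 9, "Option C": 5}

if possible_scale_reversal(ratings, "Option A"):
    corrected = reverse_scale(ratings)
    print(corrected)  # {'Option A': 9, 'Option B': 2, 'Option C': 6}
```

A flag only marks a potential reversal; whether to apply the correction remains a judgment call for the analyst.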

Misread the question. Sometimes, next to the answer choices you can repeat the key word. Seeing the key word, such as “most inexpensive”, a respondent will catch his or her own misreading. As with scale reversals, here too a second question might confirm or call out an error. Alternatively, a firm might include a text box and ask the respondent to “briefly explain your reasoning.” That text might serve as a proof of proper reading of the question.

Misread selections. In addition to the remedies already discussed, another step available to a firm is to write the selections briefly, clearly, and with positives. “Negotiate fixed fees”, therefore, improves on “Don’t enter into billing arrangements based on standard hourly rates.” Furthermore, don’t repeat phrases, which can make selections look similar to a participant who is moving fast. “Negotiate fixed fees” might cause a stumble if it is followed by “Negotiate fixed service.”

Misread instructions. The best solution relies on survey software that rejects everything except numbers. That function should screen out the undesirable additions. The downside is that participants can grow frustrated at error messages that do not clearly state the cause of the mistake. A clear message would read: “Please enter numbers only, not anything else, such as letters or symbols like $.”
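A numbers-only check of that kind might look like the sketch below. It is not how any particular survey product works; the function name is hypothetical, and whole numbers are assumed to be the only valid entries.

```python
import re

ERROR_MESSAGE = (
    "Please enter numbers only, not anything else, "
    "such as letters or symbols like $."
)

def check_numeric_entry(raw):
    """Return (value, None) for a digits-only entry, else (None, error message)."""
    text = raw.strip()  # tolerate stray spaces around the entry
    if re.fullmatch(r"[0-9]+", text):
        return int(text), None
    return None, ERROR_MESSAGE

print(check_numeric_entry("1500"))    # (1500, None)
print(check_numeric_entry("$1,500"))  # rejected, with the explanatory message
```

Returning the message alongside the rejection keeps the explanation tied to the exact entry that failed.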

Fill in nonsense when answers are required. As mentioned, sophisticated software might detect anomalous selections, but that leads to dicey decisions about what to do. An easier solution is to keep the survey focused, restrict selections to likely choices (and thus fewer of them), and make them interesting. Sometimes surveys can put in a question or step that reminds participants to pay attention.

Give contradictory answers. Again, in hopes of trapping contradictions law firms can structure the question set to include confirmatory questions on key points. The drawback? A longer survey. Alternatively, some firms might email respondents and confirm that they meant to give answers that conflict with each other. Likewise, interviews after the survey comes back may smoke out corrections.

Become lazy. Keep the survey short, well-crafted, and as interesting as possible for the participant. Perhaps two-thirds of the way through a firm could ‘bury’ an incentive button: “Click here to get a $15 gift certificate.” Or a progress bar displayed by the survey software can boost flagging attention (“I’m close, let’s do a good job to the end….”).

Too quickly resort to “Other”. Despite the aspiration to achieve MECE (mutually exclusive, collectively exhaustive), keep selections short, few, and clear. Pretesting the question might suggest another selection or two. Additionally, a text box might reduce the adverse effects of promiscuous reliance on “Other”.