Turn Multi-Choice Responses into a Dummy Variable Matrix

A vital step in analyzing a multi-choice question is to create a variable for each potential selection. The dummy variable for each selection is coded “1” if the respondent checked it and “0” if not.

Think of a spreadsheet where each row holds one person’s answer to the question. If the only question they answered was the multi-choice question, their row has one column to the right of their name for each available selection, and each column holds a “0” if they did not select that option and a “1” if they did. The sheet has as many rows as respondents, and each row shows a pattern of “0”s and “1”s corresponding to the options not selected or selected. All those “0”s and “1”s form a matrix, a rectangular array of numbers.
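The encoding described above can be sketched in a few lines of Python. This is only an illustration, assuming each respondent’s answer arrives as a list of the option labels they checked; the option names, the respondent names, and the helper `to_dummy_row` are hypothetical.

```python
# Hypothetical option labels for one "check all that apply" question.
OPTIONS = ["Role1", "Role2", "Role3", "Role4", "Role5", "Role6"]

# Hypothetical raw responses: respondent name -> options they checked.
responses = {
    "Respondent A": ["Role1", "Role3"],
    "Respondent B": ["Role2"],
    "Respondent C": [],  # checked nothing
}


def to_dummy_row(selected, options):
    """Return one row of 0s and 1s, one entry per option, in option order."""
    chosen = set(selected)
    return [1 if option in chosen else 0 for option in options]


# One row per respondent, one column per option.
dummy_matrix = {
    name: to_dummy_row(selected, OPTIONS)
    for name, selected in responses.items()
}

for name, row in dummy_matrix.items():
    print(name, row)
```

Keeping the options in one fixed list means every row has the same column order, which is what lets the rows stack into a rectangular matrix.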

For an example of a “check all that apply” question, which is a multi-choice question, the snippet below shows the results from respondents choosing among six available selections. The percentage inside the top bar tells us that 62% of the respondents picked that selection, so a “1” appears in that dummy variable’s column for each of them. For the remaining 38% of the respondents, the column would have a “0”.

It is entirely possible to have software simply count the number of times each selection was checked, but analysts often convert multi-choice responses into binary matrices, populated only with “0”s and “1”s, so that software can carry out more elaborate calculations. For a simple example, the binary matrix shown below has a “RowSum” column on the far right that adds up the “1”s in the columns to its left. The first respondent selected two roles, Role1 and Role3, so “1”s are in those two cells and the “RowSum” equals 2.
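Once the answers are in binary form, the row and column calculations are short. The sketch below assumes a small matrix with three roles; only the first row (Role1 and Role3 checked, RowSum of 2) comes from the example in the text, and the other rows are made up for illustration.

```python
# Hypothetical binary matrix: one row per respondent, one column per role.
columns = ["Role1", "Role2", "Role3"]
matrix = [
    [1, 0, 1],  # first respondent: Role1 and Role3, as in the text
    [0, 1, 0],  # the remaining rows are invented for illustration
    [1, 1, 1],
    [1, 0, 0],
]

# RowSum: how many options each respondent checked.
row_sums = [sum(row) for row in matrix]

# Column totals and shares: how often each option was checked overall.
col_totals = [sum(col) for col in zip(*matrix)]
col_shares = [total / len(matrix) for total in col_totals]

for row, total in zip(matrix, row_sums):
    print(row, "RowSum =", total)
for name, share in zip(columns, col_shares):
    print(f"{name}: {share:.0%} of respondents")
```

The column shares are the same kind of figure as the percentage shown inside each bar of a results chart: the fraction of respondents with a “1” for that selection.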

Table-style questions efficiently gather data

Sometimes you want to collect the same information about several observations. Perhaps you want the law departments that reply to report their number of lawyers and number of business locations in each of nine regions of the United States. You could ask a succession of nine identical questions, varying them only by region. Or, better, you could create a single matrix question (also known as a “table question”) for them to complete. The matrix style is more efficient and easier for respondents to fill in. They can tab through the table, for example, and everything is in one place.
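On the analysis side, a reply to a table question is naturally a small table of its own: one row per region, one column per measure. The sketch below shows one possible way to hold such a reply and flatten it into one record per cell. The region names, the numbers, the department label, and the helper `flatten_reply` are all hypothetical, and only three of the nine regions are shown.

```python
# Hypothetical reply to one table question: one row per region,
# one column per measure. Region names and numbers are made up.
reply = {
    "Northeast": {"lawyers": 12, "locations": 3},
    "Southeast": {"lawyers": 4, "locations": 1},
    "Midwest": {"lawyers": 0, "locations": 0},
}


def flatten_reply(department, table):
    """Turn one table reply into long-format records, one per cell."""
    return [
        {
            "department": department,
            "region": region,
            "measure": measure,
            "value": value,
        }
        for region, cells in table.items()
        for measure, value in cells.items()
    ]


records = flatten_reply("Dept 001", reply)

# Example use: total lawyers reported across all regions.
total_lawyers = sum(r["value"] for r in records if r["measure"] == "lawyers")
print(len(records), "records;", total_lawyers, "lawyers in total")
```

Long-format records like these are easy to filter, group, and sum across many departments, which is usually the next step when preparing a report.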

To give an idea of what a table question looks like on a questionnaire, here is an example from the General Counsel Metrics, LLC benchmark survey of law departments (delivered online using NoviSurvey). The long question itself as well as the lengthy instructions in italics below it have been truncated on the right side so that the snippet focuses on the matrix.

Once your survey is closed and your task is to prepare the report, you can present the data collected in the table in various formats. We show one of them in the snippet below, which we presume comes from a seven-row table that had three columns: the area of law of the class actions, the number of cases faced over some period of time, and the amount spent. This plot aggregated all that data.

Any of the companies that provide survey software, such as SurveyGizmo and Zoho, can support matrix questions. In its useful e-book on surveys, SurveyGizmo makes two good points: “If you need to use several tables to gather data, make sure you split them up into topic-driven sections. Separating them with other less fatiguing questions can also help maintain engagement and data integrity.” Stated differently, a matrix question should focus on one topic, not mix in questions on different topics.