Turn Multi-Choice Responses into a Dummy Variable Matrix

One vital step in analyzing a multi-choice question is to create a dummy variable for each potential selection. Each dummy variable is coded “1” if the respondent checked that selection and “0” if not.

Think of a spreadsheet where each row holds one person’s answers to the question. If the multi-choice question was the only question they answered, the row has one column per selection to the right of their name, and each column holds a “1” if they selected that option and a “0” if they did not. The sheet has as many rows as respondents, and each row shows a pattern of “0”s and “1”s corresponding to the options not selected or selected. Together, those “0”s and “1”s form a matrix, a rectangular array of numbers.
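The spreadsheet picture can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the respondent names, their answers, and the option labels Role2 and Role4 through Role6 are made up, and the helper `to_binary_matrix` is a hypothetical name, not part of any survey tool.

```python
# Hypothetical option labels and responses, invented for illustration.
OPTIONS = ["Role1", "Role2", "Role3", "Role4", "Role5", "Role6"]

responses = {
    "Respondent A": ["Role1", "Role3"],
    "Respondent B": ["Role2"],
    "Respondent C": [],  # checked nothing
    "Respondent D": ["Role1", "Role2", "Role6"],
}


def to_binary_matrix(responses, options):
    """Return one row of 0/1 dummy variables per respondent, in option order."""
    matrix = []
    for selected in responses.values():
        chosen = set(selected)
        # 1 if the respondent checked the option, 0 if not
        matrix.append([1 if option in chosen else 0 for option in options])
    return matrix


matrix = to_binary_matrix(responses, OPTIONS)

# Print each respondent's name followed by their row of dummy variables.
for name, row in zip(responses, matrix):
    print(name, row)
```

Each printed row is one respondent, and each position in the row is the dummy variable for one option, in the same order as `OPTIONS`.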

As an example of a “check all that apply” question, a multi-choice question, the snippet below shows the results from respondents who could check any of six available selections. The percentage inside the top bar tells us that 62% of the respondents picked that selection, so a “1” appears in that dummy variable’s column for each of them. For the remaining 38% of the respondents, the column holds a “0”.

It is entirely possible to have software simply count the number of times each selection was checked, but analysts often convert multi-choice responses into binary matrices, populated only with “0”s and “1”s, so that software can carry out more elaborate calculations. As a simple example, the binary matrix shown below has a “RowSum” column on the far right that adds up the “1”s in the columns to its left. The first respondent selected two roles, Role1 and Role3, so “1”s are in those two cells and the “RowSum” equals 2.
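To make the “RowSum” idea concrete, here is a minimal Python sketch. Only the first row comes from the example above (Role1 and Role3 selected); the other rows are assumed values added for illustration. Summing down each column gives the count and share of respondents who picked each selection, which is how a figure such as 62% would be derived.

```python
# A small binary matrix: one row per respondent, one column per role.
matrix = [
    [1, 0, 1, 0, 0, 0],  # first respondent: Role1 and Role3, as in the text
    [0, 1, 0, 0, 0, 0],  # remaining rows are made-up illustration data
    [1, 1, 0, 0, 0, 1],
    [1, 0, 0, 1, 0, 0],
]

# RowSum: how many selections each respondent checked.
row_sums = [sum(row) for row in matrix]

# Column totals: how many respondents checked each selection.
col_counts = [sum(col) for col in zip(*matrix)]

# Share of respondents who checked each selection (0.0 to 1.0).
col_share = [count / len(matrix) for count in col_counts]

print("RowSum per respondent:", row_sums)
print("Count per selection:  ", col_counts)
print("Share per selection:  ", col_share)
```

Because every cell is a “0” or a “1”, plain sums do double duty: across a row they count selections per person, and down a column they count people per selection.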
