Effect size in regression, and holding variables constant

Each coefficient of a regression model measures what is called effect size. Returning to our data, the effect size tells how much the predicted number of lawyers changes with a change of one F500 headquarters. Since our single-variable coefficient is 1,265, every additional F500 headquarters increases the estimated number of lawyers by 1,265, and every F500 headquarters fewer decreases it by 1,265. So, effect size indicates the influence of a one-unit increase or decrease in the predictor on the response. But we shouldn’t rely on that single effect size, because we have data for other variables that contribute to the estimated number of lawyers.

To this point we have been conducting linear regression with only a single predictor. Once a model includes more than one predictor, we are using multiple linear regression.

To progress to multiple linear regression and to see more effect sizes and how coefficients change when there are additional predictor variables, let’s add to our model the predictor variable of state population. The resulting regression equation appears below.

[1] “lawyers = -275.8 + 767.689 * F500 + 0.001 * population + e”

The coefficient for F500 headquarters has dropped dramatically, from 1,265 to about 768. Further, we see a tiny coefficient for state population.

But you can’t judge whether one predictor is more influential than another by comparing the absolute sizes of their coefficients. Even when both are statistically significant, you must also take into account the units of each predictor variable. Our units are one headquarters and one state resident. Intuitively, a change of one headquarters ought to make much more of a difference in the lawyer count than a change of one resident.
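To make the point about units concrete, here is a minimal sketch in Python. It uses the two coefficients from the equation above, with the population coefficient taken at its rounded, displayed value of 0.001. The state population of 5,000,000 and all of the names are assumptions made up for illustration, not part of the data set.

```python
# Coefficients copied from the two-predictor equation in this post.
B_F500 = 767.689         # estimated lawyers per additional headquarters
B_PER_RESIDENT = 0.001   # estimated lawyers per additional resident (rounded as displayed)

# Measuring population in thousands of residents instead of single residents
# multiplies the coefficient by 1,000 without changing the model.
B_PER_THOUSAND = B_PER_RESIDENT * 1_000


def contribution_per_resident(population):
    """Population's share of the estimate, with population counted in residents."""
    return B_PER_RESIDENT * population


def contribution_per_thousand(population):
    """The same share, with population counted in thousands of residents."""
    return B_PER_THOUSAND * (population / 1_000)


# Hypothetical state population, invented for this example.
population = 5_000_000

print(B_PER_RESIDENT, B_PER_THOUSAND)
print(contribution_per_resident(population), contribution_per_thousand(population))
```

The coefficient looks a thousand times larger once the unit is a thousand residents, yet population's contribution to the estimate is the same either way. That is why the raw size of a coefficient says little until you know the unit behind it.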

Both predictors are statistically significant, so what do the coefficients tell us about translating their variable values into the real world? In this two-predictor model, each additional F500 headquarters raises the estimated number of lawyers by 768, and each one fewer lowers it by 768, when we hold state population constant. Each additional resident changes the estimated number of lawyers by about 1/1000, holding F500 headquarters constant. Thus, for every thousand additional residents in a state, this model predicts roughly one additional lawyer.
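The arithmetic of holding one predictor constant can be sketched in a few lines of Python. The function below simply evaluates the equation printed earlier, omitting the error term and using the rounded population coefficient of 0.001. The starting values of 10 headquarters and 5,000,000 residents describe a hypothetical state chosen only for illustration, and the function and variable names are likewise assumptions.

```python
# Coefficients copied from the two-predictor equation in this post.
INTERCEPT = -275.8
B_F500 = 767.689
B_POPULATION = 0.001  # rounded value as displayed


def predicted_lawyers(f500, population):
    """Estimated number of lawyers from the fitted equation (error term omitted)."""
    return INTERCEPT + B_F500 * f500 + B_POPULATION * population


# Hypothetical state, invented for this example.
base = predicted_lawyers(f500=10, population=5_000_000)

# Add one headquarters while population stays fixed.
f500_effect = predicted_lawyers(f500=11, population=5_000_000) - base

# Add one thousand residents while headquarters stay fixed.
population_effect = predicted_lawyers(f500=10, population=5_001_000) - base

print(round(f500_effect, 3))
print(round(population_effect, 3))
```

Because the model is linear, the first difference equals the F500 coefficient and the second comes to about one lawyer, whatever starting values you choose for the hypothetical state.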

Holding other predictors constant means that we compare predictions in which the remaining predictor variables keep the same values, so they cannot account for any change in the response variable, the number of private practice lawyers. Doing so isolates the effect of the one predictor we allow to vary.

We need to include as many predictor variables as we have available and evaluate that multiple linear regression model, which we will do in a later post.
