An important concept in machine learning involves the relationship between the number of observations and the number of predictors: **degrees of freedom**. For multiple linear regression, the degrees of freedom equal the number of observations minus the number of coefficients fitted by the model, which is the number of predictors plus one for the intercept. On our data set, using three predictors, we have 46 degrees of freedom: 50 states minus four coefficients (three predictors plus the intercept).
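The arithmetic can be written as a small helper. This is only a sketch: the function name `residual_df` is made up for illustration and is not part of any library.

```python
def residual_df(n_obs: int, n_predictors: int) -> int:
    """Degrees of freedom left after fitting a linear regression.

    One coefficient is fitted per predictor, plus one for the intercept.
    """
    n_coefficients = n_predictors + 1
    return n_obs - n_coefficients


# 50 states, three predictors: 50 - (3 + 1)
print(residual_df(50, 3))  # 46
```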

With linear regression, statisticians want the degrees of freedom to be large compared to the number of predictor variables to avoid **over-fitting**. You lose one degree of freedom for every coefficient fitted, including the intercept, so if we had 49 different predictor variables for our 50 observations we would have no degrees of freedom and a horribly over-fitted model. Such a model can reproduce the data it was fitted on exactly, but it would be useless for interpretation and for prediction.
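To see why zero degrees of freedom is a problem, here is a minimal sketch using NumPy. It does not use our data set: the predictors and the response are synthetic random noise, generated purely for illustration, so there is no real relationship for the model to find. With 49 predictors plus an intercept on 50 observations, the fit should still come out essentially perfect.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

n_obs = 50          # same count as the 50 states
n_predictors = 49   # one fewer than the number of observations

# Synthetic data (an assumption for this sketch): pure noise on both sides.
X = rng.normal(size=(n_obs, n_predictors))
y = rng.normal(size=n_obs)

# Add an intercept column: 50 coefficients for 50 observations.
design = np.column_stack([np.ones(n_obs), X])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)

# R-squared computed by hand from the residuals.
residuals = y - design @ coef
ss_res = float(residuals @ residuals)
ss_tot = float(((y - y.mean()) ** 2).sum())
r_squared = 1.0 - ss_res / ss_tot

df_left = n_obs - (n_predictors + 1)
print("degrees of freedom:", df_left)
print("R-squared on the fitted data:", r_squared)
```

The model "explains" noise perfectly because it has as many coefficients as observations, which is exactly what makes it worthless on new data.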

Aside from over-fitting, degrees of freedom play a role in two results of fitting a linear regression model. The first is R-squared, which gives a sense of the proportion of the variation in the response that the model explains. Plain R-squared never decreases when you add more variables to a regression formula, even if they are useless, so it is usually reported alongside Adjusted R-squared. Adjusted R-squared increases only if the new predictor variables improve the model more than you would expect by chance. It does this by taking degrees of freedom into account.
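The usual formula for Adjusted R-squared scales the unexplained share of the variation by the ratio of total to residual degrees of freedom. The sketch below uses that standard formula. The function name and the R-squared values fed into it are hypothetical, chosen only to show the penalty at work, and are not results from our data set.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R-squared: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    total_df = n_obs - 1
    residual_df = n_obs - n_predictors - 1
    if residual_df <= 0:
        raise ValueError("no residual degrees of freedom left")
    return 1.0 - (1.0 - r_squared) * total_df / residual_df


# Hypothetical values: a fourth predictor nudges R-squared up only slightly.
three_predictors = adjusted_r_squared(0.500, n_obs=50, n_predictors=3)
four_predictors = adjusted_r_squared(0.501, n_obs=50, n_predictors=4)

print(three_predictors)
print(four_predictors)
print(four_predictors < three_predictors)  # True: the penalty outweighs the gain
```

In this made-up case the extra predictor raises plain R-squared but lowers Adjusted R-squared, because the small gain does not make up for the degree of freedom it costs.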

A second place where linear regression uses degrees of freedom is the F-test statistic, which we explain later.