The software you use to create a scatter plot of a single predictor variable against the response variable can also draw a best-fit line. The plot below shows such a line for our data, with the Fortune 500 figures running from the lowest on the left to the highest on the right.

Simplifying somewhat, such a line sits as close as possible to all of the data points taken together. More precisely, the software minimizes the total of the squared vertical distances between the data points and the best-fit line, which is why the method is called least squares.

Now, do you remember that every straight line on a graph can be stated as an equation? (The “slope” equals “rise over run,” the amount of vertical change for each unit of horizontal change.) When the software calculates the best-fit line, it works out that equation and thereby produces the so-called coefficients of the regression equation.
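To make the idea of coefficients concrete, here is a minimal sketch of how a least-squares line for one predictor could be computed by hand. The numbers are toy values invented for illustration (they are not the state-level data behind the plot), and the function name `fit_line` is hypothetical; your statistics software does the equivalent work for you.

```python
# Toy data, invented for illustration only (not the post's state-level figures).
hq = [0, 1, 2, 4, 8]                    # hypothetical Fortune 500 headquarters counts
lawyers = [3.0, 4.5, 7.0, 11.0, 20.5]   # hypothetical lawyer counts, in thousands


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line through the points."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope: how x and y vary together, divided by how much x varies on its own.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    # The least-squares line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(hq, lawyers)
print(f"intercept = {intercept:.4f}, slope = {slope:.4f}")
```

The two printed numbers are the coefficients: the slope is the "rise over run" of the fitted line, and the intercept is where that line crosses the vertical axis.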

While doing so, the software also calculates the vertical difference between each actual point and the corresponding position on the best-fit line directly above or below it. That difference is called the error (or the residual). The plot shows an example of the distance for Florida, with approximately 43,400 lawyers and 16 Fortune 500 headquarters. By the usual convention (the actual value minus the value on the line), errors for points above the line are positive and errors for points below the line are negative; all the errors added together equal zero.
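The residuals can be checked with a short sketch. It assumes the common convention that a residual is the actual value minus the value on the line, so a point above the line gets a positive residual. The data are again invented toy values and the function name `residuals` is hypothetical.

```python
# Toy data, invented for illustration only (not the post's state-level figures).
hq = [0, 1, 2, 4, 8]                    # hypothetical Fortune 500 headquarters counts
lawyers = [3.0, 4.5, 7.0, 11.0, 20.5]   # hypothetical lawyer counts, in thousands


def residuals(x, y):
    """Least-squares residuals (actual minus fitted) for a one-predictor line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
             / sum((xi - mean_x) ** 2 for xi in x))
    intercept = mean_y - slope * mean_x
    # Residual = actual value minus the value the line gives at the same x.
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


res = residuals(hq, lawyers)
for xi, r in zip(hq, res):
    side = "above" if r > 0 else "below"
    print(f"x = {xi}: residual {r:+.4f} ({side} the line)")

# Apart from floating-point rounding, the residuals cancel out.
print(f"sum of residuals = {sum(res):.10f}")
```

The cancellation holds whenever the fitted line includes an intercept, which is the usual case.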

For our regression example, the equation says that the estimated number of private practice lawyers in a state equals a coefficient called the intercept, which is where the best-fit line crosses the vertical axis, plus some number multiplied by the number of Fortune 500 headquarters in that state. (Often the intercept has no real-world meaning, because it is the estimate when all the predictor variables equal zero, which is unlikely.) The next post will flesh out that sentence.
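In symbols, the sentence describing the regression equation can be sketched as follows. The names b_0 (the intercept) and b_1 (the number multiplied by the headquarters count) are notation assumed here for illustration; the post itself does not assign symbols or give the fitted values.

```latex
\widehat{\text{lawyers}} = b_0 + b_1 \times \text{headquarters}
```

The hat over "lawyers" marks an estimate from the line rather than an actual count. Setting the headquarters count to zero leaves only b_0, which is why the intercept is the point where the line crosses the vertical axis.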