# Lack-of-fit Sum of Squares - Sketch of The Idea


In order for the lack-of-fit sum of squares to differ from the sum of squares of residuals, there must be more than one value of the response variable for at least one of the values of the set of predictor variables. For example, consider fitting a line

$$y = \alpha x + \beta$$

by the method of least squares. One takes as estimates of α and β the values that minimize the sum of squares of residuals, i.e., the sum of squares of the differences between each observed y-value and the corresponding fitted y-value. To have a lack-of-fit sum of squares that differs from the residual sum of squares, one must observe more than one y-value for each of one or more of the x-values. One then partitions the "sum of squares due to error", i.e., the sum of squares of residuals, into two components:
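As a minimal sketch of the fitting step, the following computes the least-squares estimates and the sum of squares of residuals for a small data set in which every x-value is observed twice. The data and the function names `fit_line` and `residual_sum_of_squares` are made up for illustration; the slope is called `alpha` and the intercept `beta`, on the assumption that the line is written y = αx + β.

```python
def fit_line(xs, ys):
    """Least-squares estimates (alpha, beta) for the line y = alpha * x + beta."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    alpha = s_xy / s_xx           # slope
    beta = y_bar - alpha * x_bar  # intercept
    return alpha, beta


def residual_sum_of_squares(xs, ys, alpha, beta):
    """Sum of squared differences between observed and fitted y-values."""
    return sum((y - (alpha * x + beta)) ** 2 for x, y in zip(xs, ys))


# Illustrative (made-up) data: each x-value has two observed y-values.
xs = [1, 1, 2, 2, 3, 3]
ys = [1.0, 3.0, 2.0, 4.0, 6.0, 8.0]

alpha, beta = fit_line(xs, ys)
sse = residual_sum_of_squares(xs, ys, alpha, beta)
print(alpha, beta, sse)  # 2.5 -1.0 9.0
```

Because the x-values are replicated, the residual sum of squares of 9 for these data can be split further, which would not be possible with one y-value per x-value.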

sum of squares due to error = (sum of squares due to "pure" error) + (sum of squares due to lack of fit).

The sum of squares due to "pure" error is the sum of squares of the differences between each observed y-value and the average of all y-values corresponding to the same x-value.

The sum of squares due to lack of fit is the weighted sum of squares of differences between each average of y-values corresponding to the same x-value and the corresponding fitted y-value, the weight in each case being simply the number of observed y-values for that x-value. Because it is a property of least squares regression that the vector whose components are "pure errors" and the vector of lack-of-fit components are orthogonal to each other, the following equality holds:

\begin{align}
&\sum (\text{observed value} - \text{fitted value})^2 & & \text{(error)} \\
&\qquad = \sum (\text{observed value} - \text{local average})^2 & & \text{(pure error)} \\
& {} \qquad\qquad {} + \sum \text{weight}\times (\text{local average} - \text{fitted value})^2. & & \text{(lack of fit)}
\end{align}
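The three sums can be checked numerically. The sketch below uses made-up data with two observations per x-value; the function name `lack_of_fit_decomposition` and the helper `fitted` are hypothetical, and the line y = 2.5x − 1 hard-coded in `fitted` is the least-squares line for these particular data.

```python
from collections import defaultdict


def lack_of_fit_decomposition(xs, ys, fitted):
    """Return the (error, pure error, lack of fit) sums of squares.

    `fitted` is a function mapping an x-value to its fitted y-value.
    """
    # Collect the observed y-values that share each x-value.
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x].append(y)

    error = sum((y - fitted(x)) ** 2 for x, y in zip(xs, ys))
    pure_error = 0.0
    lack_of_fit = 0.0
    for x, group in groups.items():
        local_average = sum(group) / len(group)
        pure_error += sum((y - local_average) ** 2 for y in group)
        # The weight is the number of observed y-values at this x-value.
        lack_of_fit += len(group) * (local_average - fitted(x)) ** 2
    return error, pure_error, lack_of_fit


# Illustrative (made-up) data: each x-value has two observed y-values.
xs = [1, 1, 2, 2, 3, 3]
ys = [1.0, 3.0, 2.0, 4.0, 6.0, 8.0]


def fitted(x):
    # Least-squares line for the data above: y = 2.5 * x - 1.
    return 2.5 * x - 1.0


error, pure_error, lack_of_fit = lack_of_fit_decomposition(xs, ys, fitted)
print(error, pure_error, lack_of_fit)  # 9.0 6.0 3.0
```

Here the error sum of squares, 9, equals the pure-error part, 6, plus the lack-of-fit part, 3.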

Hence the residual sum of squares has been completely decomposed into two components.
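One way to see why no cross term survives can be sketched as follows. The notation is an assumption introduced here, not taken from the text: $y_{ij}$ is the $j$-th of the $n_i$ observed y-values at the $i$-th distinct x-value, $\bar{y}_i$ is their local average, and $\hat{y}_i$ is the fitted value at that x-value.

\begin{align}
\sum_i \sum_{j=1}^{n_i} (y_{ij} - \hat{y}_i)^2
&= \sum_i \sum_{j=1}^{n_i} \bigl[ (y_{ij} - \bar{y}_i) + (\bar{y}_i - \hat{y}_i) \bigr]^2 \\
&= \sum_i \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2
 + \sum_i n_i (\bar{y}_i - \hat{y}_i)^2
 + 2 \sum_i (\bar{y}_i - \hat{y}_i) \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i).
\end{align}

The inner sum in the last term is zero for every $i$, because deviations from a local average sum to zero. What remains is the pure-error sum plus the weighted lack-of-fit sum, with weights $n_i$. This step relies only on all observations at the same x-value sharing one fitted value.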