Linear Regression Diagnostics
This paper attempts to provide the user of linear multiple regression with a battery of diagnostic tools to determine which, if any, data points have high leverage or influence on the estimation process and how these possibly discrepant data points differ from the patterns set by the majority of the data. The point of view taken is that when diagnostics indicate the presence of anomalous data, the choice is open as to whether these data are in fact unusual and helpful, or possibly harmful and thus in need of modification or deletion. The methodology developed depends on differences, derivatives, and decompositions of basic regression statistics. There is also a discussion of how these techniques can be used with robust and ridge estimators. An example is given showing the use of diagnostic methods in the estimation of a cross-country savings rate model.
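As a rough illustration of what "leverage" and "differences" of regression statistics can mean in practice, the following sketch computes the diagonal of the hat matrix and the change in the least-squares coefficients when each row is deleted, using the standard closed-form row-deletion identity. The function name, the synthetic data, and the choice of NumPy are assumptions made for illustration only; this is not the paper's own code or notation.

```python
import numpy as np

def leverage_and_deletion_influence(X, y):
    """Hypothetical helper: hat-matrix diagonals and row-deletion coefficient changes."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y          # ordinary least-squares coefficients
    resid = y - X @ beta              # ordinary residuals e_i
    # Leverage: h_i = x_i^T (X^T X)^{-1} x_i, the i-th diagonal of the hat matrix.
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    # Row-deletion identity: beta - beta_(i) = (X^T X)^{-1} x_i e_i / (1 - h_i).
    dbeta = (X @ XtX_inv) * (resid / (1.0 - h))[:, None]
    return h, dbeta

# Synthetic data (made up for this sketch): one point placed far out in x.
rng = np.random.default_rng(0)
x = rng.normal(size=20)
x[0] = 10.0
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=20)

h, dbeta = leverage_and_deletion_influence(X, y)
print("highest-leverage row:", int(h.argmax()))
print("coefficient change if that row is deleted:", dbeta[h.argmax()])
```

Rows with large h are high-leverage points, and rows with large entries in dbeta are those whose deletion would move the fitted coefficients most. Whether such a row is helpful or harmful is left to the analyst's judgment.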
The authors would like to acknowledge helpful conversations with David Hoaglin, Frank Hampel, Richard Hill, David Andrews, Jim Franc, Colin Mallows, Doug Martin, and Fred Schweppe.