AR1


AR1 obtains estimates of a regression equation whose errors are serially correlated. These estimates are efficient if the disturbances in the equation follow an autoregressive process of order one. The estimates may be obtained using one of two objective functions: exact maximum likelihood, which imposes stationarity by constraining the serial correlation coefficient to lie between -1 and 1 and keeps the first observation for estimation, or GLS, which drops the first observation.
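For orientation, the model behind this description can be written out as follows. The notation (x_t for the row of right hand side variables, u_t for the disturbance, epsilon_t for the innovation) is chosen here for illustration and is not taken from the TSP documentation; the second line is the usual rho-differenced form that the GLS objective works with.

```latex
\begin{aligned}
y_t &= x_t'\beta + u_t, \qquad u_t = \rho\,u_{t-1} + \varepsilon_t, \qquad \varepsilon_t \sim \text{i.i.d.}(0,\sigma^2), \\
y_t - \rho\,y_{t-1} &= (x_t - \rho\,x_{t-1})'\beta + \varepsilon_t, \qquad t = 2,\dots,T .
\end{aligned}
```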

AR1 (FAIR, INST=(list of instrumental variables), METHOD=CORC or HILU or ML or MLGRID,

         OBJFN= EXACTML or GLS, REI, RMIN=<minimum rho value>,

         RMAX=<maximum rho value>, RSTART=<start value for rho>,

         RSTEP=<step value for rho>, TSCS, nonlinear options)

        <dependent variable name> <list of independent variables> ;

To obtain estimates of a regression equation which are corrected for first order serial correlation, use the AR1 command as you would an OLSQ command. PDL (polynomial distributed lag) variables may be included in an AR1 statement. See the PDL section for a further description of how to specify these variables. TSP automatically deletes observations with missing values for one or more variables before estimation.

AR1 can also obtain estimates for a panel data model with fixed (TSCS) or random (REI) effects using exact ML.

Output

The AR1 procedure produces a large amount of printed output. The equation title and the chosen objective function are printed first. If the PRINT option is on, this is followed by the list of option values, the initial estimates for all the coefficients, iteration output for all coefficients, and any grid values for rho and the objective function.

The usual regression output follows, as described under the OLSQ command. The regression statistics are computed from the fitted values and residuals described below. If the objective function is GLS, a common factor test is printed. This test is a likelihood ratio test of the restrictions implied by AR(1) over the OLS model with lagged dependent and right hand side variables included. [The test is not well-defined when the model is estimated by ML due to the special treatment of the first observation].
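The restriction being tested can be sketched in the following way, using illustrative notation that is not taken from the TSP output. The AR(1) model is a restricted version of a regression on the lagged dependent variable and the current and lagged right hand side variables, and the likelihood ratio statistic compares the two fits. The exact form of the statistic and its degrees of freedom as computed by TSP are not stated here, so the last line is the generic textbook form only.

```latex
\begin{aligned}
\text{unrestricted:}\quad & y_t = \rho\,y_{t-1} + x_t'\beta + x_{t-1}'\gamma + \varepsilon_t \\
\text{AR(1) (restricted):}\quad & y_t = \rho\,y_{t-1} + x_t'\beta - \rho\,x_{t-1}'\beta + \varepsilon_t \\
H_0:\quad & \gamma = -\rho\,\beta \\
LR &= 2\,(\log L_U - \log L_R)
\end{aligned}
```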

As in OLSQ and INST, a table of coefficient estimates is printed. RHO is always the last coefficient in the table; its inclusion guarantees that the standard errors are always consistent, even if there are lagged dependent variables on the right hand side. The fitted values (@FIT) and residuals (@RES) are computed as follows:

AR1 also stores this regression output in data storage for later use. The table below lists the results available after an AR1 command. Note: the number of coefficients (#vars) always includes RHO.

variable   type     length        description
@RNMS      list     #vars         Names of right hand side variables
@LHV       list     1             Name of the dependent variable
@RHO       scalar   1             Serial correlation parameter at convergence
@SSR       scalar   1             Sum of squared residuals
@S         scalar   1             Standard error of regression
@YMEAN     scalar   1             Mean of the transformed dependent variable
@SDEV      scalar   1             Standard deviation of the dependent variable
@NOB       scalar   1             Number of observations
@DW        scalar   1             Durbin-Watson statistic
@RSQ       scalar   1             R-squared
@ARSQ      scalar   1             Adjusted R-squared
@IFCONV    scalar   1             =1 if convergence achieved, =0 otherwise
@LOGL      scalar   1             Log of likelihood function
@COMFAC    scalar   1             Common factor test (if OBJFN=GLS)
@COEF      vector   #vars         Coefficient estimates
@SES       vector   #vars         Standard errors
@T         vector   #vars         t-statistics
@VCOV      matrix   #vars*#vars   Variance-covariance of estimated coefficients
@RES       series   #obs          Fitted residuals from model
@FIT       series   #obs          Fitted values of dependent variable

If the regression includes PDL variables, @SLAG, @MLAG, and @LAGF will also be stored (see OLSQ for details).

Method

AR1 uses an initial grid search to locate possible multiple local optima (when OBJFN=GLS), and then iterates efficiently to a global optimum with second derivatives. The likelihood function and treatment of the initial observation are described completely in Davidson and MacKinnon (1993).

AR1 (REI) is similar to PANEL (REI), but with an added AR(1) component. The estimator follows Baltagi and Li (1991). It uses analytic second derivatives to obtain quadratic convergence and accurate t-statistics for all parameters (including RHO and RHO_I, the intraclass correlation coefficient, which can be negative).  

Options

FAIR/NOFAIR specifies whether the lagged dependent and independent variables are to be added to the instrument list automatically when doing instrumental variable estimation combined with a serial correlation correction.

INST= list of instrumental variables. This list should include any exogenous variables that are in the equation such as the constant or time trend, as well as any other variables you wish to use as instruments. After any instruments are added by the FAIR option, there must be at least as many instruments as the number of estimated coefficients (the number of independent variables in the equation, plus one for rho). OBJFN= GLS is implied; the actual objective function is E'PZ*E, where the Es are rho-transformed residuals. See the Examples for a way to reproduce the AR1 estimates with FORM and LSQ.

Fair once argued that the lagged dependent and independent variables must be in the instrument list to obtain consistent estimates when doing instrumental variable estimation with a serial correlation correction. TSP adds them automatically if you use the FAIR option (the default); if you want to specify a different list of instruments, you must suppress this feature with a NOFAIR option.

Fair retracted his claim in 1984; it has since been disproved by Buse (1989), but the alternative instruments for consistency involve pseudo-differencing with the estimated rho (Theil's G2SLS), which is tedious to perform by hand. Buse also showed that the asymptotically most efficient estimator in this case (S2SLS) includes the lagged excluded exogenous variables as well, but he cautions that in small samples this may quickly exhaust the degrees of freedom.

METHOD=ML or MLGRID or CORC or HILU was formerly used to specify the estimation algorithm. This is now specified by the OBJFN= option. METHOD=ML or MLGRID implies OBJFN=EXACTML, while METHOD=CORC or HILU implies OBJFN=GLS. METHOD=ML formerly used the Beach and MacKinnon algorithm, while METHOD=CORC used the Cochrane-Orcutt algorithm. Iterations are now done using the Newton-Raphson algorithm (HITER=N in the nonlinear options), which is quadratically convergent (about the same speed as Beach-MacKinnon, but much faster and more accurate than Cochrane-Orcutt). METHOD=HILU refers to Hildreth-Lu, a simple grid search method.

OBJFN=EXACTML or GLS specifies the objective function. EXACTML retains the first observation and includes the Jacobian term log(1-rho**2), which guarantees stationarity. GLS drops the first observation and does not impose stationarity. It is the same as nonlinear least squares on a rho-differenced equation, and can also be described as "conditional ML" (conditional on the initial residual).
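The difference between the two objective functions can be illustrated with a minimal Python sketch. This is not TSP's implementation: the function names gls_ssr and exact_loglik are hypothetical, the residuals u are assumed to be the untransformed residuals (dependent variable minus the fitted regression), and the likelihood is the standard textbook Gaussian AR(1) form, in which the Jacobian term enters as 0.5*log(1-rho**2).

```python
import math

def gls_ssr(rho, u):
    """GLS objective: sum of squared rho-differenced residuals.

    u holds the untransformed residuals; the first observation is lost
    because u[0] has no predecessor to difference against.
    """
    return sum((u[t] - rho * u[t - 1]) ** 2 for t in range(1, len(u)))

def exact_loglik(rho, u, sigma2):
    """Exact Gaussian AR(1) log likelihood (textbook form, an assumption here).

    The first observation is kept, scaled by (1 - rho**2), and the
    Jacobian term 0.5*log(1 - rho**2) rules out |rho| >= 1.
    """
    if abs(rho) >= 1.0:
        return -math.inf  # the log term is undefined outside (-1, 1)
    n = len(u)
    ssr = (1.0 - rho ** 2) * u[0] ** 2 + gls_ssr(rho, u)
    return (-0.5 * n * math.log(2.0 * math.pi * sigma2)
            + 0.5 * math.log(1.0 - rho ** 2)
            - ssr / (2.0 * sigma2))

# Made-up residuals, for illustration only.
u = [-1.0, 0.0, 1.0, 3.0]
print(gls_ssr(0.5, u))
print(exact_loglik(0.5, u, 1.0))
print(exact_loglik(1.0, u, 1.0))  # non-stationary rho is rejected
```

Because gls_ssr has no penalty near rho = 1, nothing stops it from being evaluated (and minimized) at values of rho above 1, which is why the GLS grid described under RSTEP extends slightly past 1.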

EXACTML is the usual default, but if there is a lagged dependent variable on the right-hand side, GLS becomes the default, because EXACTML has a small-sample bias in this case.

GLS uses an initial grid search to locate starting values and potential multiple local optima. It is well known that multiple local optima can occur for GLS, especially when there are lagged dependent variables. Multiple optima are noted in the output if they are detected. AR1 then iterates efficiently to locate an accurate global optimum. EXACTML normally skips the grid search, because no cases of multiple local optima are known when the Jacobian is included. METHOD=MLGRID will turn on this grid search.
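As a rough illustration of how a grid search can reveal multiple local optima, the Python sketch below scans objective values over a rho grid and reports every interior point that is lower than both of its neighbours. The rule TSP actually uses to flag multiple optima is not documented here, so local_minima and best_start are hypothetical helpers and the objective values are made up.

```python
def local_minima(grid, values):
    """Return the grid points whose objective value is below both neighbours."""
    hits = []
    for i in range(1, len(grid) - 1):
        if values[i] < values[i - 1] and values[i] < values[i + 1]:
            hits.append(grid[i])
    return hits

def best_start(grid, values):
    """Return the grid point with the smallest objective value."""
    best_index = min(range(len(grid)), key=lambda i: values[i])
    return grid[best_index]

# Made-up sum-of-squares values with two dips, for illustration only.
grid = [-0.5, 0.0, 0.5, 0.9, 1.05]
ssr = [9.0, 4.0, 6.0, 3.0, 5.0]

print(local_minima(grid, ssr))  # candidate local optima
print(best_start(grid, ssr))    # starting value for the iterations
```

In this sketch the iterations would start from the lowest grid point, while the other dip is only reported, which mirrors the idea that multiple optima are noted in the output.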

REI/NOREI specifies that an AR(1) model with panel random effects is to be estimated by means of maximum likelihood.

RMIN= specifies the minimum value of the serial correlation parameter rho for the initial grid search (when OBJFN=GLS or METHOD=MLGRID is used). The default value is -0.9.

RMAX= specifies the maximum value of rho for the grid search methods. The default value is 1.05 for OBJFN=GLS, or .95 for METHOD=MLGRID.

RSTEP= specifies the increment to be used in the grid search over rho. The default value is 0.1, until rho=.8. Then the values .85, .9, .95 are used, plus .9999, 1.0001, and 1.05 when OBJFN=GLS. These last 3 values help to detect optima with rho > 1, which are usually not reached during iterations when rho starts below 1.
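The default grid described under RMIN, RMAX, and RSTEP can be spelled out with a short Python sketch. The function default_rho_grid is hypothetical and covers only the defaults as stated above; how TSP combines user-supplied RMIN, RMAX, and RSTEP values is not modeled.

```python
def default_rho_grid(objfn="GLS"):
    """List the default rho grid: -0.9 to 0.8 in steps of 0.1, then finer values.

    Integer hundredths are used so that the grid points are not affected
    by accumulated floating-point error.
    """
    grid = [h / 100 for h in range(-90, 81, 10)]  # -0.9, -0.8, ..., 0.8
    grid += [0.85, 0.9, 0.95]                     # finer steps near rho = 1
    if objfn == "GLS":
        grid += [0.9999, 1.0001, 1.05]            # probe for optima with rho > 1
    return grid

gls_grid = default_rho_grid("GLS")
ml_grid = default_rho_grid("EXACTML")  # as turned on by METHOD=MLGRID
print(len(gls_grid), gls_grid[0], gls_grid[-1])
print(len(ml_grid), ml_grid[0], ml_grid[-1])
```

The last values match the default RMAX of 1.05 for OBJFN=GLS and .95 for METHOD=MLGRID.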

RSTART= specifies a starting value of rho for the iterative methods. Ordinarily zero is used for OBJFN=EXACTML, but faster convergence may be achieved if a value closer to the true answer is chosen. RSTART can also be used to override the default grid search for OBJFN=GLS, but multiple local optima would not be detected.

TSCS/NOTSCS specifies EXACTML estimation for time series-cross section data when the FREQ (PANEL) command is in effect (then TSCS is the default) or when SMPL gaps have been set up to separate the cross section units (see the example below). OBJFN=GLS is not implemented for panel data.

(Obsolete) WEIGHT= is a former AR1 option which is no longer supported. The ML or LSQ commands should be used instead to implement a weight.

Nonlinear options are described under NONLINEAR in this manual. HITER=N/HCOV=N (second derivatives, the default) and G (first derivatives) are both available. MAXIT=0 can be used to avoid iterations and to perform a simple grid search without the additional accuracy of iterations. Also, AR1 uses a special default TOL=1E-6 (.000001).

Examples

This example estimates the consumption function for the illustrative model with a serial correlation correction, first using the maximum likelihood method, and then searching over rho to verify that the likelihood is unimodal in the relevant range.

AR1 (PRINT) CONS C GNP ;

AR1 (METHOD=MLGRID, RSTEP=0.05) CONS C GNP ;

The next three estimations are exactly equivalent and demonstrate the FAIR option with instrumental variables:

SMPL 11,50;

AR1 (INST=(C,G,TIME,LM)) CONS C GNP ;

AR1 (NOFAIR,INST=(C,G,TIME,LM,GNP(-1),CONS(-1))) CONS C GNP;

FORM(NAR=1,PARAM,VARPREF=B) EQAR1 CONS C GNP;

? Drop first observation, to compare with AR1(OBJFN=GLS) results.

SMPL 12,50;

LSQ(INST=(C,G,TIME,LM,GNP(-1),CONS(-1))) EQAR1;

Lagged dependent variable (default OBJFN=GLS, since EXACTML has a small sample bias):

AR1 CONS C GNP CONS(-1);

Time series-cross section with 10 years of data and 3 cross section units, and fixed effects:

SMPL 1,30;

FREQ (PANEL,T=10);

TREND OBS;

FIRM = 1 + INT((OBS-1)/10);

? Create FIRM1-FIRM3 dummy variables, stored in FIRMS list

DUMMY FIRM FIRMS;

AR1 SALES FIRMS ADV POP GNP ;

References

Baltagi, B. and Li, Journal of Econometrics, 1991.

Beach, Charles M. and MacKinnon, James G., "A Maximum Likelihood Procedure for Regression with Autocorrelated Errors," Econometrica 46, 1978, pp. 51-58.

Buse, A., "Efficient Estimation of a Structural Equation with First Order Autocorrelation," Journal of Quantitative Economics 5, January 1989, pp. 59-72.

Cochrane, D. and Orcutt, G. H., "Application of Least Squares Regression to Relationships Containing Autocorrelated Error Terms," JASA 44, 1949, pp. 32-61.

Cooper, J. Phillip, "Asymptotic Covariance Matrix of Procedures for Linear Regression in the Presence of First Order Autoregressive Disturbances," Econometrica 40, 1972, pp. 305-310.

Davidson, Russell, and MacKinnon, James G., Estimation and Inference in Econometrics, Oxford University Press, New York, NY, 1993, Chapter 10. (This is the best single reference)

Dufour, J-M, Gaudry, M. J. I., and Liem, T. C., "The Cochrane-Orcutt Procedure: Numerical Examples of Multiple Admissible Minima," Economics Letters 6, 1980, pp. 43-48.

Fair, Ray C., "The Estimation of Simultaneous Equation Models with Lagged Endogenous Variables and First Order Serially Correlated Errors," Econometrica 38, 1970, pp. 507-516.

Fair, Ray C., Specification, Estimation and Analysis of Macroeconomic Models, Harvard University Press, Cambridge, MA, 1984.

Hildreth, C. and Lu, J. Y., "Demand Relations with Autocorrelated Disturbances," Research Bulletin 276, Michigan State University Agricultural Experiment Station, 1960.

Judge et al, The Theory and Practice of Econometrics, John Wiley & Sons, New York, 1981, Chapter 5.

Maddala, G. S., Econometrics, McGraw Hill Book Company, New York, 1977, pp. 274-291.

Pindyck, Robert S., and Rubinfeld, Daniel L., Econometric Models and Economic Forecasts, McGraw Hill Book Company, New York, 1976, pp. 106-120.

Prais, S. J. and Winsten, C. B., "Trend Estimators and Serial Correlation," Cowles Commission Discussion Paper No. 373, Chicago, 1954.

Rao, P. and Griliches, Z., "Small Sample Properties of Several Two-Stage Regression Methods in the Context of Auto-Correlated Errors," JASA 64, 1969, pp. 253-27