COINT


COINT performs unit root and cointegration tests. These may be useful for choosing between trend-stationary and difference-stationary specifications for variables in time series regressions. See Davidson and MacKinnon (1993) for an introduction and comprehensive exposition of these concepts. Most of these tests can be done with OLSQ and CDF commands on a few simple lagged and differenced variables, so the main function of COINT is to summarize the key regression results concisely and to automate the selection of the optimal number of lags.

COINT (ALL, ALLORD, COINT, CONST, DF, EG, FINITE, JOH,

              MAXLAG=<number of lags>,MINLAG=<number of lags>, PP, PRINT, RULE=AIC2,

             SEAS, SEAST, SEASTSQ, SILENT, TERSE, TREND, TSQ, UNIT, WS)

              <list of variables> [ | <list of special exogenous trend variables> ] ;

or

UNIT (ALL, NOCOINT, CONST, DF, FINITE,

            MAXLAG=<number of lags>, MINLAG=<number of lags>, PP, PRINT, RULE=AIC2,

           SEAS, SEAST, SEASTSQ, SILENT, TERSE, TREND, TSQ, UNIT, WS)

            <list of variables> [ | <list of special exogenous trend variables> ] ;

Usage

List the variables to be tested, and specify the types of tests, maximum number of augmenting lags, and standard constant/trend variables in the options list. The default performs augmented Weighted Symmetric, Dickey-Fuller, and Engle-Granger tests with 0 to 10 lags. If there are any special exogenous trend variables, such as split sample dummies or trends, give their names after a | (see the explanation under General Options below). The observations over which the test regressions are computed are determined by the current sample. If any observations have missing values within the current sample, COINT drops the missing observations and prints a warning message (or an error message, if a discontinuous sample would result).
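As a minimal sketch of a typical call (assuming the LRGNP series used in the Examples section is already loaded), the following requests only Dickey-Fuller tests, with at most 4 augmenting lags and a constant but no trend:

```tsp
FREQ A; SMPL 1909,1970;
? Dickey-Fuller tests only (WS switched off), 0 to 4 augmenting lags,
? constant included, time trend dropped.
UNIT (DF,NOWS,MAXLAG=4,NOTREND) LRGNP;
```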

Output

The default output prints a table of results plus coefficients for each test on each variable. Two summary tables are also printed with just the results for the optimal lag lengths (one for all the unit root tests, and one for the Engle-Granger tests if ALLORD is used).

For each variable, all the specified types of unit root tests are performed. A table is printed for each type of test. Usually, the rows of this table are: the estimated root (alpha), test statistic, P-value, coefficients of trend variables, number of observations, the log likelihood, AIC, and the standard error squared. The columns of this table are the number of augmenting lags. A summary table is also printed which includes just the test statistics and P-values for the optimal lag length.

COINT usually stores most of these results in data storage for later use (except when a 3-dimensional matrix would be required). The summary tables are always stored. If more than one variable is being tested, @TABWS, @TABDF, and @TABPP are not stored. If ALLORD is used, @TABEG is not stored (but @EG, %EG, and @EGLAG will be stored). In the table of output results below,

#regs  = MAXLAG-MINLAG+2   (if MINLAG < MAXLAG)
#regs  = 1                 (if MINLAG = MAXLAG)
#stats = 3 + 2*(#trend_vars + #regs (if PRINT is on)) + 4
           + 1 (for the PP long-run variance estimate)
           + #vars*3 + 3 (for Johansen tests)
#types = number of types of unit root tests performed (2 for default, 3 for ALL, etc.)
#eg    = number of different cointegrating regressions for Engle-Granger type tests
         (#vars for ALLORD, or 1 by default).
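As a worked example of the first formula, with the default lag range of 0 to 10 (MINLAG=0, MAXLAG=10) the main table has eleven lag columns plus the final optimal-lag column:

```latex
\[
\#\text{regs} = \text{MAXLAG} - \text{MINLAG} + 2 = 10 - 0 + 2 = 12
\]
```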

Here are the results generally available after a COINT command:

Name      Type    Length             Variable Description
@TABWS    matrix  #stats*#regs       table for augmented WS tau tests on a single variable.
@TABDF    matrix  #stats*#regs       augmented Dickey-Fuller tau tests.
@TABPP    matrix  #stats*#regs       Phillips-Perron Z tests.
@UNIT     matrix  #types*#vars       summary table of unit root test statistics for optimal lags.
%UNIT     matrix  #types*#vars       P-values for optimal lags.
@UNITLAG  matrix  #types*#vars       Optimal lag lengths chosen by RULE.
@TABEG    matrix  #types*#vars       table for augmented Engle-Granger tests.
@CIVEG    matrix  #types*#vars       cointegrating vector (normalized)
%EG       vector  #eg                P-values for optimal lags.
@EGLAG    vector  #eg                Optimal lag lengths chosen by RULE.
@TABJOH   matrix  #stats*#regs       table for Johansen tests
@CIVJOH   matrix  #vars*#vars*#regs  cointegrating vectors (eigenvectors)

Method

Unit root tests are based on the following regression equation:

Let L=1 for illustration:
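The regression can be written out in a form consistent with the Dickey-Fuller commands given later in this section. This is a sketch reconstructed from those commands: $CT_t$ stands for the constant/trend variables, and the symbols $\beta$, $\gamma_j$ and $e_t$ are assumed names.

```latex
\begin{align*}
\text{general } L:\quad
  y_t &= \alpha\, y_{t-1} + CT_t\,\beta
         + \sum_{j=1}^{L} \gamma_j\, \Delta y_{t-j} + e_t \\
L = 1:\quad
  y_t &= \alpha\, y_{t-1} + CT_t\,\beta
         + \gamma_1\,(y_{t-1} - y_{t-2}) + e_t
\end{align*}
```

With L=1 the right-hand side involves y(t-1) and y(t-2), so y follows an AR(2) process, in line with the total AR order p=L+1 noted under MINLAG. The tau statistic in the Dickey-Fuller commands is (alpha - 1) divided by the standard error of alpha.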

All unit root tests are computed from (possibly weighted) OLS regressions on a few lagged or differenced variables. The coefficient of y(t-1) is printed in the tables as alpha. Accurate asymptotic P-values for Dickey-Fuller, Phillips-Perron, and Engle-Granger (for up to 6 cointegrating variables) are computed using the coefficients in MacKinnon (1994). Note that these asymptotic distributions are used as approximations to the true finite-sample distributions.

The WS test is a weighted double-length regression. First the variable being tested is regressed on the constant/trend variables (using the full current sample), and the residual from this is used as the dependent variable Y in the double-length regression. The data setup for the first half of this regression is the same as an augmented Dickey-Fuller test -- regress Y on lagged Y and lags of DY. The weights are (t-1)/T, where T is @NOB in the original sample. In the second half, Y is regressed on Y(+1) and leads of Y-Y(+1), using weights (1-(t-1)/T). See Pantula et al (1994) for more details. P-values for the WS test are computed very roughly by interpolating between the asymptotic 5% and 10% level critical values given for the constant and no trend case in the reference. These P-values are adequate for testing at the 5% and 10% levels, but they are not accurate for testing at other levels. The P-value for the case with a constant and a trend is only reliable for testing at the 5% level.
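The weighting scheme described above can be summarized as follows, where $w_t$ is an assumed name for the weight on observation $t$ (the text gives only the expressions themselves):

```latex
\[
w_t =
\begin{cases}
\dfrac{t-1}{T}
  & \text{first half: } Y_t \text{ on } Y_{t-1} \text{ and lags of } \Delta Y_t \\[1.5ex]
1 - \dfrac{t-1}{T}
  & \text{second half: } Y_t \text{ on } Y_{t+1} \text{ and leads of } Y_t - Y_{t+1}
\end{cases}
\]
```

For any given t the two weights sum to one, so early observations count mostly in the second (forward) half and late observations mostly in the first (backward) half.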

The regressions for the Dickey-Fuller tests are quite simple. The commands below reproduce, for 1 to 10 augmenting lags, the Dickey-Fuller tests on LRGNP from the first example in the Examples section.

SMPL 10,70; DY = LRGNP-LRGNP(-1);

? Sample for comparing AIC is the same for all lags.

? MAXLAG+1 observations are dropped.

SMPL 20,70;

TREND T;

DO LAG=1,10;

  SET MLAG = -LAG;

  OLSQ LRGNP LRGNP(-1) C T DY(-1)-DY(MLAG);

  SET alpha = @COEF(1);

  SET tauDF = (alpha - 1)/@SES(1);

  CDF(DICKEYF) tauDF;

ENDDO;

If you are computing this test by hand, it is easier to use:

OLSQ DY LRGNP(-1) C T DY(-1)-DY(MLAG) ;

CDF (DICKEYF) @T(1) ;

The Phillips-Perron test is done with the same Dickey-Fuller regression variables, using no augmenting lags. This test is given in Davidson and MacKinnon, equations (20.17) and (20.18) (see also the warnings there about the possibly poor finite-sample behavior of this test). These tests can be computed for 1 to 10 "lags" by using the following TSP commands (following the Dickey-Fuller example above):

OLSQ (silent) LRGNP C T ;

SET ssr =@SSR ;

SMPL 10,70; ? note that only one observation is dropped, regardless of MAXLAG

TREND T;

OLSQ LRGNP LRGNP(-1) C T; Y = @RES;

SET alpha = @COEF(1); SET s2 = @S2; SET n = @NOB;

FRML EQPP Y = Y0; PARAM Y0;

DO LAG=1,10;

  GMM(INST=C,NMA=LAG,SILENT) EQPP;

  SET w2 = @COVOC;

  SET z = n*(alpha-1) - [n**2*(w2 - s2)]/[2*ssr];

  PRINT LAG,z,w2;

ENDDO;

The regressions for the Engle-Granger tests are just an extension of the Dickey-Fuller test, after an initial cointegrating regression:

TREND T;

OLSQ LRGNP LEMPLOY C T; ? cointegrating regression

E = @RES;

SMPL 10,70; DE = E-E(-1);

SMPL 20,70; ? Estimation sample is the same for all lags -- MAXLAG+1 observations are dropped.

DO LAG=1,10;

  SET MLAG = -LAG;

  OLSQ E E(-1) DE(-1)-DE(MLAG);

  SET alpha = @COEF(1);

  SET tauDF = (alpha - 1)/@SES(1); CDF(DICKEYF,NVAR=2) tauDF;

ENDDO;

The following equation defines the L+1 order VAR (Vector Auto Regression) that is used in the Johansen trace test:

where Y(t) is 1 by G and  CT(t) are (seasonal) constants and trends.
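A VAR of order L+1 matching this description can be sketched as below. The coefficient names $\Pi_j$ (each G by G) and $B$, and the error term $e_t$, are assumed symbols; because $Y_t$ is a 1 by G row vector, the coefficient matrices multiply from the right.

```latex
\[
Y_t = \sum_{j=1}^{L+1} Y_{t-j}\,\Pi_j + CT_t\,B + e_t
\]
```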

The Log Likelihood and AIC printed in the table are from the unrestricted version of this VAR. The restricted version is estimated with a 2G-equation VAR:

where T = number of observations in the current sample, L = MAXLAG = order of VAR beyond 1. The trace tests are labelled H0: r=0, H0: r<=1, etc. in the table of results. Note that the trace test includes a finite-sample correction (mentioned in Gregory (1994); originally given in Bartlett (1941)). These trace tests often have size distortions (the null of no cointegration or fewer cointegrating vectors is rejected too often when it is actually true). P-values are interpolated from the Osterwald-Lenum (1992) tables 0, 1.1*, and 2 (with no constant, constant, or constant & trend). These P-values are adequate for testing at the sizes given in the Osterwald-Lenum tables (.50, .20, .10, .05, .025, and .01). See Cushman et al (1996) for a detailed example of using Johansen tests in an applied setting. They illustrate the importance of the finite sample degrees of freedom correction, the size distortions of the P-values, lag length choice methods, and hypothesis testing.

Options

Unit Root Test Options:

ALL/NOALL perform all available types of unit root tests (WS, DF, and PP).

DF/NODF perform (augmented) Dickey-Fuller (tau) tests.

PP/NOPP perform the Phillips-Perron variation of the Dickey-Fuller (z) test. For the PP test, the number of lags used is the order of the autocorrelation-robust "long run variance" estimate (see the MAXLAG option).

WS/NOWS perform (augmented) Weighted Symmetric (tau) tests. This test seems to dominate the Dickey-Fuller test (and others) in terms of power, so it is performed by default. See Pantula et al (1994) or the Method section for details.

UNIT/NOUNIT use NOUNIT to skip all unit root tests (if you are only interested in cointegration tests, and you are sure which individual variables have unit roots).

Cointegration Test Options: (these apply only if you have more than one variable)

ALLORD/NOALLORD repeat the Engle-Granger tests, using each variable in turn on the left hand side of the cointegrating regression.

EG/NOEG perform (augmented) Engle-Granger tests (Dickey-Fuller test on residuals from the cointegrating regression). The Engle-Granger test is only valid if all the cointegrating variables are I(1); hence the default option to perform unit root tests on the individual series to confirm this before running the Engle-Granger test. Note that if you accept I(1) (i.e. reject I(0)), you will also want to difference the series and repeat the unit root test, to make sure you reject I(2) in favor of I(1). Note that you need to reduce the order of trends when testing such a differenced series -- for example, if the original series had a constant and trend in the equation, the differenced one will only have a constant.

JOH/NOJOH perform Johansen (trace) cointegration tests.

COINT/NOCOINT use NOCOINT to skip all cointegration tests (if you are only interested in unit root tests). You may prefer to use the UNIT or UNIT (NOCOINT) command for this. (UNIT and COINT are synonyms for the same command, except that UNIT has a default of NOCOINT; UNIT may also seem more appropriate if you are just testing one variable).
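The check for I(2) described under EG above can be sketched as follows, assuming the LRGNP series from the Examples section; DLRGNP is a hypothetical name for the differenced series:

```tsp
FREQ A; SMPL 1909,1970;
? Levels: constant and trend are included.
UNIT (TREND) LRGNP;
? First difference; one observation is lost at the start of the sample.
SMPL 1910,1970;
DLRGNP = LRGNP - LRGNP(-1);
? Differencing reduces the order of the trend, so only a constant is kept.
UNIT (NOTREND) DLRGNP;
```

Rejecting a unit root in DLRGNP while failing to reject it in LRGNP is the pattern expected for an I(1) series.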

General Options: (these apply to both unit root and cointegration tests)

CONST/NOCONST include a constant term in the tests. NOCONST implies NOTREND.

FINITE/NOFINITE computes finite sample (vs. asymptotic) P-values when possible (augmented Dickey-Fuller and Engle-Granger tests). See the discussion and references under Method in the CDF entry of this manual. These are distinguished by different labels: P-valFin or P-valAsy ; normally the finite sample P-values will be slightly larger than the asymptotic ones.

MINLAG= smallest number of augmenting lags (default 0). This is denoted as L in the equations under Method. Note that p=L+1 is the total AR order of the process generating y. So L is the number of lags in excess of the first one. For the Phillips-Perron test, L is the number of lags in the "autocorrelation-robust" covariance matrix.

MAXLAG= maximum number of augmenting lags. The default is min(10,2*@NOB**(1/3)), which is 10 for 100 observations or below (the factor 2 was chosen arbitrarily to ensure this). If the number of observations in the current sample (@NOB) is extremely small, MAXLAG and MINLAG will be reduced automatically.

RULE=AIC2 specifies the rule used to choose an optimal lag length (number of augmenting lags), assuming MINLAG < MAXLAG. The default is AIC2, which is described in Pantula et al (1994). If j is the number of lags which minimizes AIC (Akaike Information Criterion), then L = MIN(j+2,MAXLAG) is used. Note that if j = MAXLAG, you will probably want to increase MAXLAG. AIC2 apparently avoids size distortions for the WS and DF tests. AIC2 is also used here for EG tests. No direct rule is used for PP tests yet. Instead, the optimal lag from the DF test is also used for PP (if the DF test is performed at the same time). A plain AIC rule is used for JOH, i.e. L = j (this is not a very good rule for JOH; you may prefer to run the unconstrained VAR and test its residuals for serial correlation). These rules are a topic of current research, so as more useful rules are found, they will be added as options. For example, other possible rules are: (1) testing for remaining serial correlation in the residuals, (2) testing the significance of F-statistics for the last lag of (differenced) lagged variable(s), (3) SBIC (+2?), (4) automatic bandwidth selection for PP (not very encouraging in the current literature).

The current RULE=AIC2 uses a fixed number of observations for comparing regressions with different numbers of lags. Each regression is a column in the output table. If MINLAG<MAXLAG, then the RULE is used to select an "optimal" number of lags (j). A final column in the table is created for this, labelled "Opt:j". If j is less than MAXLAG, then the regression for this column is computed with the maximum available observations, so the test results may vary slightly from the original column for j.
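Because the rule only applies when MINLAG < MAXLAG, setting the two options equal fixes the lag length directly. A minimal sketch, assuming the LRGNP series from the Examples section and an arbitrarily chosen lag length of 4:

```tsp
FREQ A; SMPL 1909,1970;
? MINLAG=MAXLAG gives a single regression with exactly 4 augmenting lags,
? so no lag selection takes place.
UNIT (MINLAG=4,MAXLAG=4) LRGNP;
```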

SEAS/NOSEAS include seasonal dummy variables, such as Q1-Q3. This option implies the CONST option. The SEAS option is available for FREQ Q, 2, or higher. The seasonal coefficients are only printed if PRINT is on.

SEAST/NOSEAST include seasonal trend variables (like Q1*TREND, Q2*TREND, Q3*TREND). This option implies the TREND option.

SEASTSQ/NOSEASTS include seasonal squared trend variables. SEASTSQ implies TSQ.

These are fairly simplistic trend terms, which may not be enough to adequately model a time series that has a change in its intercept and/or trend at some point in the sample. See Perron (1989) for more details. The "special exogenous trend variables" arguments described above may provide a crude examination of more detailed trends. If these variables are supplied, all series are regressed on these trend variables, and the residuals from this regression are used in all tests (instead of the original values of the series). No corrections to the P-values of the tests are made, however (other than in the degrees of freedom for calculating the t-statistics and s2).
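A minimal sketch of supplying a special exogenous trend variable, assuming the LRGNP and LEMPLOY series from the Examples section. The dummy name D30 and the 1930 break date are hypothetical choices made only for illustration:

```tsp
FREQ A; SMPL 1909,1970;
? Build a split-sample dummy: 0 through 1929, 1 from 1930 onward.
SMPL 1909,1929; D30 = 0;
SMPL 1930,1970; D30 = 1;
SMPL 1909,1970;
? All series are first regressed on D30; the tests use the residuals.
COINT (MAXLAG=4) LRGNP LEMPLOY | D30;
```

As noted above, the reported P-values are not adjusted for the presence of D30, so they should be read with caution.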

TREND/NOTREND include a time trend in the tests.

TSQ/NOTSQ include a squared time trend in the tests.

Output options

PRINT/NOPRINT prints the options, and adds the coefficients and t-statistics of the augmenting lagged difference variables to the main tables.

TERSE/NOTERSE suppresses the main tables (only the summary tables are printed). Note that JOH and EG(NOALLORD) have no summary tables, so TERSE suppresses all their output.

SILENT/NOSILENT suppresses all output. This is useful for running tests for which you only want selected output (which can be obtained from the stored @ variables; see the table under Output).
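As a sketch of how SILENT might be combined with the stored results (assuming the LRGNP and LEMPLOY series from the Examples section), the summary matrices for the unit root tests can be printed after a silent run:

```tsp
FREQ A; SMPL 1909,1970;
? Run all default tests without printing anything.
COINT (SILENT) LRGNP LEMPLOY;
? Summary tables for the unit root tests: statistics, P-values, chosen lags.
PRINT @UNIT,%UNIT,@UNITLAG;
```

The other names listed in the table under Output can be printed or used in later commands in the same way, when they are stored.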

Examples

FREQ A; SMPL 1909,1970;

COINT LRGNP LEMPLOY;

performs 11 augmented WS (tau) and Dickey-Fuller (tau) unit root tests with 0 to 10 lags. All tests are first done for LRGNP, then repeated for LEMPLOY. Eleven augmented Engle-Granger (tau) tests are constructed with 0 to 10 lags (with LRGNP as the dependent variable in the cointegrating regression). Optimal lag lengths for all tests are determined using the AIC2 rule. The test is recomputed for the optimal lag, using the maximum available observations, and this is stored in the final column of the table. All tests use a constant and trend variable.

UNIT (ALL) LRGNP;

performs the same unit root tests for LRGNP as the above example. Also performs the Phillips-Perron (z) tests computed separately for 0 to 10 lags in the autocorrelation-robust estimate.

COINT (NOUNIT,ALLORD,MAXLAG=8) LRGNP LEMPLOY LCPI;

performs 27 augmented Engle-Granger (tau) tests. That is, 9 tests with 0 to 8 lags, with LRGNP as the dependent variable in the cointegrating regression. Then repeat the tests, using LEMPLOY and later LCPI as the dependent variable in the cointegrating regression.

SMPL 58:2 84:3;

COINT(JOH,MAXLAG=2,SEAS,NOTREND,NOUNIT,NOEG) Y1-Y4;

reproduces the Johansen-Juselius (1990) results for Finnish data (the chosen number of lags is 1, which matches the results from the paper). The test statistics are smaller than those in the paper, due to the finite-sample correction.

References

Bartlett, M.S., "The Statistical Significance of Canonical Correlations", Biometrika, January 1941, pp. 29-37.

Campbell, John Y., and Pierre Perron, "Pitfalls and Opportunities: What Macroeconomists Should Know about Unit Roots", in Olivier Jean Blanchard and Stanley Fischer, eds, NBER Macroeconomics Annual 1991, MIT Press, Cambridge, Mass., 1991.

Cushman, David O., Sang Sub Lee, and Thorsteinn Thorgeirsson, "Maximum Likelihood Estimation of Cointegration in Exchange Rate Models for Seven Inflationary OECD Countries," in Journal of International Money and Finance, June 1996.

Davidson, Russell, and James G. MacKinnon, Estimation and Inference in Econometrics, Oxford University Press, New York, NY, 1993, Chapter 20.

Dickey, D.A., and W.A. Fuller, "Distribution of the Estimators for Autoregressive Time Series with a Unit Root," JASA 74 (1979): 427-431.

Gregory, Allan W., "Testing for Cointegration in Linear Quadratic Models," Journal of Business and Economic Statistics, July 1994, pp. 347-360.

Johansen, Soren, and Katarina Juselius, "Maximum Likelihood Estimation and Inference on Cointegration -- with Applications to the Demand for Money", Oxford Bulletin of Economics and Statistics, 1990, pp. 169-210.

MacKinnon, James G., "Approximate Asymptotic Distribution Functions for Unit-Root and Cointegration Tests," Journal of Business and Economic Statistics, April 1994, pp. 167-176.

Osterwald-Lenum, Michael, "Practitioners' Corner: A Note with Quantiles for the Asymptotic Distribution of the Maximum Likelihood Cointegration Rank Test Statistic", Oxford Bulletin of Economics and Statistics, 1992, pp. 461-471.

Pantula, Sastry G., Graciela Gonzalez-Farias, and Wayne A. Fuller, "A Comparison of Unit-Root Test Criteria," Journal of Business and Economic Statistics, October 1994, pp. 449-459.

Perron, Pierre, "The Great Crash, The Oil Price Shock, and the Unit Root Hypothesis," Econometrica, November 1989, pp. 1361-1401.

Phillips, P. C. B., "Time Series Regression with a Unit Root," Econometrica, 1987, pp. 277-301.

Phillips, P. C. B., and Pierre Perron, "Testing for a Unit Root in Time Series Regression," Biometrika, 1988, pp. 335-346.