Weak Test of Accuracy

Here we compare some quantities calculated by TAXCALC with the taxpayer's own calculation, by running a regression of the calculated quantity on a constant and the taxpayer calculation. A perfect result has
  • A zero constant
  • A unit coefficient on the taxpayer calculation
  • A high R**2
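
The three criteria above can be illustrated with a minimal sketch in Python. This is not the SAS or Stata code used for the actual test; the function name ols_fit and the sample values are made up for illustration, and a single-regressor least squares fit is assumed.

```python
def ols_fit(x, y):
    """Regress y on a constant and x; return (constant, coefficient, r_squared)."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross products around the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    coefficient = sxy / sxx
    constant = mean_y - coefficient * mean_x
    # With one regressor, R**2 is the squared correlation of x and y.
    r_squared = (sxy * sxy) / (sxx * syy)
    return constant, coefficient, r_squared


# Made-up illustrative values, NOT PUF data: the calculator reproduces
# the taxpayer's figure exactly on every return.
taxpayer_tax = [0.0, 1200.0, 3400.0, 8000.0, 15000.0]
calculated_tax = [0.0, 1200.0, 3400.0, 8000.0, 15000.0]

constant, coefficient, r_squared = ols_fit(taxpayer_tax, calculated_tax)
print(constant, coefficient, r_squared)
```

When the calculated values match the taxpayer's values exactly, as in this toy input, the fit gives a zero constant, a unit coefficient and an R**2 of 1. Any systematic over- or under-statement shows up in the constant or the coefficient, and return-by-return noise lowers R**2.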

For this test we exclude returns with the number of dependents censored, part-year returns, returns with kiddie tax and gains, compulsory itemizers, and returns with Form 4972 or Schedule J.
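
As a rough sketch of this sample restriction, the exclusions could be applied as a simple filter before any comparison is run. The flag names below are hypothetical placeholders, not actual PUF or TAXCALC variable names, and each return is assumed to be a dictionary with one boolean per condition.

```python
# Hypothetical flag names, one per exclusion in the text above.
# They are NOT real PUF or TAXCALC variable names.
EXCLUSION_FLAGS = (
    "dependents_censored",
    "part_year_return",
    "kiddie_tax_with_gains",
    "compulsory_itemizer",
    "has_form_4972",
    "has_schedule_j",
)


def keep_for_test(record):
    """Return True if the return triggers none of the exclusions."""
    # A missing flag is treated as False (condition does not apply).
    return not any(record.get(flag, False) for flag in EXCLUSION_FLAGS)


def filter_returns(records):
    """Keep only the returns that are eligible for the accuracy test."""
    return [r for r in records if keep_for_test(r)]


# Made-up records for illustration only.
sample = [
    {"id": 1},
    {"id": 2, "part_year_return": True},
    {"id": 3, "has_schedule_j": True},
    {"id": 4, "compulsory_itemizer": False},
]
kept_ids = [r["id"] for r in filter_returns(sample)]
print(kept_ids)
```

In this toy input, returns 2 and 3 are dropped because each has one exclusion flag set.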

The SAS code is somewhat more complex than previous examples, because we want to print the output in a more compact form than the SAS default. Equivalent Stata code is simpler and achieves the same results.

Detailed Results

The OTA results are from an earlier version of the calculator.

We do not rule out programming errors. It is difficult to debug code that applies only to the PUF, since disclosure avoidance provides an alternate explanation for most discrepancies.

There are some individual elements for which the calculator result is arguably better than the SOI data.

  • In 2001 the Rate Reduction Credit is excluded from Tax after Credits reported in the PUF, perhaps because it was not credited on the 1040.
  • In 2003 the Child Tax Credit reported by SOI excludes the Advance Credit portion, perhaps because it isn't reported on the 1040.
  • In all years, we create a proxy earnings variable from wages and self-employment income, and use that for EIC earnings, rather than using the supplied EIC earnings value. This is so that calculations of the marginal tax rate on earnings include the EIC phase-in and clawback.

There are some areas where the data is lacking:
  • The 5% and 15% Schedule D rate categories are in effect for only part of 2003, but the PUF does not distinguish time of year for gains.
  • The number of children eligible for the child tax credit is given only for 2002 and later.

The last two links are to the results when the sample is restricted to returns for the prior year, or the year before that. This tests the ability of the code to calculate correct liability for data from another year. As expected, the results are somewhat worse than for timely returns; however, most R**2s are .99 or better.