1. Most important, the data frame example does not work.
Most people probably would prefer to use data frames.
Here is the example as written:
dt <- c('year','mars','ltcg',1970,2,100000,1990,2,100000,2016,2,100000)
df <- as.data.frame(as.matrix(dt,ncol=22,byrow=TRUE))
result <- taxsim9(df)
It yields this error when I try to run it:
[Inline image 1: screenshot of the error message, not reproduced here]
The first thing I found wrong was that the second variable should be "mstat", not "mars", but
changing that didn't fix it; the same error arises.
The code above sends taxsim9 a data frame with a single column that looks like this (after changing
mars to mstat):
[Inline image 2: screenshot of the single-column data frame, not reproduced here]
I wasn't sure what to do with that, as it is not an intuitive format, and it is not clear to me what
is wrong with it, although I can see by the "ncol=22" that it was trying to create a lot more values
than it did.
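A possible explanation for the single column, based on base R behavior rather than on anything in the taxsim9 documentation: as.matrix() applied to a plain vector always returns a one-column matrix and silently ignores ncol and byrow, which are arguments of matrix(). The sketch below shows this, along with one way to reshape the same vector by hand. The names m1, header, vals, and df3 are illustrative only.

```r
# The vector from the original example, with 'mars' already changed to 'mstat'
dt <- c('year','mstat','ltcg',1970,2,100000,1990,2,100000,2016,2,100000)

# as.matrix() on a plain vector ignores ncol and byrow: the result is 12 x 1
m1 <- as.matrix(dt, ncol = 22, byrow = TRUE)
dim(m1)

# matrix() does honor ncol and byrow. The first 3 elements are the column
# names and the remaining 9 are the data, so reshape the data to 3 columns.
header <- dt[1:3]
vals <- matrix(as.numeric(dt[-(1:3)]), ncol = 3, byrow = TRUE)
df3 <- setNames(as.data.frame(vals), header)
df3
```

This gives a 3-column data frame with one row per year, which by itself is still not the 22-column layout that taxsim9 appears to want.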
After some trial and error, I looked on the web and found obviously related code at
https://github.com/SamPortnow/DissertationAnalysis/blob/master/taxsim.R
In looking at that code, it looks like the program is expecting to receive a data frame with either
(1) a single column (which I doubt anyone would use), perhaps similar in format to that above, or
(2) 22 columns.
After experimentation, I determined that it wants a 22 column data frame with the columns in the
canonical order, and then things will work just fine.
Here is a fix that works for me. A minimal way to repair the data frame example is to:
- create an empty data frame with 22 properly named columns, in the proper order
- create a data frame with the data for the columns you want (3 variables), and merge it against the
empty data frame so that the new data frame has all 22 columns, with data in the 3 columns of
interest
- ensure that the columns are in the canonical order
Here is a working example:
# 1. create a vector with the variable names, in the canonical order
vnames <- c('taxsimid', 'year', 'state', 'mstat', 'depx', 'agex',
'pwages', 'swages', 'dividends', 'otherprop', 'pensions',
'gssi', 'transfers', 'rentpaid', 'proptax', 'otheritem',
'childcare', 'ui', 'depchild', 'mortgage', 'ltcg',
'stcg')
# 2. create an empty data frame (zero rows) with columns in the canonical order
df.empty <- setNames(data.frame(matrix(ncol = 22, nrow = 0)), vnames)
# 3. create a data frame with desired data
df.data <- data.frame(year=c(1970, 1990, 2016), mstat=rep(2, 3), ltcg=rep(100e3, 3))
# 4. merge the two data frames so that the result has the same number of rows as df.data and all
#    22 columns, then add taxsimid
df <- merge(df.data, df.empty, all=TRUE)
df$taxsimid <- 1:nrow(df)
# 5. put the columns of the merged data frame in the proper order
df <- df[, vnames]
# 6. call taxsim9
result <- taxsim9(df)
This should work, and it seems reasonably intuitive to me.
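The same padding can also be done without merge(). The sketch below is one possible alternative and not part of the taxsim9 interface: pad_to_canonical is a hypothetical helper name, and filling the unused columns with NA simply mirrors what the merge above produces (whether taxsim9 would prefer zeros there is not established here).

```r
# Canonical column order, as in step 1 above
vnames <- c('taxsimid', 'year', 'state', 'mstat', 'depx', 'agex',
            'pwages', 'swages', 'dividends', 'otherprop', 'pensions',
            'gssi', 'transfers', 'rentpaid', 'proptax', 'otheritem',
            'childcare', 'ui', 'depchild', 'mortgage', 'ltcg',
            'stcg')

# Hypothetical helper: reject unknown column names, add any missing canonical
# columns as NA, number the rows in taxsimid, and return canonical order.
pad_to_canonical <- function(df.data, vnames) {
  unknown <- setdiff(names(df.data), vnames)
  if (length(unknown) > 0) {
    stop("unrecognized column(s): ", paste(unknown, collapse = ", "))
  }
  missing.cols <- setdiff(vnames, names(df.data))
  if (length(missing.cols) > 0) {
    df.data[missing.cols] <- NA
  }
  df.data$taxsimid <- seq_len(nrow(df.data))
  df.data[, vnames]
}

df.data <- data.frame(year = c(1970, 1990, 2016), mstat = rep(2, 3), ltcg = rep(100e3, 3))
df <- pad_to_canonical(df.data, vnames)
# result <- taxsim9(df)  # not run here; requires the taxsim9 function
```

The check for unrecognized names would also have flagged the 'mars' typo in the original example before the data ever reached taxsim9.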
______________________________
Don Boyd
Director of Fiscal Studies
Rockefeller Institute of Government
www.rockinst.org/about_us/staff/researchers/boydd.aspx