Individual Slope Coefficients
Suppose you want a separate estimate for the return to years of schooling
for each sex. You would create two interaction variables. One is zero for
males, and years of schooling for females. The second is zero for females
and years of schooling for males. You might use factor variables:
reg wage sex#c.years ...
but you might not know about factor variables, or the regression command
you use might not support them. You could do this:
gen years1 = 0
replace years1 = years if sex == 1
gen years2 = 0
replace years2 = years if sex == 2
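With more than two groups, the same gen-replace pattern can be wrapped in a loop. The following is a minimal sketch on simulated data: the variable names wage, years and sex follow the text, sex is assumed to be numeric, and the data, seed and coefficients are made up for illustration.

```stata
* Simulated data for illustration only; sex is assumed to be coded 1 or 2.
clear
set seed 12345
set obs 1000
gen sex = 1 + (runiform() < 0.5)
gen years = 8 + floor(13*runiform())
gen wage = 5 + 0.8*years + rnormal()

* One slope variable per group: years inside the group, zero elsewhere.
levelsof sex, local(groups)
foreach g of local groups {
    gen years`g' = 0
    replace years`g' = years if sex == `g'
}

* Separate schooling coefficients by sex, with a sex-specific intercept.
reg wage years1 years2 i.sex
```

The loop runs one gen and one replace per group, so its cost still grows with the number of groups.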
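There is a special Stata command that does nearly the same in one line,
except that it fills the other group's observations with missing values
rather than zeros: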
separate years, by(sex)
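Because separate leaves missing values outside each group, years1 and years2 cannot go into the same regression until those missings are set to zero. A minimal sketch of the recode, on simulated data made up for illustration (sex is assumed to be numeric and coded 1 or 2):

```stata
* Simulated data for illustration only.
clear
set seed 12345
set obs 1000
gen sex = 1 + (runiform() < 0.5)
gen years = 8 + floor(13*runiform())

* Creates years1 and years2, each missing outside its own group.
separate years, by(sex)

* Turn the out-of-group missings into zeros, but leave observations
* alone where years or sex is itself missing.
foreach v of varlist years1 years2 {
    replace `v' = 0 if missing(`v') & !missing(years, sex)
}
```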
The separate command is convenient, and quite fast if the number of
groups is small. With 10 groups it processes about 300,000 observations
per second, against twice that speed for the gen-replace sequence.
However, it becomes less efficient with more groups. With 100 groups, it
runs at less than 200 observations per second, about 1,000 times slower
than gen-replace or using factor variables.
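Timings like these depend on the machine and the Stata version, so it may be worth measuring on your own setup. A rough sketch using Stata's timer command on simulated data follows. The sample size, the group count and the yrs_ prefix are arbitrary choices for illustration.

```stata
* Simulated data for illustration only: 100,000 observations, 100 groups.
clear
set seed 12345
set obs 100000
gen group = ceil(100*runiform())
gen years = 8 + floor(13*runiform())

timer clear

* Timer 1: separate, one new variable per group (years1 ... years100).
timer on 1
separate years, by(group)
timer off 1

* Timer 2: the gen-replace sequence, written as a loop over the same groups.
timer on 2
forvalues g = 1/100 {
    gen yrs_`g' = 0
    replace yrs_`g' = years if group == `g'
}
timer off 2

* Reports elapsed seconds for each timer.
timer list
```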
If you have a large number of groups, Clint Cummins (maintainer of TSP)
has suggested a method that eliminates the proportionality to the number
of groups. It is presented as TSP code
but could easily be converted to Stata. Something like this is essential
if the number of groups exceeds Stata's limit on the number of variables,
as it will if each doctor or hospital in the US or each participant in
a large survey is given a separate estimate for the nuisance variables.
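The following is not Cummins's method, only a minimal sketch of the same idea in a special case. When each group gets its own intercept and slope and there are no other regressors, the group slopes can be computed from within-group sums in a fixed number of passes over the data, without creating one variable per group. The variable names, the group count and the simulated data are assumptions made for illustration.

```stata
* Simulated data for illustration only: 100,000 observations, 1,000 groups.
clear
set seed 12345
set obs 100000
gen group = ceil(1000*runiform())
gen years = 8 + floor(13*runiform())
gen wage = 5 + 0.8*years + rnormal()

* Within-group deviations from the group means.
bysort group: egen double ybar = mean(wage)
bysort group: egen double xbar = mean(years)
gen double dx = years - xbar
gen double dy = wage - ybar

* OLS slope within each group: sum(dx*dy) / sum(dx^2).
gen double dxdy = dx*dy
gen double dx2 = dx^2
bysort group: egen double sxy = total(dxdy)
bysort group: egen double sxx = total(dx2)
gen double slope = sxy/sxx
```

Every observation in a group carries that group's slope, which matches what a regression of wage on years restricted to that group would give. Extending this to models with additional common regressors takes more work and is not shown here.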
The moral: you needn't brute-force a large problem.