Individual Slope Coefficients

Suppose you want a separate estimate of the return to years of schooling for each sex. You would create two interaction variables. One is zero for males and equal to years of schooling for females. The second is zero for females and equal to years of schooling for males. You might use factor variables:

    reg wage sex#c.years ...

but you might not know about factor variables, or the regression command you use might not support them. You could do this:

    gen years1 = 0
    replace years1 = years if sex == 1
    gen years2 = 0
    replace years2 = years if sex == 2

There is a special Stata command to do that in one line:

    separate years, by(sex)

Note that separate fills the observations outside each group with missing values rather than zeros, so those must be recoded to zero before the new variables are used as regressors. The command is convenient, and quite fast if the number of groups is small. With 10 groups it runs at about 300,000 observations per second, against twice that speed for the gen-replace sequence. However, it becomes less efficient with more groups. With 100 groups, it runs at less than 200 observations per second, about 1,000 times slower than gen-replace or using factor variables.
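The gen-replace approach extends to any number of groups with a loop over the group codes. The following is only a minimal sketch under stated assumptions: the toy data are made up for illustration, and sex is assumed to hold positive integer codes, so that each code can be appended to the variable name.

```stata
* Toy data, invented purely for illustration.
clear
input sex years wage
1 12 15
1 14 18
1 16 22
2 12 14
2 14 19
2 16 24
end

* Collect the distinct group codes in a local macro.
* Assumes the codes are positive integers (here 1 and 2).
levelsof sex, local(groups)

* One slope variable per group: zero outside the group,
* years of schooling inside it.
foreach g of local groups {
    gen years`g' = 0
    replace years`g' = years if sex == `g'
}

* Common intercept, separate slope for each sex.
reg wage years1 years2
```

The loop still makes one pass over the data per group, so it saves typing rather than run time.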

If you have a large number of groups, Clint Cummins (maintainer of TSP) has suggested a method whose cost is not proportional to the number of groups. It is presented as TSP code but could easily be converted to Stata. Something like this is essential if the number of groups exceeds Stata's limit on the number of variables, as it will if each doctor or hospital in the US, or each participant in a large survey, is given a separate estimate for the nuisance variables.

The moral: you needn't brute-force a large problem.