A Stata plugin in fortran

We have a large (20,000 line) fortran program that would be difficult to convert to Stata or Mata. There is an existing interface that uses an ado program to output a text file for processing by the fortran program. This rather slows things down (the file is sometimes large) and it appears that a plugin would dispense with this cumbersome work-around. I know that computers are cheap, but waiting for output is expensive, so efficiency is not unjustified.

I was able to build a plugin interface without knowing much C. I was asked to post some information about how it was done, so here is an example using a 5 line fortran program. All this was done on Linux with the GNU compiler collection. The documentation on the Stata website was clear and accurate, but debugging is extremely difficult.

Our example program copies the second argument onto the first. It will be invoked as

    plugin call foo x1 x2

to assign the variable x1 the values of variable x2.

The fortran subroutine foo, in the file foo.for, takes a double precision array as its only argument. The array holds the value of x2 for a single observation in x(2) (the input) and has space for the single output value, x1, in x(1). This corresponds to the way our big program works, but is not the only way a plugin can operate. In this example the plugin copies the second argument into the first, so it is effectively a "backwards" substitute for -replace-, since x1 must exist before the plugin call. The subroutine is called once for each observation that passes the -if- and -in- qualifiers.

      subroutine foo(x)
      double precision x(2)
      x(1) = x(2)
      return
      end

This fortran is very simple. The larger program had multiple subroutines and common blocks with block data initializations, but crucially there is no I/O, no main program and nothing is stored between invocations. I don't know if the latter is possible in a plugin. This is basic fortran 77; I don't know if modern fortran constructions would cause difficulty.

We need a C language interface between Stata and foo(). This is adapted from the example at http://www.stata.com/plugins/ . The only substantive change is the call to foo_ (note the trailing underscore, which the GNU fortran compiler appends to external names). You can see the minimal amount of code used to process -if- and a basic -in- qualifier. Note that since C arrays start at zero, I define x with 3 elements and leave x[0] unused, so that the element numbers can be the same in both programs.
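The index trick can be tried in isolation. The following is only a sketch: foo_ here is a C stand-in for the fortran subroutine (so it compiles without a fortran compiler), and call_foo is a made-up helper that mimics what the interface does for one observation.

```c
/* Hypothetical C stand-in for the fortran subroutine foo, so the sketch
   compiles without a fortran compiler. Fortran passes arrays by
   reference, so x[0] here is what fortran calls x(1). */
void foo_(double *x)
{
    x[0] = x[1];            /* fortran: x(1) = x(2) */
}

/* Mimics the body of the observation loop for one value of x2.
   call_foo is a made-up name for illustration. */
double call_foo(double x2)
{
    double x[3];            /* x[0] is deliberately unused */

    x[1] = 0.0;             /* output slot, fortran x(1) */
    x[2] = x2;              /* input slot, fortran x(2) */
    foo_(&x[1]);            /* pass the address of "element 1" */
    return x[1];
}
```

Passing &x[1] rather than x is what lines up C's x[1] and x[2] with fortran's x(1) and x(2). Declaring x with 2 elements and passing x itself would work equally well, at the cost of the subscripts differing by one between the two languages.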

    #include "stplugin.h"
    #include <stdio.h>

    /* fortran subroutine foo; the compiler appends the underscore */
    void foo_(double *x);

    STDLL stata_call(int argc, char *argv[])
    {
        ST_int     j;
        ST_retcode rc;
        ST_double  x[3];    /* x[0] unused; x[1], x[2] match fortran x(1), x(2) */

        if (SF_nvars() < 2) return(102);    /* too few variables specified */

        for (j = SF_in1(); j <= SF_in2(); j++) {
            if (SF_ifobs(j)) {
                /* read x2 for observation j */
                if ((rc = SF_vdata(2, j, &x[2]))) return(rc);
                foo_(&x[1]);
                /* store the result in x1 */
                if ((rc = SF_vstore(1, j, x[1]))) return(rc);
            }
        }
        return(0);
    }

Here is how I compiled. The -fPIC option isn't mentioned in the Stata docs, but seems to be required. The -c option disables the link step. stplugin.c is on the Stata web page noted above.

    f77 -c foo.for -fPIC
    gcc -shared -DSYSTEM=OPUNIX -fPIC -o foo.plugin stplugin.c foo.o foo.c

Here is a Stata program to test the -foo- plugin:

    set obs 3
    gen x1 = 1
    gen x2 = 1/0 if _n==1
    replace x2 = 2 if _n==2
    replace x2 = . if _n==3
    program foo, plugin using("./foo.plugin")
    plugin call foo x1 x2
    list

Note that the missing values propagate as expected.

Limitations

I expect some of the limitations I found could be overcome with more knowledge - I would appreciate hearing from anyone who can enlighten me. But I did run into some missing features that I would have expected to see.
  1. I couldn't figure out how to write output from fortran.
  2. No way to access variables by name, only by position in the Stata call.
  3. No way to create an e() or r() return variable.
  4. No help parsing options.
  5. Errors were often returned with code r(498), the documented text of which concludes "The code 498 is not helpful." Some errors hung the terminal or caused Stata to abend without a message. Fortran runtime messages are suppressed (the program abends with a complaint about a missing library) and there is no obvious way to add debugging statements to the fortran.
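
For item 1, one workaround that might be worth trying is a small C helper that the fortran can call to print a value. I have not tested this inside a plugin, so take it as a sketch. The plugin interface documents SF_display() for writing a string to the Stata results window; here it is replaced by a stand-in so the sketch compiles without stplugin.h. The helper name dbgval_ and its arguments are invented for illustration, and the code assumes a default fortran integer matches a C int, as it normally does with the GNU compilers.

```c
#include <stdio.h>

/* Stand-in for SF_display() so this compiles without stplugin.h.
   It prints the string and keeps a copy for inspection. In a real
   plugin, include "stplugin.h" and call SF_display(buf) instead. */
static char last_line[128];

static int display_standin(char *s)
{
    snprintf(last_line, sizeof last_line, "%s", s);
    fputs(s, stdout);
    return 0;
}

/* Hypothetical helper, callable from fortran as
         call dbgval(k, v)
   with an integer tag k and a double precision value v.
   Fortran passes both by reference, hence the pointers. */
void dbgval_(int *k, double *v)
{
    char buf[128];

    snprintf(buf, sizeof buf, "debug %d: %g\n", *k, *v);
    display_standin(buf);
}
```

The helper would go in the C interface file and be linked in the same gcc step as before. Since the formatting is done in C, the fortran side needs no I/O statements of its own, which sidesteps the missing runtime library.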

I was unable to find -stutil.c- which is apparently undocumented but known to some other users. Perhaps it holds the key.

Without the ability to write from inside the fortran, debugging was so difficult that I would not suggest others try this unless they have deeper knowledge of mixed language programming.

See also: http://northstar-www.dartmouth.edu/doc/solaris-forte/manuals/fortran/prog_guide/11_cfort.html for information on calling fortran from C and vice-versa. An alternative "suggested" there might be to dispense with the C interface and call the SF_* routines directly from fortran. There is apparently a method for passing arguments by value (required for storing data) that I missed, as most tutorials suggested modifying the C code when calling C routines from fortran. It might be worth pursuing.
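
The "modify the C code" approach from those tutorials would look roughly like the sketch below: a thin C shim takes everything by reference, as fortran passes it, and forwards the values to the routine that wants them by value. This is untested in a real plugin. The shim name vstore_ is invented, the stand-in function takes the place of SF_vstore() so the sketch compiles without stplugin.h, and a default fortran integer is assumed to match a C int.

```c
/* Stand-in for SF_vstore(i, j, val), which takes its arguments by
   value. It records the call so the result can be inspected. In a
   real plugin, include "stplugin.h" and call SF_vstore() instead. */
static int    last_i, last_j;
static double last_val;

static int vstore_standin(int i, int j, double val)
{
    last_i = i;
    last_j = j;
    last_val = val;
    return 0;               /* 0 means success */
}

/* Hypothetical shim, callable from fortran as
         call vstore(i, j, val, rc)
   All four arguments arrive as pointers because fortran passes by
   reference; the shim dereferences them for the by-value call. */
void vstore_(int *i, int *j, double *val, int *rc)
{
    *rc = vstore_standin(*i, *j, *val);
}
```

With one such shim per SF_* routine, the observation loop could in principle move into the fortran, leaving the C file as nothing but glue.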

Daniel Feenberg feenberg@nber.org


Last modified November 10, 2016