A Stata plugin in fortran
We have a large (20,000 line) fortran
program that would be difficult to convert to Stata or Mata. There is an existing
interface that uses an ado program to output a text file for processing by the
fortran program. This rather slows things down (the file is sometimes large) and it
appears that a plugin would dispense with this cumbersome work-around. I know that
computers are cheap, but waiting for output is expensive, so efficiency is not
unjustified.
I was able to build a plugin interface without knowing much C. I was
asked to post some information about how it was done, so here is an example using a
5 line fortran program. All this was done on Linux with the GNU compiler collection.
The documentation on the Stata website was clear and accurate, but debugging is
extremely difficult.
Our example program copies the second argument onto the first. It will be invoked
as
plugin call foo x1 x2
to assign the variable x1 the values of variable x2.
We have the fortran subroutine foo, in the file foo.for, with a double precision
array as its only argument. The array holds the value of a single observation of x2
in x(2) (the input) and a space for the single output value in x(1), which becomes
x1. This corresponds to the way our big program works, but is not the only way a
plugin can operate. In this example the plugin foo copies the second argument into
the first, so it is effectively a "backwards" substitute for -replace-, since x1
must exist before the plugin call. It will be called once for each observation that
passes the -if- and -in- qualifiers.
      subroutine foo(x)
      double precision x(2)
      x(1) = x(2)
      return
      end
This fortran is very simple. The larger program had multiple subroutines and common
blocks with block data initializations, but crucially there is no I/O, no main
program and nothing is stored between invocations. I don't know if the latter is
possible in a plugin. This is basic fortran 77; I don't know if modern fortran
constructions would cause difficulty.
We need a C language interface between Stata and foo(). This is adapted from the
example at http://www.stata.com/plugins/ . The only substantive change is the call
to foo_ (note the trailing underscore, which the fortran compiler appends to the
subroutine name). You can see the minimal amount of code used to process -if- and a
basic -in- qualifier. Note that since the C array starts at zero, I define x with 3
elements and pass the address of x[1], so that the element numbers can be the same
in both programs.
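#include "stplugin.h"

/* The fortran subroutine foo; the fortran compiler appends the underscore. */
void foo_(ST_double *x);

STDLL stata_call(int argc, char *argv[])
{
    ST_int j;
    ST_retcode rc;
    ST_double x[3];   /* x[0] is unused, so x[1] and x[2] match x(1) and x(2) */

    if (SF_nvars() < 2) return(102);

    /* Loop over the -in- range, skipping observations excluded by -if-. */
    for (j = SF_in1(); j <= SF_in2(); j++) {
        if (SF_ifobs(j)) {
            /* Read x2 for observation j into x[2]. */
            if ((rc = SF_vdata(2, j, &x[2]))) return(rc);
            foo_(&x[1]);
            /* Store the result in x1 for observation j. */
            if ((rc = SF_vstore(1, j, x[1]))) return(rc);
        }
    }
    return(0);
}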
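The index alignment can be checked without Stata. The following is a minimal sketch, not part of the plugin: foo_standin and copy_one are hypothetical names, and foo_standin is a C substitute for the compiled fortran routine, written the way the fortran sees its argument (a pointer to what it regards as x(1)).

```c
/* Hypothetical C stand-in for the fortran subroutine foo(x).
   Fortran passes arrays by reference, so the routine receives a pointer
   to its x(1); its x(2) is the next double in memory. */
void foo_standin(double *p)
{
    p[0] = p[1];            /* fortran: x(1) = x(2) */
}

/* Mimics what stata_call does for one observation. x[0] is padding,
   so C's x[1] and x[2] line up with fortran's x(1) and x(2). */
double copy_one(double value_of_x2)
{
    double x[3] = {0.0, 0.0, 0.0};

    x[2] = value_of_x2;     /* what SF_vdata would place in x[2] */
    foo_standin(&x[1]);     /* pass the address of x[1], not x[0] */
    return x[1];            /* what SF_vstore would write to x1 */
}
```

Calling copy_one(2.0) returns 2.0. Passing &x[0] instead would make the fortran read C's x[1] as its x(2), which is the off-by-one error the padding element avoids.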
Here is how I compiled. The -fPIC option isn't mentioned in the Stata docs, but
seems to be required. The -c option disables the link step, so that f77 produces
only the object file foo.o. stplugin.c is on the Stata web page noted above.
f77 -c foo.for -fPIC
gcc -shared -DSYSTEM=OPUNIX -fPIC -o foo.plugin stplugin.c foo.o foo.c
Here is a Stata program to test the -foo- plugin:
set obs 3
gen x1 = 1
gen x2 = 1/0 if _n==1
replace x2 = 2 if _n==2
replace x2 = . if _n==3
program foo,plugin using("./foo.plugin")
plugin call foo x1 x2
list
Note that the missing values propagate as expected.
Limitations
I expect some of the limitations I found could be overcome with more knowledge - I
would appreciate hearing from anyone who can enlighten me. But I did run into some
missing features that I would have expected to see.
- I couldn't figure out how to write output from fortran.
- No way to access variables by name, only by position in the Stata call.
- No way to create an e() or r() return variable.
- No help parsing options.
- Errors were often returned with code r(498), the documented text of which
concludes "The code 498 is not helpful." Some errors hung the terminal or caused
Stata to abend without a message. Fortran runtime messages are suppressed (the
program abends with a complaint about a missing library) and there is no obvious way
to add debugging statements to the fortran.
I was unable to find -stutil.c- which is apparently undocumented but known to
some other users. Perhaps it holds the key.
Without the ability to write from inside the fortran, debugging was so difficult
that I would not suggest others try this unless they had deeper knowledge of mixed
language programming.
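One way to reduce the debugging burden might be to exercise the routine outside Stata first. The sketch below is an assumption about how that could be arranged, not something that was done for this project: run_kernel is a hypothetical harness that replays the plugin's per-observation loop over plain C arrays, with keep[] standing in for SF_ifobs(), and copy_kernel is a C stand-in for the fortran routine.

```c
#include <stddef.h>

/* A fortran 77 subroutine with one double precision array argument,
   as seen from C (the array arrives by reference). */
typedef void (*kernel_fn)(double *);

/* Hypothetical harness: replays the plugin loop on plain arrays.
   keep[j] != 0 plays the role of SF_ifobs(j).
   Returns the number of observations processed. */
size_t run_kernel(kernel_fn kernel, double *x1, const double *x2,
                  const int *keep, size_t n)
{
    size_t j, done = 0;
    double x[3];

    for (j = 0; j < n; j++) {
        if (!keep[j]) continue;   /* observation excluded by -if- */
        x[0] = 0.0;               /* padding, never read by the kernel */
        x[1] = x1[j];
        x[2] = x2[j];
        kernel(&x[1]);
        x1[j] = x[1];
        done++;
    }
    return done;
}

/* C stand-in for the fortran routine: x(1) = x(2). */
void copy_kernel(double *p)
{
    p[0] = p[1];
}
```

To test the real routine one would presumably declare void foo_(double *); in place of copy_kernel, pass foo_ to run_kernel from a small main program, and link against foo.o. Since Stata is not involved, an ordinary debugger and print statements in the C driver should work.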
See also: http://northstar-www.dartmouth.edu/doc/solaris-forte/manuals/fortran/prog_guide/11_cfort.html
for information on calling fortran from C and vice versa. An alternative
"suggested" there might be to dispense with the C interface and call the SF_*
routines directly from fortran. There is apparently a method for passing arguments
by value (required for storing data) that I missed, as most tutorials suggested
modifying the C code when calling C routines from fortran. It might be worth
pursuing.
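For reference, the C-side workaround that the tutorials describe usually takes the form of a small shim: fortran 77 passes every argument by reference, so the shim accepts pointers and forwards the values. This is a sketch under stated assumptions, not tested against Stata: store_by_value is a hypothetical stand-in for a by-value routine such as SF_vstore, vstore_ is a hypothetical name, and the fortran default integer is assumed to match C's int.

```c
/* Hypothetical stand-in for a by-value store routine such as
   SF_vstore(var, obs, value). It records its arguments so the
   sketch can be run without Stata. */
int last_var, last_obs;
double last_val;

int store_by_value(int var, int obs, double val)
{
    last_var = var;
    last_obs = obs;
    last_val = val;
    return 0;               /* zero means success, as with Stata return codes */
}

/* Shim callable from fortran 77 as
       call vstore(ivar, iobs, val, irc)
   Every argument arrives as a pointer. The trailing underscore follows
   the same naming convention seen with foo_. Assumes the fortran
   default integer is the same size as a C int. */
void vstore_(const int *var, const int *obs, const double *val, int *rc)
{
    *rc = store_by_value(*var, *obs, *val);
}
```

This keeps a few lines of C in the build, so it does not dispense with the C interface entirely; it only moves the per-observation loop into the fortran.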
Daniel Feenberg
feenberg@nber.org
Last modified November 10, 2016