SDDB - School District Database (NCES 95-705)
School District Database (SDDB)
This is a special tabulation of the 1990 Census long forms, done by school district boundary and children's ages. The tabulations cover a large number of combinations of household type, race, income, etc. Superficially this allows you to learn about the characteristics of school districts, at a level of detail similar to the City and County Databook. Such a file would take only a few megabytes, however. So why does the SDDB fill more than 100 CD-ROM disks? Because it provides that information categorized by children's age, grade level or enrollment status. That is, you can find not only the number of pre-kindergarten children by district, but also the number of black pre-kindergarten children in non-poverty single-parent households by district. The effective number of variables is overwhelming, but this is not micro-data.
This Version
The SDDB (now called the SDDS) was sponsored by the National Center for Education Statistics, which offers a compressed version on 44 CD-ROM disks. The SDDS web site has more information. Because that version of the dataset is difficult to use for analytical purposes, the NBER has purchased an uncompressed copy of the source dataset from the National Archives and is making it available here in a slightly improved format. The NARA format data is available on this web site, but we also offer a much more compact format that drops the age/grade/enrollment status breakdown.
The Fields
Each state has eight ASCII files containing information about the demographic character of school districts. This data is Census tract data aggregated to the school district boundary level, and nearly all items are numbers of persons or households in particular categories by district. A few items are dollar amounts of expenditure or income. There are separate record types for the following universes. Select the link for a complete codebook for that record type:
For record types 1, 2a and 2b, there is one record for each school district. The fields show the demographic characteristics of the district. For example, variable 21 on record type HT shows the number of non-family households in poverty for each district. (Select HT above and then select P019b.)
In the original datasets, record types 3 through 7 are each iterated through 42 combinations of enrollment status and age or grade level. At this time the NBER files omit these iterated records, but they could be made available if there were interest.
Here is the format of the record identification prefix, the first 40 bytes of each record (but see also this:
Organization of the NBER Format File
The original file format was remarkably large and unwieldy. Data for individual states was spread across as many as 11 different CDs, and different record types were mixed in the same file. The file is composed almost entirely of 9-byte fields for population counts, but with a handful of 18-byte fields for dollar amounts and a 40-byte prefix. Since the only item location information given in the documentation was the field number, it is tedious to determine the starting and ending byte for any particular field (which depends on the number of 9- and 18-byte fields preceding it in the record).
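To illustrate why locating a field in the original layout is tedious, here is a minimal Python sketch of the bookkeeping involved. It assumes that fields are packed contiguously after the 40-byte prefix with no separators, and that you have already worked out from the codebook which field numbers are 18-byte dollar amounts. The function name `original_field_range` and the `dollar_fields` argument are hypothetical and are not part of any NBER or NARA tool.

```python
PREFIX_BYTES = 40   # record identification prefix
COUNT_BYTES = 9     # width of a population count field
DOLLAR_BYTES = 18   # width of a dollar amount field


def original_field_range(n, dollar_fields=frozenset()):
    """Return the 1-based, inclusive (start, end) bytes of field n.

    dollar_fields is the set of field numbers known (from the codebook)
    to be 18-byte dollar amounts; every other field is assumed to be a
    9-byte count. Contiguous packing after the prefix is an assumption.
    """
    if n < 1:
        raise ValueError("field numbers start at 1")
    start = PREFIX_BYTES + 1
    # Every preceding field shifts the start by its own width.
    for k in range(1, n):
        start += DOLLAR_BYTES if k in dollar_fields else COUNT_BYTES
    width = DOLLAR_BYTES if n in dollar_fields else COUNT_BYTES
    return start, start + width - 1
```

The position of any field depends on the widths of all the fields before it, so a single misidentified dollar field shifts every later byte range.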
To make a more accessible version of the SDDB, we have created an NBER format file. In this format we have allocated 40 bytes for the prefix and 10 bytes for every field, and divided all aggregate dollar amounts by 1,000. With those changes it is easy to translate any field number into a byte location, there are no field overflows, items are separated by spaces, and the loss of precision is not significant. The reformatting makes it easier, not harder, to use the original documentation.
Any field N starts at byte 31+10*N and ends at byte 40+10*N in the NBER version of the files. Here is an example of how to determine the byte location of a variable on one of the files. Consider "Persons in Household" (Table P016 on the HT record). Find it by selecting the "All Households" link just above, then the third table link. At the far right of that page you can see that the number of one-person households is given in variable number 3 on that file, the number of two-person households in variable 4, and so on. So those two variables are at byte locations 61-70 and 71-80 on each record of the HT file. Then the individual records required for the enrollment status by age or grade are located from the information in the prefix (shown above).
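The byte arithmetic above can be sketched in a few lines of Python. This is only an illustration, not an NBER-supplied tool: the function names `field_byte_range` and `read_count` are hypothetical, and `read_count` assumes the field holds an integer padded with spaces.

```python
PREFIX_BYTES = 40  # record identification prefix
FIELD_BYTES = 10   # every field in the NBER format


def field_byte_range(n):
    """Return the 1-based, inclusive (start, end) bytes of field n."""
    if n < 1:
        raise ValueError("field numbers start at 1")
    start = 31 + FIELD_BYTES * n   # first byte after the preceding fields
    end = PREFIX_BYTES + FIELD_BYTES * n
    return start, end


def read_count(record, n):
    """Extract field n from one record (a string) as an integer.

    Assumes the field is an integer padded with spaces; int() ignores
    the surrounding whitespace.
    """
    start, end = field_byte_range(n)
    # Python slices are 0-based and exclude the end index.
    return int(record[start - 1:end])


# Variables 3 and 4 on the HT record (one- and two-person households).
print(field_byte_range(3))  # (61, 70)
print(field_byte_range(4))  # (71, 80)
```

Because every field has the same width, the location depends only on the field number, which is what makes the variable numbers in the original codebooks directly usable.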
We currently have 156 files online, which NARA says is the whole file, or at least all they have. They believe there may be records, or parts of records, missing from California and Minnesota. We have observed that Minnesota contains 603 undecipherable records. The only examination we have done of the files supplied is to check for proper record sequence. There are problems in about half the files, but it is possible that the only problem is sequence.
References and Other Information