Can Intelligence Agencies Read Overwritten Data? A response to Gutmann.

Claims that intelligence agencies can read overwritten data on disk drives have been commonplace for many years now. The most commonly cited source of evidence for this supposed fact is a paper (Secure Deletion of Data from Magnetic and Solid-State Memory) by Peter Gutmann presented at a 1996 Usenix conference. I found this an extraordinary claim, and therefore deserving of extraordinary proof. Thanks to an afternoon at the Harvard School of Applied Science library I have had a chance to examine the paper ( http://www.usenix.org/publications/library/proceedings/sec96/full_papers/gutmann/index.html ) and many of the references contained therein.

Of course, modern operating systems can leave copies of "deleted" files scattered in unallocated sectors, temporary directories, swap files, remapped bad blocks, etc., but Gutmann believes that an overwritten sector can be recovered under examination by a sophisticated microscope, and this claim has been accepted uncritically by numerous observers. I don't think these observers have followed up on the references, however.

Gutmann explains that when a 1 bit is written over a zero bit, the "actual effect is closer to obtaining a .95 when a zero is overwritten with a one, and a 1.05 when a one is overwritten with a one". Given that, a read head 20 times as sensitive as the one in the drive, and knowledge of the pattern of overwrite bits, one could recover the under-data. This immediately suggests that if random (not pseudo-random) data is used to overwrite the sensitive information there will be no possibility of retrieval, since the overwrite must be known to calculate the overwritten bits.
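The arithmetic behind that recovery step can be sketched in a toy model. The following is only an illustration, not anything taken from Gutmann's paper: it takes the .95 figure from the quotation, assumes the same 0.05 residue applies in every other case, assumes a noise-free reading, and uses hypothetical function names (analog_level, recover_old_bits, demo). It shows how subtracting a known overwrite pattern from a sufficiently sensitive reading leaves the sign of the residue, and hence the old bit.

```python
import secrets

# Assumed residue magnitude, taken from the ".95" figure in the quotation.
RESIDUE = 0.05


def analog_level(old_bit: int, new_bit: int) -> float:
    """Toy model (an assumption): the nominal level of the new bit,
    nudged up if the old bit was 1 and down if it was 0."""
    return new_bit + (RESIDUE if old_bit else -RESIDUE)


def recover_old_bits(levels, known_overwrite):
    """Subtract the known overwrite bit from each reading; the sign of
    what is left over indicates the previously written bit."""
    return [1 if level - new_bit > 0 else 0
            for level, new_bit in zip(levels, known_overwrite)]


def demo(n: int = 16):
    """Write n random bits, overwrite them, then recover the originals."""
    old = [secrets.randbits(1) for _ in range(n)]
    overwrite = [secrets.randbits(1) for _ in range(n)]
    levels = [analog_level(o, w) for o, w in zip(old, overwrite)]
    return old, recover_old_bits(levels, overwrite)


if __name__ == "__main__":
    old, recovered = demo()
    print("old bits      :", old)
    print("recovered bits:", recovered)
```

In this sketch recover_old_bits is handed the overwrite pattern as an argument. A real reading would also carry noise far larger than this model admits.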

The references Gutmann provides suggest that his piece is much overwrought. None of the references lead to examples of sensitive information being disclosed. Rather, they refer to experiments where STM microscopy was used to examine individual bits, and some evidence of previously written bits was found. The overwrite was always known.

There is a large literature on the use of Magnetic Force Scanning Tunneling Microscopy (STM) to image bits recorded on magnetic media. The apparent point of this literature is not to retrieve overwritten data, but to test and improve the design of drive read/write heads. Two of the references [4][7] had pictures of overwritten bits, showing parts of the original data clearly visible in the micro-photograph. These were considered by the authors as examples of sub-optimal head design. The total number of bits seen was 6 in one photo and 8 in the other. Neither photo-micrograph was a total success, because in one case only transitions from one to zero were visible, and in the other case one of the transitions was ambiguous. Nevertheless, I accept that overwritten bits might be observable under certain circumstances.

So I can say that Gutmann doesn't cite anyone who claims to be reading the under-data in overwritten sectors, nor does he cite any articles suggesting that ordinary wipe-disk programs (with true random data) wouldn't be completely effective.

I should qualify that last paragraph a "bit". I was unable to locate a copy of the master's thesis with the tantalizing title "Detection of Digital Information from Erased Magnetic Disks" by Venugopal Veeravalli. However, a brief visit to his web page shows that this was never published, he has never published on this or a related topic (his field is security of mobile communications) and his other work does not suggest familiarity with STM microscopes. So I am fairly sure he didn't design a machine to read under-data with an "unwrite" system call. In an email message to me Dr. Veeravalli said that his work was theoretical, and studied the possibility of using DC erase heads.

Gutmann claims that "Intelligence organisations have a lot of expertise in recovering these palimpsestuous images." but there is no reference for that statement. There are 18 references in the paper, but none of the ones I was able to locate even referred to that possibility. Subsequent articles by diverse authors do make that claim, but only cite Gutmann, so they do not constitute additional evidence for his claim.

In one section of the paper Gutmann suggests overwriting with 4 passes of random data. That is apparently because he anticipates using pseudo-random data that would be known to the investigator. A single write is sufficient if the overwrite is truly random, even given an STM microscope with far greater powers than those in the references.

After posting this information to a local mailing list, I received a reply suggesting that the recovery of overwritten data was an industry, and that a search on Google for "recover overwritten data" would turn up a number of firms offering this service commercially. Indeed it does turn up many firms, but all but one are quite explicit that they can recover "overwritten files", which is quite a different matter. An overwritten file is one whose name has been overwritten, not its sectors. Likewise, partitioning, formatting, and "Ghosting" typically affect only a small portion of the physical disk, leaving plenty of potential for sector reads to reveal otherwise hidden data. There is no implication in the marketing material that these firms can read physically overwritten sectors. The one exception I found (Dataclinic) did not respond to an email enquiry, and they do not mention any STM facility on their web site.

Of course it has been 6 years since Gutmann published. Perhaps microscopes have gotten better? Yes, but data densities have gotten higher too. An hour on the web this month looking at STM sites failed to come up with a single laboratory claiming it had an ability to read overwritten data.

Another fact to ponder is the failure of anyone to read the "18 minute gap" Rose Mary Woods created on the tape of Nixon discussing the Watergate break-in. In spite of the fact that the data density on an analog recorder of the 1960s was approximately one million times less than that of current drive technology, and that audio recovery would not require a high degree of accuracy, not one phoneme has been recovered.

The requirement of intelligence agencies that disk drives with confidential information be destroyed rather than erased is sometimes offered as evidence that these agencies can read overwritten data. I expect the real explanation is far more prosaic. The technician tasked with discarding a hard drive may or may not have enough computer knowledge to know whether running the command "urandom >/dev/sda2c1" has covered an entire disk with random data or only one partition, nor is it easy to confirm that it was done. How would you confirm that the overwrite was not pseudo-random? Smashing the drive with a sledgehammer is easy to do, easy to confirm, and very hard to get wrong.
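To make the coverage question concrete, here is a minimal sketch of a single-pass overwrite that reports how many bytes it actually wrote. It is an illustration under stated assumptions, not a vetted wiping tool: the function name overwrite_with_random is hypothetical, it operates on an ordinary file rather than a raw device (os.path.getsize does not report the size of a block device on many systems), and it relies on os.urandom, which draws from the operating system's random source. Whether that source counts as "random" rather than "pseudo-random" in the sense used above is exactly the kind of thing the sketch cannot confirm.

```python
import os


def overwrite_with_random(path: str, chunk_size: int = 1 << 20) -> int:
    """Overwrite every byte of an existing ordinary file in place with
    bytes from os.urandom, and return the number of bytes written."""
    size = os.path.getsize(path)
    written = 0
    # "r+b" opens for in-place update without truncating the file first.
    with open(path, "r+b") as f:
        while written < size:
            n = min(chunk_size, size - written)
            f.write(os.urandom(n))
            written += n
        # Push the data out of Python's and the OS's buffers.
        f.flush()
        os.fsync(f.fileno())
    return written
```

Comparing the returned count with the expected size answers only the narrow question of whether the whole file was covered. It says nothing about other partitions, remapped bad blocks, or copies held elsewhere, which is part of why physical destruction is the easier procedure to verify.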

Surveying all the references, I conclude that Gutmann's claim belongs in the category of urban legend.

Or it may be in the category of marketing hype. I note that it is being used to sell a software package called "The Annililator".

An updated copy of this memo will be kept at http://www.nber.org/sys-admin/stm.html. Additional information may be sent to feenberg@nber.org.

Daniel Feenberg
National Bureau of Economic Research
Cambridge MA
USA
22 Feb 2003