Friday, 19 November 2010

Surprise Use of Forensic Software in Archives

When I first heard of the use of computer forensics in archives, I was excited and wanted to learn how these law enforcement techniques could help me do a better job of processing digital collections. After learning that people were using computer forensics to copy disk images (i.e. an exact copy of a disk, bit by bit) and to create comprehensive manifests of the electronic files in collections, I was a bit disappointed, because software engineers have been using the Unix dd command to copy disk images for many years. There are also tools (e.g. Karen's Directory Printer) that create a comprehensive manifest of the electronic files in a collection. Data recovery is another feature of forensic software that some people consider useful for archivists and researchers. In my opinion, data recovery may be useful for researchers but not for archivists. Without informed written consent from donors, archivists should NOT recover deleted files at all. Also, in some cases a deleted file does not appear as one file but as tens or hundreds of file fragments. When most archivists do not do item-level processing of paper collections because of limited resources, I can't imagine archivists performing sub-item-level processing of digital collections. Computer forensics in criminal investigations usually looks for particular pieces of evidence; organizing all the files on a disk drive is usually not of interest. In archives, we organize all the files on disk drives; looking for particular items is not our duty. Computer forensics may be more useful for researchers when they want to look for particular items. All this led me to the conclusion that computer forensics may not be very useful for digital archivists.
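To make the "bit by bit" idea concrete, here is a minimal Python sketch of what a dd-style copy does: read the source in fixed-size blocks, write each block unchanged to an image file, and compute a checksum so the image can be verified later. The function names and the block size are my own choices for illustration, not part of dd or any forensic product. Reading a real device (rather than an ordinary file) normally requires administrator rights, and in practice a write blocker.

```python
import hashlib


def copy_image(source_path, image_path, block_size=1024 * 1024):
    """Copy source to image block by block; return the SHA-256 of the bytes read."""
    digest = hashlib.sha256()
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:  # end of source reached
                break
            dst.write(block)
            digest.update(block)
    return digest.hexdigest()


def sha256_of(path, block_size=1024 * 1024):
    """Re-read a file and return its SHA-256, e.g. to verify an image later."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

If the checksum returned by copy_image matches sha256_of(image_path), the image is byte-for-byte identical to what was read from the source.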

However, after attending a 2.5-day training course on AccessData FTK (a computer forensics software package), I started to see the potential of using forensic software to process digital archives. I found that the functions (bookmarks, labels) that help investigators organize the evidence they have selected are equally applicable to organizing a whole collection. The functions (pattern and full-text search) that are used to find particular evidence are equally applicable to searching for restricted materials. I can also use the software to replace a number of programs I have been using to process digital collections. Although 90% of the training was about cracking passwords, searching deleted files, identifying pornographic images, etc., I found the remaining 10% of the course worth every cent Stanford spent on it. Of course, the ideal would be a course tailored to the archival community, but unfortunately no such course exists.
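For readers who have not used pattern search, the following Python sketch shows the general idea of finding restricted information with regular expressions. It is not how FTK does it; the patterns and function names are assumptions I made for illustration. The social security pattern only covers the common 123-45-6789 layout, and candidate card numbers are filtered with the Luhn checksum to reduce false hits.

```python
import re

# Assumed patterns for illustration; real collections will need broader ones.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")  # 13 to 16 digits


def luhn_ok(number):
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:  # double every second digit from the right
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_restricted(text):
    """Return (kind, matched_text) pairs for possible SSNs and card numbers."""
    hits = [("ssn", m.group()) for m in SSN_RE.finditer(text)]
    for m in CARD_RE.finditer(text):
        if luhn_ok(m.group()):
            hits.append(("card", m.group()))
    return hits
```

A hit only means a file deserves a closer look; an archivist still has to decide whether the material is actually restricted.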

Now I am using AccessData FTK to replace the following software, which I used in the past to process digital archives.
Karen's Directory Printer - to create a comprehensive manifest of the electronic files in collections
QuickView Plus - to view files in obsolete file formats
Xplorer - to find duplicate files and copy files to folders
DROID, JHOVE - to extract technical metadata: file formats, checksums, creation/modification/access dates
Windows Search 4.0 - to perform full-text search on files in certain formats (Word, PDF, ASCII)
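As a rough illustration of two of the tasks in the list above (building a manifest and finding duplicates), here is a short Python sketch. It walks a folder, records the path, size, modification date and SHA-256 checksum of each file, and groups files with identical checksums. The column names and function names are my own assumptions. It does not identify file formats the way DROID or JHOVE do.

```python
import csv
import hashlib
import os
from collections import defaultdict
from datetime import datetime, timezone

FIELDS = ["path", "size_bytes", "modified_utc", "sha256"]


def file_sha256(path, block_size=1024 * 1024):
    """Return the SHA-256 checksum of a file, read in blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def build_manifest(root):
    """Return one dict per file found under `root`."""
    rows = []
    for folder, _subfolders, files in os.walk(root):
        for name in sorted(files):
            path = os.path.join(folder, name)
            info = os.stat(path)
            modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
            rows.append({
                "path": os.path.relpath(path, root),
                "size_bytes": info.st_size,
                "modified_utc": modified.isoformat(),
                "sha256": file_sha256(path),
            })
    return rows


def find_duplicates(rows):
    """Group paths by checksum; keep only checksums shared by 2+ files."""
    by_hash = defaultdict(list)
    for row in rows:
        by_hash[row["sha256"]].append(row["path"])
    return {h: paths for h, paths in by_hash.items() if len(paths) > 1}


def write_manifest(rows, csv_path):
    """Write the manifest as a CSV file that opens in a spreadsheet."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

Matching checksums are a much stronger sign of duplication than matching file names or sizes, which is why the grouping is done on the checksum column.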

I am also using the following functions in AccessData FTK to process digital archives. I have not found other software packages that perform them in a very user-friendly manner.
Pattern search (to locate files containing restricted information such as social security numbers, credit card numbers, etc.)
Assign bookmarks and labels to files (for arranging files into series/subseries and recording other administrative and descriptive metadata)
Extract email headers (to, from, subject, date, cc/bcc) from emails written in different email programs, for preparing correspondence listings
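To show what extracting email headers for a correspondence listing involves, here is a small Python sketch using the standard library's email package. It assumes the message is already available as plain RFC 5322 text; messages stored in proprietary mail formats would need to be converted first, which is the part forensic software handles for you. The function name and the field list are my own choices.

```python
from email import message_from_string
from email.policy import default

# Header fields wanted for a correspondence listing.
FIELDS = ("from", "to", "cc", "bcc", "date", "subject")


def header_row(raw_message):
    """Return the listing fields of one raw message; missing ones are ''."""
    msg = message_from_string(raw_message, policy=default)
    row = {}
    for field in FIELDS:
        value = msg[field]  # None when the header is absent
        row[field] = str(value) if value is not None else ""
    return row
```

One row per message can then be written to a spreadsheet to form the correspondence listing.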

The cost of licensing the software seems high. But if you consider the total cost of learning several "free" programs, the lack of support for such software, and the integrated environment you get from using one package, you may find that the total cost of using commercial forensic software is lower than that of using "free" software.


  1. Hi Peter, I found the free version of FTK Imager (FTK Imager Lite) extremely useful to package up digital archives for transfer. Of course, it does not have the full functionality of the complete package, but does a perfectly good job of creating a manifest, imaging a directory or specific folders, & generating and checking sums. I imported the CSV files into Excel for further manipulation. An added advantage is that it can be downloaded and run directly from a portable USB drive, so the archivist can collect the digital files without having to install software on the creator's system. It can be downloaded from Documentation isn't great, but I more or less picked up how it worked in an afternoon and wrote my own manual for archivists using the software afterwards.

  2. Hi Alexandra, I am using FTK Imager to create disk / logical images, and I import the images into FTK to generate technical metadata and to assign descriptive metadata. You may want to take a look at Forensic Toolkit® (FTK™) version 1.81.6, which can be downloaded from the same place you downloaded FTK Imager. You can use the software free of charge for cases with fewer than 4000 records.