Automated Computer Forensics

From Simson Garfinkel

We are developing a variety of techniques and tools for performing Automated Document and Media Exploitation (ADOMEX). This research has several thrusts:

  1. Developing open source tools for working with electronic evidence. This work is part of the AFF project.
  2. Developing an unclassified Real Data Corpus (RDC) consisting of "real data from real people" that can be used to develop new algorithms and test automated tools.
  3. Developing new algorithms and approaches for working in a "data-rich environment."

Recent Research Developments

File-based Forensics

  • We have developed a batch analysis tool called fiwalk, which takes a disk image and produces an XML file describing all of the files, deleted files, and orphan files in the image, together with all of the extracted file metadata. This XML file can be used as input for further automated media processing. Using this system we have created a variety of applications for reporting on and manipulating disk images. We have also developed an efficient system for remote file-level access to disk images using XML-RPC and REST. Details can be found in our paper[1].
  • We have developed a prototype system for performing automated media forensic reporting. Based on PyFlag, the system performs an in-depth analysis of captured media, locates local and online identities, and presents summary information in a report that is tailored to be easy for the consumer of forensic intelligence to use[2].
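
As an illustration of how fiwalk's XML output can feed further automated processing, the following is a minimal Python sketch that reads a per-file XML listing and summarizes it. The element names used here (fileobject, filename, filesize, alloc) and the sample document are assumptions made for the example, not a statement of fiwalk's actual schema; adjust them to match the XML your copy of fiwalk produces.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample listing. The element names are assumptions made for
# this sketch and may not match the XML that fiwalk actually emits.
SAMPLE_XML = """<image>
  <fileobject>
    <filename>docs/report.doc</filename>
    <filesize>20480</filesize>
    <alloc>1</alloc>
  </fileobject>
  <fileobject>
    <filename>tmp/secret.txt</filename>
    <filesize>512</filesize>
    <alloc>0</alloc>
  </fileobject>
  <fileobject>
    <filename>pics/cat.jpg</filename>
    <filesize>1024</filesize>
    <alloc>1</alloc>
  </fileobject>
</image>"""


def read_file_records(xml_text):
    """Return one dict per <fileobject> element found in xml_text."""
    root = ET.fromstring(xml_text)
    records = []
    for obj in root.iter("fileobject"):
        records.append({
            "name": obj.findtext("filename", default=""),
            "size": int(obj.findtext("filesize", default="0")),
            # Treat alloc == "0" as a deleted (unallocated) file.
            "deleted": obj.findtext("alloc", default="1") == "0",
        })
    return records


def summarize(records):
    """Return (total_bytes, names_of_deleted_files) for a record list."""
    total_bytes = sum(r["size"] for r in records)
    deleted = [r["name"] for r in records if r["deleted"]]
    return total_bytes, deleted


records = read_file_records(SAMPLE_XML)
total_bytes, deleted_names = summarize(records)
print(total_bytes, deleted_names)
```

Because the report logic only sees a list of records, the same summarize function could be applied to listings from many disk images in a batch run.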

Bulk Data Forensics

  • We have developed a tool called frag_find which can report if sectors of a TARGET file are present on a disk image. This is useful in cases where a TARGET file has been stolen and you wish to establish that the file was present on a subject's drive. If most of the TARGET file's sectors are found on the IMAGE drive, and if the sectors are in consecutive sector runs, then the chances are excellent that the file was once there. Frag_find performs this search using time- and space-efficient data structures arranged in multiple filtering layers. The program deals with the problem of non-unique blocks by looking for runs of matching blocks, rather than individual blocks. Frag_find is part of the NPS Bloom package, which can be downloaded from
  • bulk_extractor
  • CDA tool
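
The run-matching idea behind frag_find can be sketched in a few lines of Python. This is not frag_find's implementation: the real tool uses layered, space-efficient filters, whereas this sketch swaps in a plain in-memory dictionary of sector hashes. The 512-byte sector size, the use of MD5, the min_run threshold, and all function names are assumptions made for the example.

```python
import hashlib

SECTOR_SIZE = 512  # assumed sector size for this sketch


def sector_hashes(data, sector_size=SECTOR_SIZE):
    """Hash each complete sector of data; a trailing partial sector is ignored."""
    return [
        hashlib.md5(data[i:i + sector_size]).digest()
        for i in range(0, len(data) - sector_size + 1, sector_size)
    ]


def find_runs(target, image, min_run=2, sector_size=SECTOR_SIZE):
    """Return (image_sector, target_sector, length) for each maximal run of
    consecutive image sectors that matches consecutive target sectors."""
    t_hashes = sector_hashes(target, sector_size)
    i_hashes = sector_hashes(image, sector_size)

    # Map each sector hash to every position where it occurs in the target.
    index = {}
    for t, h in enumerate(t_hashes):
        index.setdefault(h, []).append(t)

    runs = []
    for i, h in enumerate(i_hashes):
        for t in index.get(h, ()):
            # Skip matches that merely continue a run that started earlier.
            if i > 0 and t > 0 and i_hashes[i - 1] == t_hashes[t - 1]:
                continue
            length = 1
            while (i + length < len(i_hashes)
                   and t + length < len(t_hashes)
                   and i_hashes[i + length] == t_hashes[t + length]):
                length += 1
            # Short runs are dropped: a single matching sector may be a
            # common, non-unique block rather than evidence of the file.
            if length >= min_run:
                runs.append((i, t, length))
    return runs


def sector(value):
    """Build one synthetic sector filled with a single byte value."""
    return bytes([value]) * SECTOR_SIZE


# Synthetic example: the image holds the first three target sectors in a row,
# plus one isolated copy of the fourth target sector.
target = sector(1) + sector(2) + sector(3) + sector(4)
image = sector(0xAA) + sector(1) + sector(2) + sector(3) + sector(0xBB) + sector(4)
runs = find_runs(target, image)
print(runs)
```

Requiring a minimum run length is what filters out isolated matches, such as the lone copy of the fourth sector in the synthetic image above. On real disk images the dictionary would grow large, which is one reason a production tool would favor more compact structures.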

Relevant Publications

See also: