Difference between revisions of "Automated Computer Forensics"

From Simson Garfinkel
 
(26 intermediate revisions by the same user not shown)
Line 1: Line 1:
We are developing a variety of techniques and tools for performing ''Automated Document and Media Exploitation'' (ADOMEX). This research has several thrusts:
+
==Current Research Areas==
# Developing open source tools for working with electronic evidence. This work is part of the [http://www.afflib.org AFF] project.
+
One of my primary areas of research is the development of algorithms, techniques, and eventually tools for automating a wide variety of computer forensics tasks that are currently performed by trained analysts. Today analysts do much of this work with visualization tools that let them search for data on a hard drive or captured from a network and slowly construct a story that might be useful in a prosecution or in recovering from a security event. But as data volumes grow and the network environment becomes more complex, there is a need for more automated tools that can perform autonomous analysis and correlation.<ref>Garfinkel, S. [http://simson.net/clips/academic/2007.ACM.Domex.pdf "Document and Media Exploitation,"] <i>ACM Queue</i>, November/December 2007.</ref><ref>Garfinkel, Simson, Digital Forensics Research: The Next 10 Years, DFRWS 2010, Portland, OR</ref>
# Developing an unclassified [[Real Data Corpus]] (RDC) consisting of "real data from real people" that can be used to develop new algorithms and test automated tools.
 
# Developing new algorithms and approaches for working in a "data-rich environment."
 
  
==Recent Research Developments==
+
Today my research into this field of automated computer forensics covers these main areas:
===File-based Forensics===
 
* We have developed a batch analysis tool called '''fiwalk''' that takes a disk image and produces an XML file describing all of the files, deleted files, and orphan files in the image, along with all of the extracted file metadata. This XML file can be used as input for further automated media processing. Using this system we have created a variety of applications for reporting on and manipulating disk images. We have also developed an efficient system for remote file-level access to disk images using XML-RPC and REST. Details can be found in our paper<ref>[http://simson.net/clips/academic/2009.SADFE.xml_forensics.pdf Automating Disk Forensic Processing with SleuthKit, XML and Python], [http://conf.ncku.edu.tw/sadfe/sadfe09/ Fourth International IEEE Workshop on Systematic Approaches to Digital Forensic Engineering] (IEEE/SADFE'09), May 2009</ref>.
 
  
* We have developed a prototype system for performing automated media forensic reporting. Based on PyFlag, the system performs an in-depth analysis of captured media, locates local and online identities, and presents summary information in a report that is tailored to the consumer of forensic intelligence. Details can be found in [http://www.simson.net/clips/students/09_Farrell.pdf A Framework for Automated Digital Forensic Reporting], Lt. Paul Farrell, Master's Thesis, Naval Postgraduate School, Monterey, CA, March 2009.
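
As a rough illustration of how an XML file like the one fiwalk produces could drive further automated processing, the sketch below parses a small hand-written sample and reports deleted files and total size. The element names (<code>fileobject</code>, <code>filename</code>, <code>filesize</code>, <code>deleted</code>) and the sample data are simplified assumptions made for this example and may not match fiwalk's actual output schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified sample; real fiwalk output may use a different schema.
SAMPLE_XML = """<fiwalk>
  <fileobject><filename>docs/report.doc</filename><filesize>20480</filesize><deleted>0</deleted></fileobject>
  <fileobject><filename>tmp/secret.zip</filename><filesize>1048576</filesize><deleted>1</deleted></fileobject>
  <fileobject><filename>notes.txt</filename><filesize>312</filesize><deleted>0</deleted></fileobject>
</fiwalk>"""


def summarize(xml_text):
    """Turn each (assumed) fileobject element into a plain dict."""
    root = ET.fromstring(xml_text)
    files = []
    for fo in root.iter("fileobject"):
        files.append({
            "name": fo.findtext("filename"),
            "size": int(fo.findtext("filesize", default="0")),
            "deleted": fo.findtext("deleted", default="0") == "1",
        })
    return files


def deleted_files(files):
    """Names of the files that the XML marks as deleted."""
    return [f["name"] for f in files if f["deleted"]]


files = summarize(SAMPLE_XML)
total_bytes = sum(f["size"] for f in files)
print(deleted_files(files), total_bytes)
```

Because every file is reduced to a small record, later tools can filter or aggregate without touching the disk image again.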
+
# '''Small-block forensics''' --- Exploring approaches for working with data elements in the 4KiB to 64KiB range that are not aligned with file boundaries. This can be used in situations where an entire file is not available for reconstruction, or only a portion of a file is available for analysis. Small-block forensics can also enable approaches based on statistical sampling rather than full-content analysis.<ref>Simson Garfinkel, Vassil Roussev, Alex Nelson and Douglas White, Using purpose-built functions and block hashes to enable small block and sub-file forensics, DFRWS 2010, Portland, OR</ref>
 +
# '''Data-rich algorithms and approaches''' that are designed to work in environments where there is a large collection of data from multiple users, as can be the case in law enforcement, e-discovery, and internal corporate investigations.<ref>Garfinkel, S., [http://simson.net/clips/academic/2006.DFRWS.pdf Forensic Feature Extraction and Cross-Drive Analysis,] The 6th Annual Digital Forensic Research Workshop, Lafayette, Indiana, August 14-16, 2006.</ref>
 +
# '''Media/Web correlation''' --- Exploring opportunities for automatic correlation of information on hard drives with information that can be found on the web.
 +
# '''Corpus Creation''' --- Developing realistic corpora that can be used in education and software development that do not contain personal information.<ref>Garfinkel, Farrell, Roussev and Dinolt, [http://www.simson.net/clips/academic/2009.DFRWS.Corpora.pdf Bringing Science to Digital Forensics with Standardized Forensic Corpora], DFRWS 2009, Montreal, Canada. [http://simson.net/clips/academic/2009.DFRWS.Corpora.slides.pdf (slides)]</ref>
  
===Bulk Data Forensics===
+
Related work areas in which I am not personally involved include:
* We have developed a tool called '''[http://www.forensicswiki.org/wiki/Frag_find frag_find]''' which can report if sectors of a TARGET file are present on a disk image. This is useful in cases where a TARGET file has been stolen and you wish to establish that the file has been present on a subject's drive. If most of the TARGET file's sectors are found on the IMAGE drive---and if the sectors are in consecutive sector runs---then the chances are excellent that the file was once there. Frag_find performs this search using time-and-space efficient data structures arranged in multiple filtering layers. The program deals with the problem of non-unique blocks by looking for runs of matching blocks, rather than individual blocks. Frag_find is part of the NPS Bloom package, which can be downloaded from http://www.afflib.org.
+
# Approaches for '''gisting''' and clustering documents based on their content.
 +
# Approaches that are tuned to human languages other than English.  
  
* bulk_extractor
 
 
* CDA tool
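
The run-matching idea described for frag_find above can be sketched as follows. This is a minimal illustration and not frag_find's actual implementation: a plain dictionary of sector hashes stands in for the layered filtering structures, and the function names, the 512-byte sector size, and the <code>min_run</code> threshold are assumptions made for the example.

```python
import hashlib
from collections import defaultdict

SECTOR = 512  # assumed sector size for this sketch


def sector_hashes(data, size=SECTOR):
    """MD5 digest of every complete sector in data."""
    return [hashlib.md5(data[i:i + size]).digest()
            for i in range(0, len(data) - size + 1, size)]


def find_runs(target, image, min_run=2):
    """Return (image_sector, target_sector, length) for each run of
    consecutive image sectors that matches consecutive target sectors."""
    index = defaultdict(list)  # sector hash -> target sector numbers
    for t, h in enumerate(sector_hashes(target)):
        index[h].append(t)

    runs = []
    active = {}  # last matched target sector -> (image_start, target_start, length)
    for i, h in enumerate(sector_hashes(image)):
        nxt = {}
        for t in index.get(h, ()):
            if t - 1 in active:  # continues a run from the previous image sector
                start_i, start_t, n = active[t - 1]
                nxt[t] = (start_i, start_t, n + 1)
            else:                # starts a new candidate run
                nxt[t] = (i, t, 1)
        # Runs that were not extended end here; keep only the long ones.
        for k, run in active.items():
            if k + 1 not in nxt and run[2] >= min_run:
                runs.append(run)
        active = nxt
    runs.extend(r for r in active.values() if r[2] >= min_run)
    return sorted(runs)
```

Requiring a minimum run length is what keeps common, non-unique sectors (for example, all-zero sectors) from being reported as evidence on their own.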
 
  
 
==Relevant Publications==
 
<references/>
 
See also:
+
 
* [http://www.simson.net/clips/academic/2008.ACSAC.Bloom.pdf “Practical Applications of Bloom filters to the NIST RDS and hard drive triage,”] Farrell, Garfinkel and White, ACSAC 2008
+
__NOTOC__
* [http://www.simson.net/clips/academic/2007.DFRWS.pdf "Carving Contiguous and Fragmented Files with Fast Object Validation"], Garfinkel, S., Digital Investigation, Volume 4, Supplement 1, September 2007, Pages 2--12.
 
* [http://www.simson.net/clips/academic/p42-garfinkel.pdf "Complete Delete vs. Time Machine Computing,"] Garfinkel, S., Operating Systems Review, ACM Special Interest Group on Operating Systems, January 2007.
 
* [http://www.simson.net/clips/academic/2006.DFRWS.pdf "Forensic Feature Extraction and Cross-Drive Analysis,"] Garfinkel, S., Digital Investigation, Volume 3, Supplement 1, September 2006, Pages 71--81.
 
* [http://www.simson.net/clips/academic/2006.CACM.AFF.pdf "AFF: A New Format for Storing Hard Drive Images,"] Garfinkel, S., Communications of the ACM, February, 2006.
 
* [http://www.simson.net/clips/academic/2006.CACM.digital_evidence.pdf "Standardizing Digital Evidence Storage,"] The Common Evidence Format Working Group (Carrier, B., Casey, E., Garfinkel, S., Kornblum, J., Hosmer, C., Rogers, M., and Turner, P.), Communications of the ACM, February, 2006.
 

Latest revision as of 12:40, 17 March 2018

Current Research Areas

One of my primary areas of research is the development of algorithms, techniques, and eventually tools for automating a wide variety of computer forensics tasks that are currently performed by trained analysts. Today analysts do much of this work with visualization tools that let them search for data on a hard drive or captured from a network and slowly construct a story that might be useful in a prosecution or in recovering from a security event. But as data volumes grow and the network environment becomes more complex, there is a need for more automated tools that can perform autonomous analysis and correlation.[1][2]

Today my research into this field of automated computer forensics covers these main areas:

  1. Small-block forensics --- Exploring approaches for working with data elements in the 4KiB to 64KiB range that are not aligned with file boundaries. This can be used in situations where an entire file is not available for reconstruction, or only a portion of a file is available for analysis. Small-block forensics can also enable approaches based on statistical sampling rather than full-content analysis.[3]
  2. Data-rich algorithms and approaches that are designed to work in environments where there is a large collection of data from multiple users, as can be the case in law enforcement, e-discovery, and internal corporate investigations. [4]
  3. Media/Web correlation --- Exploring opportunities for automatic correlation of information on hard drives with information that can be found on the web.
  4. Corpus Creation --- Developing realistic corpora that can be used in education and software development that do not contain personal information.[5]
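
The sampling idea behind small-block forensics (item 1 above) can be illustrated with a short sketch: hash the fixed-size blocks of a known file, then hash a random sample of blocks from a drive image and look for matches. This is a simplified illustration under stated assumptions, not the method from the cited paper; the 4096-byte block size, the choice of SHA-1, and all function names are picked for the example.

```python
import hashlib
import random

BLOCK = 4096  # assumed block size, the low end of the 4KiB to 64KiB range


def block_hash(block):
    """Hex SHA-1 digest of a single block."""
    return hashlib.sha1(block).hexdigest()


def known_block_hashes(file_bytes, size=BLOCK):
    """Hashes of every complete block of a file of interest."""
    return {block_hash(file_bytes[i:i + size])
            for i in range(0, len(file_bytes) - size + 1, size)}


def sample_hits(image, known, n_samples, rng, size=BLOCK):
    """Hash n_samples randomly chosen blocks of the image and return the
    block numbers whose hashes appear in the known set."""
    n_blocks = len(image) // size
    picks = rng.sample(range(n_blocks), min(n_samples, n_blocks))
    return sorted(b for b in picks
                  if block_hash(image[b * size:(b + 1) * size]) in known)
```

If a file occupies k of the N blocks on a drive, the chance that n independent random samples all miss it is roughly (1 - k/N)^n, which is why sampling can say something useful about a large drive without reading all of it.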

Related work areas in which I am not personally involved include:

  1. Approaches for gisting and clustering documents based on their content.
  2. Approaches that are tuned to human languages other than English.


Relevant Publications

  1. Garfinkel, S. "Document and Media Exploitation," ACM Queue, November/December 2007.
  2. Garfinkel, Simson, Digital Forensics Research: The Next 10 Years, DFRWS 2010, Portland, OR
  3. Simson Garfinkel, Vassil Roussev, Alex Nelson and Douglas White, Using purpose-built functions and block hashes to enable small block and sub-file forensics, DFRWS 2010, Portland, OR
  4. Garfinkel, S., Forensic Feature Extraction and Cross-Drive Analysis, The 6th Annual Digital Forensic Research Workshop, Lafayette, Indiana, August 14-16, 2006.
  5. Garfinkel, Farrell, Roussev and Dinolt, Bringing Science to Digital Forensics with Standardized Forensic Corpora, DFRWS 2009, Montreal, Canada. (slides)