Foremost is a console program to recover files based on their headers, footers, and internal data structures. This process is commonly referred to as data carving. Foremost can work on image files, such as those generated by dd, Safeback, EnCase, etc., or directly on a drive.
The headers and footers can be specified by a configuration file or you can use command line switches to specify built-in file types. These built-in types look at the data structures of a given file format allowing for a more reliable and faster recovery.
- Search for jpeg format skipping the first 100 blocks
foremost -s 100 -t jpg -i image.dd
- Only generate an audit file (-w), and print to the screen (verbose mode)
foremost -wv image.dd
- Search all defined types
foremost -t all -i image.dd
- Search for GIF and PDF files
foremost -t gif,pdf -i image.dd
- Search for office documents and jpeg files in a Unix file system in verbose mode.
foremost -vd -t ole,jpeg -i image.dd
- Run the default case
foremost image.dd
As a first step, create an empty directory for the recovered files by typing “mkdir <folder name>” and give it 777 permissions with “chmod 777 <folder name>”.
Data sets in digital investigations and forensic research usually consist of a forensic image of a target device, for example a bitwise copy of a computer’s hard drive. However, to evaluate file carving tools correctly and produce reliable results, detailed knowledge of the data contained within the data set is essential.
Therefore, purpose-built data sets for testing file carving tools were used. Each data set has extensive documentation including the following details:
- File name
- File type
- MD5 hash value
- File location (offset)
- File scenario.
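Documentation of this kind makes it possible to score a carver automatically. The following is a minimal sketch, assuming the documentation has been loaded into records with the five fields listed above; the `ManifestEntry` class and both function names are hypothetical and not part of foremost or of any particular data set.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ManifestEntry:
    """One documented file in a test data set (field names are illustrative)."""
    name: str
    file_type: str
    md5: str
    offset: int
    scenario: str


def md5_of_region(image: bytes, offset: int, length: int) -> str:
    """Hash the bytes of the image that the documentation says belong to a file."""
    return hashlib.md5(image[offset:offset + length]).hexdigest()


def recovered_names(entries, carved_md5s):
    """Return the names of documented files whose MD5 matches a carved file."""
    carved = {h.lower() for h in carved_md5s}
    return [e.name for e in entries if e.md5.lower() in carved]


# Tiny synthetic "image": padding, a 5-byte file at offset 16, more padding.
image = b"\x00" * 16 + b"HELLO" + b"\x00" * 16
entry = ManifestEntry("hello.txt", "txt",
                      hashlib.md5(b"HELLO").hexdigest(), 16, "contiguous")
```

A carved file counts as correctly recovered here only if its hash matches exactly, so a file carved with trailing garbage would be reported as missed.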
What is Data Carving?
Data carving is the black art of creating order out of chaos. The theory is you take a blob of electronic data, search it for file signatures that may indicate user-created documents and e-mail, and then “carve” that data out of the blob into the software’s best guess of how the file used to look.
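The "search it for file signatures" step can be illustrated with a few lines of Python. This is only a sketch of the idea, not how foremost is implemented; the `SIGNATURES` table holds a handful of well-known magic numbers, and the `find_signatures` name is made up for this example.

```python
# A few well-known file signatures (magic numbers).
SIGNATURES = {
    "jpg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
    "pdf": b"%PDF-",
    "gif": b"GIF8",
}


def find_signatures(blob: bytes, signatures=SIGNATURES):
    """Return a sorted list of (offset, type) for every signature hit in blob."""
    hits = []
    for name, magic in signatures.items():
        start = blob.find(magic)
        while start != -1:
            hits.append((start, name))
            start = blob.find(magic, start + 1)
    return sorted(hits)
```

Each hit is only a candidate start of a file; deciding where the file ends, and whether the hit is a false positive, is the job of the carving techniques described later.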
Navigate to the Downloads directory and unzip the file by typing “unzip <file name>”, which extracts the .dd file that we need as input to foremost.
Installation of Foremost
- If you are using Kali Linux, you don’t need to install foremost; simply type “apt-get update” and then run foremost from the terminal.
- If you are using another distro such as Ubuntu, you can easily install foremost by typing “sudo apt-get install foremost”.
The syntax for using foremost is:
foremost -i <your dd file> -o <output folder> -v
Here -v enables verbose mode. With -t you can also restrict recovery to particular file types, for example “-t jpeg”; “-t all” selects all built-in types.
In our case, our dd file is located at /root/Downloads/11-carve-fat/11-carve-fat.dd.
foremost -i /root/Downloads/11-carve-fat/11-carve-fat.dd -o yeahhub-recovery -v
Note: Make sure that your output directory is completely empty; otherwise foremost will run into problems.
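If you prefer to drive foremost from a script, the same invocation and the empty-directory check can be wrapped as follows. This is a minimal sketch that uses only the -i, -o, -t and -v switches described above; the function names are hypothetical, and `run_foremost` assumes the foremost binary is on your PATH.

```python
import os
import subprocess


def build_foremost_command(image_path, output_dir, types=None, verbose=True):
    """Assemble the argument list for: foremost -i <image> -o <dir> [-t types] [-v]."""
    cmd = ["foremost", "-i", image_path, "-o", output_dir]
    if types:
        cmd += ["-t", ",".join(types)]
    if verbose:
        cmd.append("-v")
    return cmd


def prepare_output_dir(output_dir):
    """Create the output directory if needed and refuse to reuse a non-empty one."""
    os.makedirs(output_dir, exist_ok=True)
    if os.listdir(output_dir):
        raise RuntimeError(f"{output_dir} is not empty; choose another directory")
    return output_dir


def run_foremost(image_path, output_dir, types=None):
    """Run foremost; raises CalledProcessError if it exits with a non-zero status."""
    prepare_output_dir(output_dir)
    cmd = build_foremost_command(image_path, output_dir, types)
    return subprocess.run(cmd, check=True)
```

For the example in this article, the call would be `run_foremost("/root/Downloads/11-carve-fat/11-carve-fat.dd", "yeahhub-recovery")`.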
Here you can see that foremost recovered the files from the .dd image and summarized them on screen with details such as Name, Number, Size, File Offset, and Comment.
Once foremost has completed the carving process, you can find the final report, audit.txt, in the output folder (yeahhub-recovery in our case) and view it by typing “cat audit.txt”.
If you open the output directory, you can see the carved items, categorized by file type (JPEG, WAV, MP3, etc.), along with the audit.txt file, which contains the details of the findings.
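To get a quick count of what was recovered without opening each folder, a short script can walk the output directory. The sketch below assumes the layout described above, with one subdirectory per file type next to audit.txt; the `summarize_output` name is made up for this example.

```python
import os


def summarize_output(output_dir):
    """Return {subdirectory name: number of files in it} for a carving output folder."""
    counts = {}
    for entry in os.scandir(output_dir):
        if entry.is_dir():
            counts[entry.name] = sum(
                1 for item in os.scandir(entry.path) if item.is_file()
            )
    return counts
```

Calling `summarize_output("yeahhub-recovery")` would return a dictionary keyed by the type folders found there; files in the top level, such as audit.txt, are ignored.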
Other File Carving Techniques (From Source)
1. Header-Based Carving: Files have unique headers, also known as magic numbers or file signatures. These unique values can be used to help identify the beginning of a file and aid in carving files without the corresponding metadata. Header-footer carving is the most basic carving technique; it searches the data for patterns that mark a distinct header (start of file marker) and footer (end of file marker).
The carver extracts all data from the header through the footer and copies it into an external file. An alternative header-based carving technique is header-maximum size carving. When a header is discovered (with no footer value available), the maximum carve size is used to calculate how far away from the header the end of the file might be.
As some file types can vary dramatically in size, this technique can have varying results and can also increase the size needed to store recovered files.
However, it remains a viable approach because many file formats (e.g. JPEG, MP3) are not affected if additional data is appended to the end of a valid file. Another header-based carving technique is header-embedded length carving. Some file formats have internal file information which specifies the length, or size, of the file and provides an identified point for the footer of the file.
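The three header-based variants can be compared side by side in a short sketch. It is illustrative only and makes no claim about foremost's internals: the function names are hypothetical, the JPEG markers are the standard start-of-image and end-of-image bytes, and BMP is used for the embedded-length case because its header stores the total file size as a little-endian 32-bit integer at offset 2.

```python
import struct

JPEG_HEADER = b"\xff\xd8\xff"
JPEG_FOOTER = b"\xff\xd9"


def carve_header_footer(blob, header, footer, max_size):
    """Header-footer carving: cut from each header through the first footer
    found within max_size bytes of that header."""
    carved = []
    pos = blob.find(header)
    while pos != -1:
        end = blob.find(footer, pos + len(header), pos + max_size)
        if end != -1:
            carved.append(blob[pos:end + len(footer)])
            pos = blob.find(header, end + len(footer))
        else:
            pos = blob.find(header, pos + 1)
    return carved


def carve_header_max_size(blob, header, max_size):
    """Header-maximum size carving: no footer is known, so take max_size bytes
    (or whatever remains) starting at each header."""
    carved = []
    pos = blob.find(header)
    while pos != -1:
        carved.append(blob[pos:pos + max_size])
        pos = blob.find(header, pos + 1)
    return carved


def carve_bmp_embedded_length(blob):
    """Header-embedded length carving: read the file size stored in the BMP header."""
    carved = []
    pos = blob.find(b"BM")
    while pos != -1:
        if pos + 6 <= len(blob):
            (size,) = struct.unpack_from("<I", blob, pos + 2)
            # 14 bytes is the size of the BMP file header itself.
            if 14 <= size <= len(blob) - pos:
                carved.append(blob[pos:pos + size])
        pos = blob.find(b"BM", pos + 1)
    return carved
```

The maximum-size variant is the one that inflates storage: every hit costs `max_size` bytes regardless of how small the original file was.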
2. File Structure Carving: Another file carving technique is based on the internal structure of a file, where specific knowledge of the contents can help reconstruct the original file. File structure carving is primarily aimed towards assembling fragmented files, where header-based carving fails to reconstruct multiple file fragments. An example is semantic carvers (also known as deep carvers) which use information about the internal file structure to control the carving process in some way.
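As a small illustration of using internal structure to control carving, the sketch below walks the chunk layout of a PNG file (an 8-byte signature, then chunks made of a 4-byte big-endian length, a 4-byte type, the data, and a 4-byte CRC, ending with an IEND chunk). It does not reassemble fragments; it only shows how structural knowledge can tell a carver where a file ends or where its structure breaks. The `png_extent` name is hypothetical, and CRCs are deliberately not checked here.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_extent(blob, start):
    """Return the end offset (exclusive) of the PNG that begins at start,
    or None if the chunk structure breaks before an IEND chunk is reached."""
    if blob[start:start + 8] != PNG_SIGNATURE:
        return None
    pos = start + 8
    # Every chunk needs at least 12 bytes: length, type, and CRC.
    while pos + 12 <= len(blob):
        length, chunk_type = struct.unpack_from(">I4s", blob, pos)
        end = pos + 12 + length
        if end > len(blob) or not chunk_type.isalpha():
            return None  # implausible chunk: truncated or fragmented data
        if chunk_type == b"IEND":
            return end
        pos = end
    return None
```

A `None` result partway through a file is the point at which a more capable structure-aware carver might start looking elsewhere in the image for the next fragment.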
3. Block-Based Carving: An advanced carving technique is block-based carving, which computes metadata about the content of each data block; for example, by implementing character counts or calculating statistical information. The premise is that storage devices store data in fixed-size blocks (sectors, usually 512 bytes), so file carvers can examine every block against every file type definition.
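One common statistic for this purpose is the Shannon entropy of each block's byte counts. The sketch below is one possible instance of the idea rather than a specific tool's method; the thresholds of 7.0 and 1.0 bits per byte and the three labels are assumptions chosen for illustration.

```python
import math
from collections import Counter

BLOCK_SIZE = 512  # typical sector size


def shannon_entropy(block: bytes) -> float:
    """Entropy of the byte distribution in bits per byte (0.0 to 8.0)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def classify_blocks(blob, block_size=BLOCK_SIZE, high=7.0, low=1.0):
    """Label every block by entropy; the thresholds are illustrative assumptions."""
    labels = []
    for offset in range(0, len(blob), block_size):
        entropy = shannon_entropy(blob[offset:offset + block_size])
        if entropy >= high:
            label = "compressed-or-encrypted"
        elif entropy <= low:
            label = "empty-or-padding"
        else:
            label = "text-or-structured"
        labels.append((offset, label))
    return labels
```

Runs of adjacent blocks with the same label hint at where one file's content stops and another's begins, even when no header or footer is present.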
4. File Validation: The method of file validation is an integral aspect of the file carving process. Validation confirms that the carved data actually forms a valid file. An automated format validator is therefore a function that accepts a block of data and determines whether it conforms to the defined structure of the file format; only data that passes this check is output as a validated file.
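A validator in this sense can be very small. The following sketch checks a carved JPEG candidate: it must start with the start-of-image marker, end with the end-of-image marker, and have well-formed marker segments up to the start-of-scan marker. It is deliberately shallow (it ignores fill bytes and does not decode image data), and the `looks_like_valid_jpeg` name is hypothetical.

```python
import struct


def looks_like_valid_jpeg(data: bytes) -> bool:
    """Shallow structural check of a carved JPEG candidate."""
    if len(data) < 4 or data[:2] != b"\xff\xd8" or data[-2:] != b"\xff\xd9":
        return False
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return False  # every segment must begin with a 0xFF marker byte
        marker = data[pos + 1]
        if marker == 0xDA:
            return True  # start of scan reached: header segments were consistent
        (segment_length,) = struct.unpack_from(">H", data, pos + 2)
        if segment_length < 2:
            return False  # the length field counts its own two bytes
        pos += 2 + segment_length
    return False
```

Running such a check on every carved file lets a carver discard false positives, for example a header-maximum size carve that began at a chance occurrence of the signature bytes.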