
No two data recovery situations are alike. Many times, it’s possible to completely recover lost files from a disk, including the original filenames and folder structure. Other times, the files and data may be recovered, but the filenames, date/timestamps and folder paths are lost. And in some cases, no intact files can be found. This raises a common question from our users: why?
To answer this question, it helps to have a basic understanding of how files are stored on the disk and how they can be recovered. While professional data recovery usually requires years of experience and deep knowledge of the technical nuances of file systems and disk physics, learning the basics can help you set reasonable expectations for your data recovery software.
In this article, we’ll take a very high-level look at how file recovery works. We’ll also show you how to apply this knowledge to a few common scenarios in order to estimate your chances for a successful file recovery.
How files are stored on the disk
For file recovery from SSD devices, see this important article: File Recovery Specifics for SSD devices.
To understand how files can be recovered from a disk, it helps to understand how files are stored on hard disks before they are lost.
Most modern operating systems divide (or “partition”) the entire physical hard drive into one or several independent parts (“partitions”). In the DOS/Windows-based OS families, these partitions are called “logical disks”. Logical disks are assigned drive letters and optional descriptive labels. For example, C: (System) or D: (Data). Each partition has its own file system type, independent from other partitions on the same physical disk. For example, a physical hard drive for a Windows system may contain two logical disks: one NTFS and another FAT32. Information about the partitions on the disk is stored at the beginning of the hard drive. This is usually referred to as a “partition table” or “partition map.”
A typical partition structure is shown in Figure 1.
Figure 1: Hard drive structure
The portion of Figure 1 that holds the hard drive service data and the info about partition structure is known as “meta-data”: information about the data on the disk, as opposed to the data itself. Similarly, each partition or logical disk is divided into two parts: one stores information about the disk (folder structure, file system, etc.) and the other stores the file data itself. This division of data from meta-data allows for better disk space management, faster file search and increased reliability.
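As a rough illustration of how partition information stored at the beginning of a disk can be read, the Python sketch below parses a classic MBR partition table. It assumes the legacy MBR layout (four 16-byte entries starting at byte 446 of the first sector, followed by the 0x55AA signature); disks partitioned with GPT use a different structure. The sample sector and the helper names parse_mbr and build_sample_mbr are made up for this example and work on bytes in memory, not on a real drive.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446            # the MBR partition table starts at this byte
ENTRY_SIZE = 16               # each of the four entries is 16 bytes long
BOOT_SIGNATURE = b"\x55\xaa"  # last two bytes of a valid MBR sector
ENTRY_FORMAT = "<B3xB3xII"    # status, (CHS skipped), type, (CHS skipped), first LBA, sector count

# A few well-known partition type codes (not a complete list).
TYPE_NAMES = {0x07: "NTFS/exFAT/HPFS", 0x0B: "FAT32 (CHS)", 0x0C: "FAT32 (LBA)"}


def parse_mbr(sector):
    """Return the non-empty partition entries found in a 512-byte MBR sector."""
    if len(sector) < SECTOR_SIZE or sector[510:512] != BOOT_SIGNATURE:
        raise ValueError("not a valid MBR sector")
    partitions = []
    for index in range(4):
        offset = TABLE_OFFSET + index * ENTRY_SIZE
        status, ptype, first_lba, count = struct.unpack_from(ENTRY_FORMAT, sector, offset)
        if ptype == 0:        # type 0 marks an unused table slot
            continue
        partitions.append({
            "index": index,
            "bootable": status == 0x80,
            "type": TYPE_NAMES.get(ptype, hex(ptype)),
            "first_lba": first_lba,
            "sectors": count,
        })
    return partitions


def build_sample_mbr():
    """Build an in-memory sample sector with one NTFS and one FAT32 partition."""
    sector = bytearray(SECTOR_SIZE)
    entries = [(0x80, 0x07, 2048, 204800), (0x00, 0x0C, 206848, 102400)]
    for index, entry in enumerate(entries):
        struct.pack_into(ENTRY_FORMAT, sector, TABLE_OFFSET + index * ENTRY_SIZE, *entry)
    sector[510:512] = BOOT_SIGNATURE
    return bytes(sector)


if __name__ == "__main__":
    for part in parse_mbr(build_sample_mbr()):
        print(part)
```

The sample mirrors the example above of one physical drive holding two logical disks, one NTFS and one FAT32. Reading the first sector of a real drive would require administrator rights and is deliberately left out of the sketch.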
Figure 2 shows a typical logical disk structure.
Figure 2: Logical disk structure
The disk service information shown in Figure 2 contains specific information about the partition size, file system type, etc. The computer needs this information to correctly locate data on the partition.
The info about files and folders contains file records that store filenames, sizes, date/times, and other technical information. This information also includes the exact physical locations (addresses) of the file data on the disk. This information is usually backed up on the drive itself, in case the first copy becomes corrupted.
Different file systems store this information in different ways. For example, the FAT file system keeps file records in directory entries and tracks the disk areas occupied by each file in a File Allocation Table (FAT), whereas the NTFS file system stores file records in a Master File Table (MFT).
When a computer needs to read a file, it first goes to the info about files and folders and searches for the record of that file. Then, it looks up the address of the file, goes to the specified place on the disk, and then reads the file data.
For contiguous files, where the data is grouped together on the disk, this process is very straightforward. However, files on the disk may be fragmented. That is, they may occupy several non-adjacent disk areas. This is more common than most users realize. After all, when you view a file from Windows Explorer or Finder, it is always represented as a single file. This is because the file system is doing all the work of piecing together the fragments behind the scenes. The info about files and folders stores the addresses of each fragmented piece of data so they can be quickly and reliably retrieved when the computer needs to read the file. This information and how it’s retrieved play an important role in file recovery.
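To make the lookup-then-read sequence more concrete, here is a minimal Python sketch of a fragmented file. Everything in it is a simplified assumption for illustration: the tiny 4-byte cluster size, the in-memory "disk", and the record layout (a name, a size, and a list of start-cluster/length pairs) do not correspond to any real file system format.

```python
CLUSTER_SIZE = 4  # bytes per cluster; unrealistically small to keep the example readable

# Toy data area: 10 clusters, initially filled with zero bytes.
disk = bytearray(CLUSTER_SIZE * 10)

# The file "HELLO WORLD" (11 bytes) is stored in two non-adjacent pieces.
disk[1 * CLUSTER_SIZE:3 * CLUSTER_SIZE] = b"HELLO WO"   # clusters 1-2
disk[6 * CLUSTER_SIZE:7 * CLUSTER_SIZE] = b"RLD\x00"    # cluster 6, padded with one zero byte

# Toy file record: the meta-data that tells the computer where the pieces are.
file_record = {
    "name": "greeting.txt",
    "size": 11,
    "extents": [(1, 2), (6, 1)],  # (first cluster, number of clusters)
}


def read_file(disk, record):
    """Follow the addresses in the file record and join the fragments in order."""
    chunks = []
    for first_cluster, cluster_count in record["extents"]:
        begin = first_cluster * CLUSTER_SIZE
        end = begin + cluster_count * CLUSTER_SIZE
        chunks.append(bytes(disk[begin:end]))
    # The last cluster is usually only partly used, so trim to the recorded size.
    return b"".join(chunks)[:record["size"]]


if __name__ == "__main__":
    print(read_file(disk, file_record))
```

Note that read_file can only reassemble the file because the record lists every fragment. Without that list, the two pieces are just unrelated clusters on the disk.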
When a computer deletes a file, it doesn’t immediately destroy the file’s data. Instead, it changes the info about files and folders to designate that the file has been deleted. Some operating systems simply mark the file as deleted, retaining all the meta-data about the file until it becomes necessary to overwrite it with meta-data about a new file. This is how Windows file systems handle deletions. Other operating systems, like Mac OS X, completely destroy the file record of the deleted file. While operating systems vary in whether they preserve or destroy the info about files and folders, all of them leave the actual file data untouched until that disk space needs to be allocated to another file. If no new files are written to the disk, the deleted file’s data, and any surviving information about the file, may remain there indefinitely.
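The two deletion styles described above can be sketched with a toy model in Python. The ToyVolume class and its method names are hypothetical and exist only for this illustration. Real file systems are far more involved, but the key point carries over: in both styles the bytes in the data area stay where they were.

```python
class ToyVolume:
    """A toy volume: a dictionary of file records plus a flat data area."""

    def __init__(self, size):
        self.data = bytearray(size)  # the part of the disk holding file data
        self.records = {}            # the "info about files and folders"
        self.next_free = 0           # next unused offset in the data area

    def write(self, name, content):
        offset = self.next_free
        self.data[offset:offset + len(content)] = content
        self.records[name] = {"offset": offset, "size": len(content), "deleted": False}
        self.next_free += len(content)

    def delete_by_marking(self, name):
        """Keep the record but flag it as deleted."""
        self.records[name]["deleted"] = True

    def delete_by_dropping(self, name):
        """Destroy the record entirely."""
        del self.records[name]

    def list_files(self):
        """What the user sees: only records not flagged as deleted."""
        return sorted(n for n, r in self.records.items() if not r["deleted"])


if __name__ == "__main__":
    volume = ToyVolume(64)
    volume.write("a.txt", b"alpha")
    volume.write("b.txt", b"bravo")
    volume.delete_by_marking("a.txt")
    volume.delete_by_dropping("b.txt")

    print(volume.list_files())       # neither file is visible any more
    print("a.txt" in volume.records) # the marked record still exists
    print("b.txt" in volume.records) # the dropped record is gone
    print(b"alpha" in volume.data, b"bravo" in volume.data)  # both payloads are still on "disk"
```

After the marking-style deletion, the name, size and location of a.txt can still be read from its record. After the dropping-style deletion, the contents of b.txt are still present, but nothing in the records says where they are or what the file was called.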
As noted above, the portion of the disk that stores the file data also contains a backup copy of the info about files and folders. This part of the disk may also contain some additional pieces of information about file and folder structure scattered across the entire disk.
File recovery methods
Before we discuss the different methods for file recovery, it’s important to note one thing:
If the data on the disk is overwritten, then the old data is gone. No program or commercially available data recovery method can recover it.
This is why it’s of the utmost importance that no new files are written to a disk prior to attempting a data recovery.
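To see why new writes matter so much, consider this small Python sketch. It compares the disk areas once used by a deleted file with the areas now occupied by existing files and reports how much of the deleted file could still be intact. The function names and the (first cluster, number of clusters) representation are assumptions made for this example, not the way any particular recovery program works.

```python
def clusters_of(extents):
    """Expand (first cluster, number of clusters) pairs into a set of cluster numbers."""
    return {
        cluster
        for first, count in extents
        for cluster in range(first, first + count)
    }


def recovery_outlook(deleted_extents, live_extents):
    """Classify a deleted file by how many of its clusters have been reused."""
    old = clusters_of(deleted_extents)
    reused = old & clusters_of(live_extents)
    if not reused:
        return "intact"       # nothing was written over the old data
    if reused == old:
        return "overwritten"  # every cluster now belongs to other files
    return "partial"          # some of the old data is gone for good


if __name__ == "__main__":
    deleted_file = [(10, 4)]  # the deleted file used clusters 10 to 13
    print(recovery_outlook(deleted_file, [(0, 5)]))    # new files written elsewhere
    print(recovery_outlook(deleted_file, [(12, 10)]))  # new file covers clusters 12 and 13
    print(recovery_outlook(deleted_file, [(8, 8)]))    # new file covers clusters 10 to 13
```

Each file saved to the disk after the loss can move a deleted file from the first category toward the last, which is why the disk should be left alone until recovery has been attempted.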
For files that have not been overwritten, there are two file recovery methods. All data recovery software uses one or both of these techniques.
Method One: File recovery through analysis of the info about files and folders
A file recovery program attempts this method first because, when it succeeds, it recovers files with their original names, paths, date/time stamps, and data.
The file recovery software starts by trying to read and process the first copy of the info about files and folders. In some cases (such as accidental file deletion), this is the only step that needs to be taken in order to recover the files in their entirety.
If the first copy of the info about files and folders is severely damaged, the software scans the disk for the second copy of the info about files and folders. It also attempts to glean additional information about the folder and file structure that may be on the data part of the disk. Then, it processes all this information to reconstruct the original folder and file structure.
If the file system on the disk isn’t severely damaged, it is often possible to recover the entire file and folder structure.
If the file system on the disk is severely damaged, this recovery method cannot recreate the entire folder structure. The recovered files will then appear in “orphaned” folders.
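The general idea behind this method can be sketched in a few lines of Python. The sketch is a loose illustration under simplifying assumptions, not the algorithm of any particular product: each surviving file record is assumed to carry an id, the id of its parent folder, and a name; an unreadable first copy is represented by None; and the orphaned_<id> folder naming is invented for the example.

```python
ROOT_ID = 0  # assumed id of the root folder


def load_records(first_copy, second_copy):
    """Prefer the first copy of the file records; fall back to the second copy."""
    return first_copy if first_copy is not None else second_copy


def rebuild_paths(records):
    """Rebuild a full path for every record by walking up its chain of parents."""
    by_id = {record["id"]: record for record in records}
    paths = {}
    for record in records:
        parts = [record["name"]]
        parent_id = record["parent"]
        seen = {record["id"]}
        while parent_id != ROOT_ID:
            parent = by_id.get(parent_id)
            if parent is None or parent_id in seen:
                # The parent folder's record is lost (or the chain is corrupt):
                # attach what we have to a placeholder "orphaned" folder.
                parts.append(f"orphaned_{parent_id}")
                break
            seen.add(parent_id)
            parts.append(parent["name"])
            parent_id = parent["parent"]
        paths[record["id"]] = "/" + "/".join(reversed(parts))
    return paths


if __name__ == "__main__":
    second_copy = [
        {"id": 1, "parent": 0, "name": "Docs"},
        {"id": 2, "parent": 1, "name": "report.txt"},
        {"id": 3, "parent": 9, "name": "photo.jpg"},  # the record for folder 9 did not survive
    ]
    records = load_records(None, second_copy)  # the first copy is unreadable here
    for path in rebuild_paths(records).values():
        print(path)
```

In this example report.txt gets its original path back because the record of its parent folder survived, while photo.jpg is recovered with its name and data but lands in a placeholder folder because the record of its parent folder is missing.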