IT: Special Case File Recovery - File Smaller Than Stripe Size

The scenario: A small but critical file has been deleted from a RAID set. Standard file recovery tools are not viable or don't exist. Can the file still be recovered? If you know a fragment of the file's contents and the file is small enough to fit within a single stripe, it may be possible.

In my case, I received a call after someone had deleted the last file system snapshot containing the file before realizing it was needed. Luckily, a reasonably unique string within the file was known, and the file was known to be much smaller than the RAID stripe size (128KB). So I set about writing a utility to scan a drive for the string and save all candidate stripes to a separate drive.

The basic idea was to save all possibilities as quickly as possible so the RAID set could resume read-write operation while recovery continued elsewhere. The entire stripe is saved on the belief that the actual file can be pruned from it at the user's leisure.

In this case, the file system was a ZFS pool, but it shouldn't matter as long as the media is accessible and not encrypted. Truth be told, I'm tempted to try to write a version that could read a contiguous file spanning several stripes, but I haven't invested the time yet.

Now that the background is out of the way, how does one use it? CAREFULLY

No, seriously. The idea is to tweak the code as needed, compile it, and point it at a disk like /dev/ad1s1e under FreeBSD. Because this thing is accessing a raw disk, there is always some concern that things will go wrong, so this is a use-at-your-own-risk situation.

Anyway, the tweaking:

#define STREAMOFFSET 0
#define BLOCKSIZE 131072
#define REPORTNBLOCKS 4096
const char * key = "magic key sequence";

Notice the three "#define" statements near the top of the code. These are the settings you are most likely to need to alter.

  • The STREAMOFFSET is how many bytes from the start of the device/disk/file the striping begins.
  • The BLOCKSIZE is the size of a stripe in bytes.
  • REPORTNBLOCKS is a lazy progress indicator. After this number of blocks has passed, a line will be written to standard output.
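To make the arithmetic behind these settings concrete, here is a small sketch. The helper names stripeStart, progressDue, and reportProgress are hypothetical, and the progress line format is invented; the real utility's output may look different. With the values shown above, a progress line would appear once every 4096 * 131072 bytes, which is 512 MiB of disk scanned.

```cpp
#include <cstdint>
#include <cstdio>

#define STREAMOFFSET 0
#define BLOCKSIZE 131072
#define REPORTNBLOCKS 4096

// Byte offset on the device where stripe number `n` begins.
std::uint64_t stripeStart(std::uint64_t n)
{
    return static_cast<std::uint64_t>(STREAMOFFSET)
         + n * static_cast<std::uint64_t>(BLOCKSIZE);
}

// True when a progress line is due after `blocksDone` blocks have been read.
bool progressDue(std::uint64_t blocksDone)
{
    return blocksDone != 0 && blocksDone % REPORTNBLOCKS == 0;
}

// Hypothetical progress line; the format is an assumption for illustration.
void reportProgress(std::uint64_t blocksDone)
{
    if (progressDue(blocksDone)) {
        std::printf("%llu blocks scanned, next stripe at byte %llu\n",
                    static_cast<unsigned long long>(blocksDone),
                    static_cast<unsigned long long>(stripeStart(blocksDone)));
    }
}
```

If the striping starts partway into the device, only STREAMOFFSET changes; every stripe boundary shifts by that many bytes.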

The last and most important thing to alter is the "key". If the known segment of the file is just an ASCII string, the code will work as written. If the fragment doesn't fit this condition, you'll need to do additional work.
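As one possible shape for that additional work, suppose the known fragment is arbitrary binary data, possibly containing zero bytes, which C string functions treat as a terminator. The sketch below carries the key length explicitly and searches raw bytes. The key bytes and the findKey helper are hypothetical and are not part of StreamSearch.cpp.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical binary key with an embedded 0x00 byte. Its length has to be
// tracked separately because strlen() would stop at the zero byte.
static const unsigned char binaryKey[] = { 0xDE, 0xAD, 0x00, 0xBE, 0xEF };
static const std::size_t binaryKeyLen = sizeof binaryKey;

// Returns the offset of the first occurrence of key in block, or -1 if the
// key does not occur. Works on raw bytes, so zero bytes are not special.
std::ptrdiff_t findKey(const unsigned char* block, std::size_t blockLen,
                       const unsigned char* key, std::size_t keyLen)
{
    if (keyLen == 0 || keyLen > blockLen) return -1;

    const unsigned char* end = block + blockLen;
    const unsigned char* hit = std::search(block, end, key, key + keyLen);
    return hit == end ? -1 : hit - block;
}
```

A call such as findKey(stripe, BLOCKSIZE, binaryKey, binaryKeyLen) would then stand in for whatever string comparison is used on each stripe.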

Attachment          Size
StreamSearch.cpp    2.31 KB