The Path to our File
An Index file simply lists the directories and files it contains, providing the information we’ve seen and, above all, their entries in the MFT. It’s in those MFT entries that the files and indexes are defined, as well as the location of their data.
- When the Index record is a node (directory), its MFT entry will point to another Index.
- When it’s a file, its MFT entry will contain its data (if it is a very small file) or point to the location where its data lies, as we’ve seen.
In our case, record 22 contains a reference to a node in the tree (directory) which is part of the branch of our path. Thus, its MFT entry shall point to the external location where the referred Index lies.
As we can now see, the Index files are the ones that describe the content of the tree nodes, i.e. they tell us where the branches leaving each node lead.
Let’s verify that this is so.
According to what we read in that 22nd record, its description is in MFT entry 51BEh, or 20,926. Let’s translate this into an offset. As we’ve seen in the $MFT file description, where we located the MFT:
- The MFT 1st part has 16 records, numbered 0 to 15, which occupy 4 clusters.
- The MFT 2nd part begins at the offset 11481000h and starts with record 16.
What we want to find is the beginning of record 20,926, which is in the MFT 2nd part. Since record numbering starts at 0, MFT entry 20,926 has 20,926 entries before it. The first 16 of those (records 0 to 15) are in the MFT 1st part, so in the MFT 2nd part there are 20,926 - 16 = 20,910 entries before this one. As each entry uses 1,024 Bytes, its offset shall be 11481000h + 146B800h (1,024 x 20,910) = 128EC800h.
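The calculation above can be sketched in a few lines of Python. This is only an illustration: the function name and the constants are ours, the values are the ones found on this particular volume, and we assume the MFT 2nd part is stored contiguously.

```python
# Values taken from this walk-through; they are specific to this volume.
MFT_PART2_OFFSET = 0x11481000  # byte offset where the MFT 2nd part begins
FIRST_RECORD_PART2 = 16        # records 0..15 live in the MFT 1st part
RECORD_SIZE = 1024             # bytes used by each MFT entry


def mft_entry_offset(entry: int) -> int:
    """Byte offset of an MFT entry stored in the (contiguous) MFT 2nd part."""
    if entry < FIRST_RECORD_PART2:
        raise ValueError("entries 0..15 are in the MFT 1st part")
    return MFT_PART2_OFFSET + (entry - FIRST_RECORD_PART2) * RECORD_SIZE


print(f"{mft_entry_offset(0x51BE):X}h")  # 128EC800h
```

The same function can be reused for every MFT entry number we meet further along the path.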
“Todas as Imagens” – MFT Entry Offset 128EC800h
Indeed, travelling to that offset, we find there the MFT entry for our directory “Todas as Imagens”, whose hexadecimal editor representation we can see in figure 41.
At the record header we can read the usual information. Try to read it by yourselves, based on what we’ve seen till now.
The 1st attribute is a $STANDARD_INFORMATION, type 10h, resident, unnamed, providing the usual information about date/time, about the attribute itself, and other items such as keys for other MFT files like $Secure, $Quota and $UsnJrnl, which we haven’t covered yet as they have no connection with our purpose.
The 2nd attribute is a $FILE_NAME, type 30h, which, besides the usual information about itself and the usual date/times, provides us with the file name in the namespace DOS – “TODASA~1”.
The 3rd attribute is a $FILE_NAME, type 30h, identical to the previous one but giving us the file name in the namespace Win32 – “Todas as Imagens”.
The 4th attribute is an $INDEX_ROOT, type 90h, resident and named, and it is a large Index, thus needing external allocation. Its name is $I30, which designates a directory index. We can’t read anything about the index here, since it is externally allocated. Let’s go to the next attribute.
The 5th attribute is an $INDEX_ALLOCATION, type A0h, non-resident, named, with allocated, real and initial sizes of 1000h, or 1 Cluster. Its name is $I30 and the data run chain which defines and locates it is as follows:
41 01 D2 95 1F 01 00 FF
where we can see that it has only one data run with:
- 1 Byte for the size in clusters, which is 01h
- 4 Bytes for its offset in clusters, 011F95D2h (stored on disk in reverse byte order as D2 95 1F 01), i.e. 18,847,186 clusters, corresponding to an offset of 11F95D2000h.
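The reading of a data run can be sketched as follows. This is a minimal illustration, not a complete parser: the function name is ours, sparse runs (those with no offset bytes) are not treated specially, and the cluster size is the one of this volume (8 sectors of 512 Bytes). The first byte of each run is split in two halves: the low half tells how many bytes hold the run size, the high half how many bytes hold the run offset.

```python
CLUSTER_SIZE = 8 * 512  # 8 sectors of 512 Bytes, as on this volume


def decode_data_runs(raw: bytes):
    """Return a list of (start_cluster, cluster_count) for a data run chain."""
    runs = []
    pos = 0
    previous_start = 0
    while pos < len(raw) and raw[pos] != 0x00:  # a 00 byte ends the chain
        header = raw[pos]
        size_bytes = header & 0x0F   # low half: bytes used by the run size
        offset_bytes = header >> 4   # high half: bytes used by the run offset
        pos += 1
        count = int.from_bytes(raw[pos:pos + size_bytes], "little")
        pos += size_bytes
        # The offset is signed and relative to the start of the previous run.
        delta = int.from_bytes(raw[pos:pos + offset_bytes], "little", signed=True)
        pos += offset_bytes
        previous_start += delta
        runs.append((previous_start, count))
    return runs


runs = decode_data_runs(bytes.fromhex("4101D2951F0100FF"))
start, count = runs[0]
print(count, f"{start:X}h", f"{start * CLUSTER_SIZE:X}h")  # 1 11F95D2h 11F95D2000h
```

With a single run the relative offset is simply the absolute one; the running sum only matters for fragmented files, where each run is located relative to the previous one.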
The 6th attribute is a $BITMAP, type B0h, resident and named, telling us in the form of a bitmap which clusters are used. Its value is 01h, telling us that the only cluster is occupied.
The bytes from 128ECA58h to 128ECA5Bh designate the termination of the attribute chain.
Let’s go to the offset 11F95D2000h, where according to the MFT information we can find the “Todas as Imagens” Directory Index, and see what it indexes and the respective MFT entries.
“Todas as Imagens” – Directory Index Offset 11F95D2000h
We are finally at the index itself, i.e. the place that lists all the files and nodes (directories) pointed to by this node (directory). Indeed, at the designated offset we find the INDEX that defines this node (directory).
Figure 42 shows the folder as displayed by Windows Explorer (recall that the next node (directory) in our path is “Diversos Pessoais”), and figure 43 shows the hexadecimal editor representation of the significant part of this cluster.
The header beginning with the word INDX, the file signature, defines an index. It occupies 1 cluster.
The 1st record names the directory “Diversos Pessoais” in the namespace Win32, tells us it is described in the 5271h MFT entry and that its Parent directory is in the 51BEh MFT entry, precisely the one where we are coming from.
The 2nd record describes the directory DIVERS~1, the same directory as before, but now under the name the system automatically derives from “Diversos Pessoais” for the namespace DOS.
The 3rd record describes the directory “Empresa” in namespace Win32 and DOS, as the name has no more than 8 characters.
The 4th record describes the directory “Fotografias Digitalizadas” in the namespace Win32.
It’s important to notice that what is written on the disk is not “Digitalizadas” but “Digitaliza.as”. Here we can see in practice the concept of the update sequence (06 00 = . .) and of the update number (64 00 = d .), the value it replaces at the end of each sector. The hexadecimal editor reads the disk literally and knows nothing about updates, so it shows exactly what is recorded there. The OS, guided by the information in the header, reads the name correctly: although 06 00 (the update sequence) is recorded on the disk, the OS reads it as 64 00 (the update number), because it knows those bytes sit at the end of a sector and that 06 00 there is the update sequence.
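What the OS does when it reads such a record can be sketched like this. It is a simplified illustration on a synthetic 2-sector record (the real INDX record here occupies a whole cluster): the function name and the test data are ours, and we assume the usual header layout, with the offset to the update sequence at byte 4 and its length in words at byte 6.

```python
import struct

SECTOR_SIZE = 512


def apply_fixups(record: bytes) -> bytes:
    """Put the original bytes back at the end of each sector of a record."""
    buf = bytearray(record)
    seq_offset, seq_words = struct.unpack_from("<HH", buf, 4)
    # First word: the update sequence, i.e. what is recorded on disk
    # at the end of every sector (06 00 in our example).
    marker = buf[seq_offset:seq_offset + 2]
    for i in range(1, seq_words):
        # Following words: the update numbers, i.e. the original bytes
        # of sector i (64 00 for the sector cutting "Digitalizadas").
        original = buf[seq_offset + 2 * i:seq_offset + 2 * i + 2]
        end = i * SECTOR_SIZE
        if buf[end - 2:end] != marker:
            raise ValueError(f"sector {i} does not end with the update sequence")
        buf[end - 2:end] = original
    return bytes(buf)


# Synthetic record: 2 sectors, update sequence stored at offset 28h.
record = bytearray(2 * SECTOR_SIZE)
record[0:4] = b"INDX"
struct.pack_into("<HH", record, 4, 0x28, 3)          # offset 28h, 3 words
record[0x28:0x2E] = bytes.fromhex("060064000000")    # 06 00 | 64 00 | 00 00
record[510:512] = bytes.fromhex("0600")              # end of sector 1 on disk
record[1022:1024] = bytes.fromhex("0600")            # end of sector 2 on disk

fixed = apply_fixups(bytes(record))
print(fixed[510:512])  # b'd\x00', the letter "d" of "Digitalizadas"
```

If a sector does not end with the expected value, the record was only partially written, which is exactly what this mechanism is meant to detect.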
The 5th record describes the same directory FOTOGR~1 in the namespace DOS.
The 6th record describes the file “SyncToy 114005b4-45e8-4bc6-964a-9edcfbdcb2e2.dat” in the namespace Win32. This is a file created by a synchronizing program.
The 7th record describes the same “SYNCTO~1” in the namespace DOS.
The 8th record describes the directory “TT” in the namespace Win32 and DOS.
The 9th record doesn’t point to anything in the MFT, has a length of 10h, and its flag 2 tells us it is the last one.
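The reading of one such record can be sketched as below. This is a minimal illustration on a hand-built record, not on the real bytes of figure 43: the function name is ours, the sequence numbers and the zeroed date/times are placeholders, and we assume the usual layout of a directory index record (8 bytes of MFT reference, 2 bytes of record length, 2 bytes of content length, 4 bytes of flags, followed by a copy of the $FILE_NAME data).

```python
import struct

NAMESPACES = {0: "POSIX", 1: "Win32", 2: "DOS", 3: "Win32 and DOS"}
ENTRY_MASK = 0xFFFFFFFFFFFF  # the low 6 bytes of a reference hold the entry number


def parse_index_record(buf: bytes, pos: int = 0) -> dict:
    """Decode the fields of one directory index record used in this walk-through."""
    ref, length, content_len, flags = struct.unpack_from("<QHHI", buf, pos)
    record = {"mft_entry": ref & ENTRY_MASK, "length": length,
              "is_last": bool(flags & 0x02)}
    if content_len:  # the last record carries no $FILE_NAME content
        body = pos + 16
        (parent_ref,) = struct.unpack_from("<Q", buf, body)
        name_len, namespace = struct.unpack_from("<BB", buf, body + 64)
        name = buf[body + 66:body + 66 + 2 * name_len].decode("utf-16-le")
        record.update(parent=parent_ref & ENTRY_MASK, name=name,
                      namespace=NAMESPACES.get(namespace, "unknown"))
    return record


# Hand-built record for "Diversos Pessoais" (dates and sizes left at zero).
name = "Diversos Pessoais".encode("utf-16-le")
content = (struct.pack("<Q", (1 << 48) | 0x51BE)        # Parent directory
           + bytes(56)                                    # dates, sizes, flags
           + struct.pack("<BB", len(name) // 2, 1)        # length, namespace Win32
           + name)
length = (16 + len(content) + 7) & ~7                     # padded to 8 bytes
sample = struct.pack("<QHHI", (1 << 48) | 0x5271, length, len(content), 0) + content
sample = sample.ljust(length, b"\x00")
last = struct.pack("<QHHI", 0, 0x10, 0, 2)                # the terminating record

print(parse_index_record(sample))
print(parse_index_record(last))
```

Walking a whole index is then a matter of starting at the first record and advancing by each record’s length until one with the flag 2 is found.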
Let’s follow our path. We go to record 1, the one that describes the directory “Diversos Pessoais”, the next node in our path, and analyze it in detail. Using the methods we’ve seen before for analyzing an index entry, we can verify that record 1 tells us that the next node is described in the MFT entry 5271h or 21,105.
Let’s do the usual calculations. The MFT 2nd part begins at the offset 11481000h, and the first record there is always preceded by the 16 records of the 1st part. So the record we want begins 21,105 - 16 = 21,089 records after the offset 11481000h (we recall that entry counting starts at 0, thus the MFT entry 21,105 is the 21,106th record). So, 21,089 x 1,024 = 21,595,136 = 1498400h. Adding this value to the starting offset, 11481000h + 1498400h = 12919400h, we get the offset where we have to go in order to proceed with our search. And indeed, when we got there we found what we were looking for.
“Diversos Pessoais” – MFT Entry – Offset 12919400h
From now on we will be much more concise in the descriptions, to avoid repeating ourselves. We assume that you are able to do them by yourselves. Let’s follow the description with figure 12-44.
- At the beginning we have the header giving us the usual information about the file,
- the $STANDARD_INFORMATION attribute,
- a $FILE_NAME attribute giving us the name “DIVERS~1” in the namespace DOS,
- a $FILE_NAME attribute giving us the name “Diversos Pessoais” in the namespace Win32,
- an $INDEX_ROOT attribute sending us to an external index allocation,
- an $INDEX_ALLOCATION attribute giving us this node (directory) location,
- a $BITMAP attribute and at last
- the attributes and file Terminator FFFFFFFF.
Let’s analyze the $INDEX_ALLOCATION attribute, as it is the one which gives us the path we must take to get to our destination. It has one data run
41 01 D9 B1 0C 01 00 00
telling us that:
- 1 byte for the size in clusters, which is 01h and
- 4 bytes defining its offset in clusters, which is 010CB1D9h = 17,609,177 clusters or a global offset of 17,609,177 x 8 x 512 = 72,127,188,992 = 10CB1D9000h, the offset where we are going next.
“Diversos Pessoais” – Directory Index – Offset 10CB1D9000h
There we find the INDX defining the composition of our directory “Diversos Pessoais”, whose Windows Explorer graphic representation we can see in figure 12-45 and whose hexadecimal editor representation we can see in figure 12-46.
- The header with the INDX signature.
- Anuário described in the namespace Win32 and DOS.
- Diversos 1, in the namespace Win32.
- Diversos 2, in the namespace Win32
- Diversos 3, in the namespace Win32
- Diversos 4, in the namespace Win32
- Diversos 1, in the namespace DOS as DIVERS~1
- Diversos 2, in the namespace DOS as DIVERS~2
- Diversos 3, in the namespace DOS as DIVERS~3
- Diversos 4, in the namespace DOS as DIVERS~4
- Viagens, in the namespace Win32 and DOS.
- The Terminator
Let’s look into the directory Diversos 1, which is included in our path. According to what we can read, its description can be found in the MFT entry 528Ah = 21,130.
Acting as usual, the offset where that record begins will be 11481000h + 149E800h ((21,130-16) x 1,024) = 1291F800h, which is the offset we are heading to on our journey in search of such a well-hidden treasure.
The way the OS forms names in DOS can lead us to misinterpret the digit that follows several otherwise equal names, so we are going to introduce a new concept which will help us to understand how these names in the namespace DOS get here.
Forming Names in DOS
The numbers 1, 2, 3 and 4 shown in the DOS descriptions of the files can mislead us, as they look like the final number in the namespace Win32 description.
But it only looks that way!
Actually, in the namespace Win32 description they are part of the file name. But in the namespace DOS description they aren’t. To understand this we must know how the OS shortens the file name described in the namespace Win32 in order to describe it in the namespace DOS.
When a name already fits the DOS format (no more than 8 characters, an optional extension of up to 3, and no spaces, like “Empresa” or “TT”), the OS simply keeps it and a single record serves both namespaces. Otherwise the OS removes the spaces, takes the first 6 characters, puts them in uppercase and appends the symbol ~ followed by an order number, which distinguishes shortened names that would otherwise be equal. That is why “Fotografias Digitalizadas” became FOTOGR~1 even though no other name in its directory begins the same way. Therefore, our directories could be named “Diversos…anything”, and their shortened namespace DOS name would always be composed of the first 6 characters of their name in uppercase (DIVERS) followed by the symbol ~ and by the number representing the order in which the OS found it (among those which are equal).
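A simplified sketch that reproduces the DOS names seen in this walk-through is shown below. It is not the real Windows algorithm: the function name is ours, extensions and characters that are invalid in DOS names are ignored, and the different scheme Windows switches to after several collisions is not modeled.

```python
def short_name(long_name: str, taken: set) -> str:
    """Simplified DOS-namespace name for a directory or extension-less file."""
    compact = long_name.replace(" ", "").upper()
    if len(compact) <= 8 and compact == long_name.upper():
        # Already fits: one record serves the namespaces Win32 and DOS.
        return compact
    order = 1
    while f"{compact[:6]}~{order}" in taken:  # find the first free order number
        order += 1
    candidate = f"{compact[:6]}~{order}"
    taken.add(candidate)
    return candidate


taken = set()
for n in range(1, 5):
    print(short_name(f"Diversos {n}", taken))  # DIVERS~1 ... DIVERS~4
print(short_name("Empresa", set()))            # EMPRESA
print(short_name("Todas as Imagens", set()))   # TODASA~1
```

Note that the digit after the ~ depends only on the order in which the names were created, so “Diversos 4” would get DIVERS~1 if it were the first of the four to exist.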
Now let’s travel to the offset 1291F800h looking for the MFT entry for “Diversos 1”.
“Diversos 1” – MFT Entry – Offset 1291F800h
We are at the MFT entry for “Diversos 1”, whose hexadecimal editor representation is shown in figure 47, where:
At the beginning we have the usual header, with the usual information about the file, followed by the $STANDARD_INFORMATION attribute, a $FILE_NAME attribute giving us the name “DIVERS~1” in the namespace DOS, a $FILE_NAME attribute giving us the name “Diversos 1” in the namespace Win32, an $INDEX_ROOT attribute sending us to an external index allocation, an $INDEX_ALLOCATION attribute giving us the location of this node (directory), a $BITMAP attribute and finally the attributes and file Terminator FFFFFFFF.
When reading the $INDEX_ALLOCATION attribute, the index allocation is given by the only data run
41 01 B3 9E 0A 01 00 00
which tells us:
- 1 byte for its size in clusters, which is 01h and
- 4 bytes defining its offset in clusters, which is 010A9EB3h = 17,473,203 clusters or the offset 17,473,203 x 8 x 512 = 71,570,239,488 = 10A9EB3000h.
And that’s the offset we are going to, in order to know what “Diversos 1” indexes and its entries in the MFT.
“Diversos 1” – Directory Index – Offset 10A9EB3000h
From now on we are going to reduce the Index hexadecimal editor representation to its header, to the description of the directory we are interested in and to its final part as we can see in figure 48.
In the description of the hexadecimal editor representation we can see that:
- The header has the INDX signature.
- The description of “Diversos” in the namespace Win32 and DOS.
- The description of “Diversos 1999” in the namespace Win32.
- A gap standing in for the remaining Index descriptions in the necessary namespaces.
- The termination of the Index file.
A full list of the nodes pointed to by this one (its children) is shown through its Windows Explorer graphical representation, as in figure 49.
Let’s look into the description of the directory “Diversos 1999”, the last one in our path (it’s about time), where we can verify that its MFT entry is the 528Bh or 21,131. Doing the same calculations, (21,131-16) x 1,024 = 21,621,760 = 149EC00h, which added to the offset 11481000h gives the offset 1291FC00h as the one where our directory description starts in the MFT.
And this offset 1291FC00h is where we are going right now, in order to look at the definitions in the MFT entry for “Diversos 1999”.
“Diversos 1999” – MFT Entry – Offset 1291FC00h
One more MFT entry, whose hexadecimal editor representation we can see in figure 50, which we analyze as the previous ones.
- Starting by the header and going through
- the $STANDARD_INFORMATION attribute,
- the $FILE_NAME attribute giving us the name “DIVERS~1” in the namespace DOS,
- the $FILE_NAME attribute giving us the name “Diversos 1999” in the namespace Win32,
- the $INDEX_ROOT attribute telling us that the Index needs external allocation,
- we get to the $INDEX_ALLOCATION attribute which gives us the location of this directory Index.
- We still have the $BITMAP attribute and
- the Terminator FFFFFFFF.
Reading the $INDEX_ALLOCATION attribute, we can see through its only data run
41 01 AE 67 0A 01 00 FF
the following about the “Diversos 1999” Index data:
- 1 byte for its size in clusters, which is 01h and
- 4 bytes defining the offset in clusters, which is 010A67AEh=17,459,118 clusters, meaning that the offset for its beginning will be 17,459,118 x 512 x 8 = 71,512,547,328 Bytes after the Volume beginning or at the offset 10A67AE000h.
It’s to this offset 10A67AE000h that we are going now, looking for what Diversos 1999 indexes and respective MFT entries.
“Diversos 1999” – Directory Index – Offset 10A67AE000h
In the hexadecimal editor representation of figure 51 we can see:
- The header with the INDX signature,
- the 1st record, corresponding to the file Picture10.jpg in the namespace Win32.
- the record corresponding to the file Picture4.jpg described in the namespace Win32+DOS.
- the last record, corresponding to the namespace DOS description of the file PICTUR~1.MIX, the same .mix file described in the namespace Win32 in the previous record (not visible in this frame), and
- the INDEX Terminator.
All the files referred by this Index can be seen in figure 12-39.
Let’s analyze the description of our file in this INDX.
Besides the usual information about date/times and its own Index entry, we can see that:
- the allocated size for the file is 54F000h = 5,566,464 Bytes or 1,359 clusters (always a whole number of clusters),
- the real size of the file is 54E8DFh = 5,564,639 Bytes,
- its MFT entry, where its location in the Volume is registered, is the 528Eh = 21,134.
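The relation between the two sizes can be checked with a short sketch. The function name is ours, and we assume a plain file (not compressed and not sparse) on a volume with clusters of 8 sectors of 512 Bytes, as here: the allocated size is the real size rounded up to a whole number of clusters.

```python
CLUSTER_SIZE = 8 * 512  # 4,096 Bytes per cluster on this volume


def allocated_size(real_size: int, cluster_size: int = CLUSTER_SIZE) -> int:
    """Round the real size up to a whole number of clusters."""
    clusters = -(-real_size // cluster_size)  # ceiling division
    return clusters * cluster_size


real = 0x54E8DF
print(allocated_size(real) // CLUSTER_SIZE)  # 1359
print(hex(allocated_size(real)))             # 0x54f000
```

The difference between the two values, here 1,825 Bytes, is the unused space at the end of the last cluster.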
Using the usual calculations, (21,134-16) x 1,024 = 21,624,832 = 149F800h, which added to 11481000h gives the offset where that MFT entry begins: 12920800h. And that’s where we are going, looking for our file’s composition and location.
“Picture4.jpg” – MFT Entry – Offset 12920800h
Since now we are dealing with a real file, the one we are looking for, we are going to analyze in detail its hexadecimal editor representation in figure 12-52.
The header tells us:
- that this entry has the FILE signature,
- that the offset for the update sequence is 30h,
- that the update sequence is composed of 3 words,
- the $LogFile sequence number,
- the sequence number, 01 00,
- the hard links count, which is 1,
- the offset for the 1st attribute, which is 38h,
- the real size for this record, 160h = 352 Bytes and
- the allocated size for it, 400h = 1,024 Bytes.
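The reading of this header can be sketched with Python’s struct module. The header below is built by hand from the values just listed, not copied from figure 12-52: the function name is ours, the $LogFile sequence number is a placeholder (0), the flags word (1, entry in use) is an assumption, and we assume the usual positions of the fields inside the record header.

```python
import struct

# signature, update sequence offset and size, $LogFile sequence number,
# sequence number, hard links, 1st attribute offset, flags, real and
# allocated sizes of the record.
HEADER_FORMAT = "<4sHHQHHHHII"


def parse_record_header(record: bytes) -> dict:
    """Decode the first 32 bytes of an MFT entry."""
    (signature, seq_offset, seq_words, lsn, sequence, hard_links,
     first_attr, flags, real_size, alloc_size) = struct.unpack_from(HEADER_FORMAT, record, 0)
    if signature != b"FILE":
        raise ValueError("not an MFT entry")
    return {
        "update_sequence_offset": seq_offset,
        "update_sequence_words": seq_words,
        "logfile_sequence_number": lsn,
        "sequence_number": sequence,
        "hard_links": hard_links,
        "first_attribute_offset": first_attr,
        "in_use": bool(flags & 0x01),
        "real_size": real_size,
        "allocated_size": alloc_size,
    }


# Hand-built header using the values read in the text (LSN is a placeholder).
header = struct.pack(HEADER_FORMAT, b"FILE", 0x30, 3, 0, 1, 1, 0x38, 0x0001, 0x160, 0x400)
for field, value in parse_record_header(header).items():
    print(field, value)
```

The offset of the 1st attribute (38h) is where the reading of the attributes that follow would start.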
The 1st attribute is a $STANDARD_INFORMATION,
- with the type 10h,
- the size of 60h,
- is resident,
- unnamed,
- this attribute data has 48h of length and tells us that this file:
- was created on 25/04/2010 at 23:06:08 (the creation date is the one when it was created in this volume),
- was modified on 11/11/2002 at 13:14:46 (this is the date/time of the last modification),
- had its MFT entry modified on 25/04/2010 at 23:06:08 (the MFT modification date/time refers to the last change that forced a modification of its MFT entry),
- was last accessed on 25/04/2010 at 23:06:08 (the last access date/time refers to the last time it was accessed without being modified),
- has the DOS permission Archive.
- the offset to them is 18h.
The 2nd attribute is a $FILE_NAME,
- with the type 30h,
- states that this file’s Parent directory is in the MFT entry 528Bh (actually the entry for “Diversos 1999” as we can see above),
- all the date/times are as before,
- its allocated size is 5,566,464 Bytes or 1,359 clusters, as we had seen before and
- it has the name “Picture4.jpg”, described in the namespace Win32 and DOS.
The 3rd attribute is a $DATA,
- with the type 80h,
- unnamed,
- with initial VCN 0 and
- final VCN 54Eh = 1,358,
- with another reference to the allocated size of 54F000h = 5,566,464 (1,359 clusters),
- to the real size of 54E8DFh = 5,564,639 Bytes and
- the same value for the initial data size.
The data run (only one) is
42 4F 05 5F 62 0A 01 00 00
- 2 Bytes defining its size in clusters, those which come after the byte 42, which are 054Fh = 1,359
- 4 bytes defining its offset in clusters, those which come after the 2 bytes 4F 05, which are 010A625Fh = 17,457,759 clusters, or a global offset of 17,457,759 x 8 x 512 = 71,506,980,864 = 10A625F000h.
This information tells us that the file we are looking for is not fragmented and uses 1,359 clusters starting at the offset 10A625F000h.
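With these values, recovering the file becomes a single read. The sketch below shows the idea on a tiny in-memory stand-in for the volume: the function name and the test data are ours, and on the real volume the bytes of the data run 5F 62 0A 01 give the start cluster 010A625Fh, with 054Fh = 1,359 clusters and a real size of 54E8DFh Bytes.

```python
import io

CLUSTER_SIZE = 8 * 512  # 4,096 Bytes per cluster on this volume


def read_contiguous_file(volume, start_cluster: int, cluster_count: int,
                         real_size: int) -> bytes:
    """Read a non-fragmented file and drop the unused tail of its last cluster."""
    volume.seek(start_cluster * CLUSTER_SIZE)
    data = volume.read(cluster_count * CLUSTER_SIZE)
    return data[:real_size]


# Stand-in volume: 3 empty clusters, then a 2-cluster "file" of 8 real Bytes.
fake_volume = io.BytesIO(bytes(3 * CLUSTER_SIZE)
                         + b"JPEGDATA".ljust(2 * CLUSTER_SIZE, b"\x00"))
print(read_contiguous_file(fake_volume, 3, 2, 8))  # b'JPEGDATA'

# The values of our file, checked against each other:
start, count = 0x010A625F, 0x054F
print(f"{start * CLUSTER_SIZE:X}h")  # 10A625F000h
print(f"{start + count:X}h")         # 10A67AEh
```

As a sanity check, the cluster that follows the file, 010A67AEh, is exactly the one where we found the “Diversos 1999” Directory Index. On a real system the volume object would be the raw volume or an image of it opened in binary mode, with offsets counted from the Volume beginning.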
Now it’s easy to understand that, once the file’s entry in the MFT is found, we are provided with all the information about its location in the Volume, fragmented or not. This won’t be true only in the very exceptional case where the data run chain doesn’t fit in the space provided by the file’s MFT entry (a very large and heavily fragmented file).
We won’t look at the hexadecimal editor representation of the file itself, as what we would find there is the file’s encoded content, which is not readable by us.