
How Internet Explorer stores web history

Tue, 15 Jun 2010 04:52PM

Category: Digital Forensics & Malware

Written by Sarah

Internet Explorer stores files downloaded from the internet (e.g. HTML pages, images, CSS files) in a cache called Temporary Internet Files. Each cached file is assigned an alphanumeric cache name. Some index.dat files map the cache name to the original filename and the URL it came from. Other index.dat files store the user’s cookies or web browser history (by default 20 days’ worth). index.dat files are in binary format, so they need to be viewed using a hex editor.

There are numerous index.dat files kept on Windows machines. Assuming the computer is running Windows XP, the locations of the main index.dat files are:

C:\Documents and Settings\<UserName>\Local Settings\History\History.IE5\index.dat

(Older history index.dat files can be found in C:\...\History.IE5\MSHist[18digits])

C:\Documents and Settings\<UserName>\Local Settings\Temporary Internet Files\Content.IE5\index.dat

For Windows Vista and Windows 7, the corresponding paths are:

C:\Users\<UserName>\AppData\Local\Microsoft\Windows\History\History.IE5\index.dat

C:\Users\<UserName>\AppData\Local\Microsoft\Windows\Temporary Internet Files\Content.IE5\index.dat

The index.dat files all have the same format, and consist of a header followed by a series of records. There are four types of records: HASH, REDR, URL and LEAK. HASH records are indexes to the other three record types, and can be ignored as they are only used internally by Internet Explorer. REDR, URL and LEAK are called activity records, since they each contain information about some sort of online browser activity.

There are a few differences between the various index.dat files. The one stored in the Temporary Internet Files folder (the "cache index.dat" file) relates web files to the copies cached on the computer, so it additionally stores the names of the cache folders in the file header, and a reference to the corresponding cache folder within each activity record. Other differences will be explained below.

INDEX.DAT FILE HEADERS

The headers contain a small amount of information about the file and, for a cache index.dat, an array of cache folder names.

screenshot

The image above shows the header of an example cache index.dat file. All index.dat files start with "Client UrlCache MMF" followed by the version number, which is shown in red. Next, in blue, is the size of the file. All numbers are stored little-endian. Following on, in yellow, is a pointer to the start of the first record. In this example the next part of the header names four subfolders where the cached files are located – shown in green. In non-cache index.dat files, these would be 0x00 (null) values.
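The header fields described above can be read with a few lines of Python. This is a minimal sketch, not a definitive parser: the article only gives the order of the fields, so the exact offsets used here (file size at 0x1C, first-record pointer at 0x20, folder count at 0x48, 12-byte folder entries from 0x4C) are assumptions taken from commonly published descriptions of the format, and the function name `parse_header` is made up for illustration.

```python
import struct

SIGNATURE = b"Client UrlCache MMF"

def parse_header(data: bytes) -> dict:
    """Read the fixed fields at the start of an index.dat file."""
    if not data.startswith(SIGNATURE):
        raise ValueError("missing 'Client UrlCache MMF' signature")
    # Signature and version form one null-terminated ASCII string.
    banner = data[:data.index(b"\x00")].decode("ascii")
    version = banner[len(SIGNATURE):].strip()
    # Assumed offsets: file size at 0x1C, first-record pointer at 0x20.
    # "<" means little-endian, "I" is an unsigned 32-bit integer.
    file_size, first_record = struct.unpack_from("<II", data, 0x1C)
    # Assumed layout: folder count at 0x48, then one 12-byte entry per
    # folder from 0x4C (4-byte file count, then an 8-character name).
    (folder_count,) = struct.unpack_from("<I", data, 0x48)
    folders = []
    for i in range(folder_count):
        name_start = 0x4C + 12 * i + 4
        name = data[name_start:name_start + 8]
        folders.append(name.decode("ascii", errors="replace"))
    return {
        "version": version,
        "file_size": file_size,
        "first_record": first_record,
        "cache_folders": folders,
    }
```

For a history index.dat the folder count would be expected to be zero, so `cache_folders` comes back empty.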

INDEX.DAT FILE CONTENTS

There are three types of activity records. These contain URL information and have the following common structure, illustrated by the image below:

  • TYPE: 4 bytes, either URL, LEAK or REDR. Shown in yellow.
  • LENGTH: 4 bytes, contains the length of the record in 128-byte (0x80) blocks.
  • DATA: variable length, the data we are interested in. Shown in grey. The end of every record is marked by a 0x00 byte, which can be seen in blue. The rest of the record is just filled with junk.

screenshot
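Because every record starts with a 4-byte TYPE and a 4-byte LENGTH counted in 0x80-byte blocks, the file can be walked one block at a time. The following is a rough sketch of that idea. It assumes that the URL tag is stored as "URL" padded with a trailing space to fill four bytes and that records begin on 0x80-byte boundaries; `iter_records` is an illustrative name, not part of any library.

```python
import struct

BLOCK = 0x80  # record lengths are counted in 128-byte blocks
TAGS = {b"URL ", b"LEAK", b"REDR", b"HASH"}  # assumed 4-byte tags

def iter_records(data: bytes, start: int = 0):
    """Yield (offset, type, raw bytes) for each activity record."""
    offset = start
    while offset + 8 <= len(data):
        tag = data[offset:offset + 4]
        if tag not in TAGS:
            offset += BLOCK  # unallocated or junk block, keep scanning
            continue
        (blocks,) = struct.unpack_from("<I", data, offset + 4)
        size = max(blocks, 1) * BLOCK  # guard against a zero length
        if tag != b"HASH":  # HASH records are only internal indexes
            yield offset, tag.decode("ascii").strip(), data[offset:offset + size]
        offset += size
```

A caller would typically pass the first-record pointer from the file header as `start` and then hand each yielded record to a type-specific parser.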

REDR ACTIVITY RECORDS

REDR records contain just a URL and indicate a redirect to a different location.

URL ACTIVITY RECORDS

These are the important records and an example can be seen in the image below. The information held in the DATA section is dependent on the type of index.dat file. They all start with the last modified time (in blue) followed by the last accessed time (in green). Time is stored in Windows FILETIME format (100-nanosecond intervals since 1st January, 1601 UTC).
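As a small illustration of the FILETIME format, the sketch below converts the two timestamps into Python datetimes. It assumes they sit directly after the 8 bytes of TYPE and LENGTH, i.e. at offsets 8 and 16 from the start of the record, and that they are stored little-endian like the other numbers; the function names are made up for this example.

```python
import struct
from datetime import datetime, timedelta, timezone

# FILETIME counts 100-nanosecond intervals from this moment.
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

def filetime_to_datetime(filetime: int) -> datetime:
    """Convert a 64-bit FILETIME value to a UTC datetime."""
    # Ten 100 ns intervals make one microsecond. A junk value far in
    # the future can raise OverflowError.
    return FILETIME_EPOCH + timedelta(microseconds=filetime // 10)

def url_record_times(record: bytes):
    """Return (last modified, last accessed) for a URL record."""
    # Assumed offsets: 8 and 16, right after TYPE and LENGTH.
    modified, accessed = struct.unpack_from("<QQ", record, 8)
    return filetime_to_datetime(modified), filetime_to_datetime(accessed)
```

The value 116444736000000000 is a convenient sanity check, since it corresponds to midnight on 1st January, 1970 UTC.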

If the index.dat is a cache file, like that of the image below, the structure follows that of Table 1. If the index.dat is a history file, the structure follows that of Table 2, and looks like the final image.

screenshot

screenshot

Location | Meaning
38 bytes in | Reference to the cache folder the file is located in. This is just one byte long and is an index into the array of cache folders given in the file header. Shown in dark grey (second to last image).
96 bytes in | The URL the file came from (shown in purple). This is followed by the name of the corresponding cached file stored on disk (orange) and finally the HTTP headers (dark blue). Each part starts on a new 16 byte boundary. The Windows username is attached to the end of the HTTP headers.

Table 1 - The DATA structure of a URL activity record in a cache index.dat file

Location | Meaning
96 bytes in | A URL starting with "Visited: <user>@". This is a URL the user with the login name <user> has visited using their Internet Explorer browser (shown in purple in the last image).

Table 2 - The DATA structure of a URL activity record in a History.IE5 index.dat file
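To show how a history entry might be pulled apart, here is a minimal sketch that reads a 0x00-terminated string from a record and splits a "Visited: <user>@" entry into the login name and the URL. The offset is left as a parameter for the caller to supply, the helper names are hypothetical, and decoding as Latin-1 is an assumption chosen so that unexpected bytes never raise an error.

```python
def read_cstring(record: bytes, offset: int) -> str:
    """Read the 0x00-terminated string that starts at `offset`."""
    end = record.find(b"\x00", offset)
    if end == -1:
        end = len(record)  # no terminator found, read to the end
    return record[offset:end].decode("latin-1")

def split_visited(entry: str):
    """Split 'Visited: <user>@<url>' into (user, url), else None."""
    prefix = "Visited: "
    if not entry.startswith(prefix):
        return None
    # Split on the first "@" only; the URL may contain more of them.
    user, sep, url = entry[len(prefix):].partition("@")
    return (user, url) if sep else None
```

For example, an entry such as "Visited: alice@http://www.example.com/" (a made-up user and URL) would split into the login name "alice" and the URL that follows the "@".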

LEAK ACTIVITY RECORDS

LEAK activity records have the same structure as URL activity records. LEAK is essentially Microsoft's term for an error.
