
Facebook Chat Forensics

Sun, 03 Oct 2010 05:54PM

Category: Digital Forensics & Malware

Written by Sarah | With 3 comments

Many parts of Facebook, such as chat, messaging and posting statuses, are written in JavaScript/AJAX. This requires a lot of calls to the server to keep the information up to date. To speed things up, Facebook stores some of the AJAX data in temporary files on the person's computer. These files can contain valuable forensic data. In particular, Facebook stores some chat messages in individual files. I say some because caching may not be invoked if the person does not move away from the Facebook homepage at all. However, any movement to other Facebook pages should cause the caching.

How to extract Internet Explorer Facebook chat messages using EnCase can be found here, and JADsoftware has a tool that finds all browser chat messages (the free version is limited to 10 messages). Following the EnCase guide, I managed to make my own script that finds all Facebook chat messages sent and received via Internet Explorer. I'm not sure how Firefox caching works and cannot find any references to Facebook chat yet, but if I do I shall update the script.

Basically, the script looks inside every folder in the temporary internet files folder for files that are named something like p_[numbers][characters/numbers].txt or .htm, such as p_61003300=10[1].txt or p_61003300=0CAX1XPU0.txt. Each file that matches is then checked to see if it is in the correct Facebook chat format. If so, the useful parts are extracted and added to a CSV file. The file is also copied over to a results folder.
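
As a quick illustration of the file name check described above, here is a minimal sketch. It uses a simplified pattern that should be equivalent to the script's facebook_reg, and it is written in Python 3 syntax, unlike the Python 2 script. The last two file names are hypothetical and only there to show non-matches.

```python
import re

# Simplified form of the script's file-name pattern:
# "p_", one or more digits, any non-whitespace run, then .txt or .htm
facebook_reg = re.compile(r'\Ap_\d+\S*\.(txt|htm)\Z')

def looks_like_chat_file(name):
    """Return True if a cache file name fits the p_<digits>... pattern."""
    return facebook_reg.match(name) is not None

candidates = [
    "p_61003300=10[1].txt",      # example name from the text above
    "p_61003300=0CAX1XPU0.txt",  # example name from the text above
    "index[1].htm",              # hypothetical unrelated cache file
    "p_abc.txt",                 # hypothetical: no digits after "p_"
]

# Keep only the names that pass the file-name check
matches = [name for name in candidates if looks_like_chat_file(name)]
print(matches)
```

A matching name is only a candidate. The contents still have to be checked against the chat message format before anything is extracted.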

To use, just change the temp_internet_files_loc line near the top of the script to point to the temporary internet files folder you want to investigate. You can find out where yours is by Googling 'location of temporary internet files folder in <operating system>'.
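
If it helps, the sketch below builds the commonly documented default locations for a given user name. The helper name candidate_folders and the two templates are assumptions for illustration, not part of the script. On a mounted image the drive letter and user name will differ, so treat the output as a starting point only. It is written in Python 3 syntax.

```python
# Commonly documented default locations of the Internet Explorer cache.
# These are assumptions to verify on the system under investigation.
DEFAULT_TEMPLATES = {
    "Windows XP": r"C:\Documents and Settings\{user}\Local Settings\Temporary Internet Files",
    "Windows Vista/7": r"C:\Users\{user}\AppData\Local\Microsoft\Windows\Temporary Internet Files",
}

def candidate_folders(user):
    """Return a dict of OS name -> likely cache folder for the given user."""
    return {os_name: template.format(user=user)
            for os_name, template in DEFAULT_TEMPLATES.items()}

for os_name, folder in candidate_folders("Sarah").items():
    print("%s: %s" % (os_name, folder))
```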

Download the script here.

import os
import re
import shutil
import csv
from datetime import datetime

# change this to where the temporary internet files are kept
temp_internet_files_loc = r"c:\Users\Sarah\AppData\Local\Microsoft\Windows\Temporary Internet Files"

# regular expressions to match Facebook chat files
facebook_reg = re.compile(r'\Ap\_(\d)+(\S)*\.(txt|htm)\Z')
message_ref = re.compile(r'\Afor \(\;\;\)\;{"t":"msg","c":"\S+","s":\d+,

total_found = 0
results = None

def check_format(file):
   """
   Opens the file given and checks it matches the Facebook chat
   regular expression. If so, returns the Facebook chat fields.
   """
   with open(file, 'r') as f:
      line = f.read()
      result = message_ref.match(line)
      if result is not None:
         # assumes message_ref captures these ten fields, in this order
         message, time, clientTime, msgID, from_, to, from_name, \
             from_first_name, to_name, to_first_name = result.groups()
         return message, time, clientTime, from_, to, from_name, to_name
      # not a Facebook chat file
      return None

if __name__ == "__main__":
   # get the full path of the folder results will be stored in
   dest = raw_input("\nPath of folder to write results to: ")
   if os.path.exists(dest):
      # open a CSV file and write Facebook chat headers
      results = csv.writer(open(os.path.join(dest, 'facebook_chat_messages.csv'),
          'wb'), delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
      results.writerow(['UTC Time', 'UTC Formatted Time', 'Users Time',
          'Users Formatted Time', 'From FB ID', 'From Name', 'To FB ID',
          'To Name', 'Message Sent'])

      # walk through the temporary internet files folder and look at each file
      for root, dirs, files in os.walk(temp_internet_files_loc):
         for file in files:
            # if it matches the facebook file name format:
            if facebook_reg.match(file):
               src = os.path.join(root, file)
               msg_tuple = check_format(src) # check the file is a facebook chat file
               if msg_tuple is not None:
                  # format date/times correctly (values are in milliseconds)
                  client_time = datetime.fromtimestamp(float(msg_tuple[2])/1000.00)\
                                .strftime("%d/%m/%Y %H:%M:%S")
                  time = datetime.fromtimestamp(float(msg_tuple[1])/1000.00)\
                         .strftime("%d/%m/%Y %H:%M:%S")
                  # write a row to the CSV file:
                  # msg_tuple[1] is the server time, msg_tuple[2] the client time
                  results.writerow([float(msg_tuple[1])/1000.00, time,
                                    float(msg_tuple[2])/1000.00, client_time,
                                    msg_tuple[3], msg_tuple[5], msg_tuple[4],
                                    msg_tuple[6], msg_tuple[0]])
                  dst = os.path.join(dest, file)
                  # copy the original Facebook file to results folder
                  shutil.copy(src, dst)
                  total_found = total_found + 1
      print "\nTotal facebook chat messages found: %s\n" % total_found
      print r"Messages found and a summary CSV file added to: %s" % dest
   else:
      print "\nInvalid folder! Exiting."
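
Since the cached payload appears to be JSON behind a for (;;); prefix, one alternative to a single long regular expression is to strip the prefix and decode the rest with a JSON parser. The sketch below is not what the script does. It is a minimal illustration in Python 3 syntax, and the helper name parse_chat_payload and the sample payload are hypothetical. Only the prefix and the "t", "c" and "s" keys come from the start of the script's pattern.

```python
import json

PREFIX = "for (;;);"  # prefix seen at the start of the script's message pattern

def parse_chat_payload(text):
    """Return the decoded JSON object if text looks like a chat message, else None."""
    if not text.startswith(PREFIX):
        return None
    try:
        payload = json.loads(text[len(PREFIX):])
    except ValueError:
        # the rest of the file is not valid JSON
        return None
    # only accept objects whose type field marks them as a message
    if not isinstance(payload, dict) or payload.get("t") != "msg":
        return None
    return payload

# Hypothetical sample: the values are made up, only the key names are from the script
sample = 'for (;;);{"t":"msg","c":"p_61003300","s":10}'
payload = parse_chat_payload(sample)
print(payload)
```

Decoding the payload this way should tolerate reordered or extra fields better than a fixed pattern, although the code that reads individual fields would still need to follow whatever structure Facebook uses.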


Nice script and thx for sharing. 

But this is JSON we are talking about. What happens when the format changes? (I have personally seen about 10 different ways of storing the format.) I'm not very strong in Python, but from what I can see, the script only handles one of the many different formats. Am I right?

Rasmus Riis
Tue, 19 Apr 2011 12:50PM
As far as I know this is how FB stores the info. The script relies on FB keeping the format the same, but if it changes then the regular expression can be updated.
Sat, 07 May 2011 06:57PM
How difficult would this be, to modify for Firefox/Chrome? Or is there anything similar for Chrome/Firefox/Safari?
Fri, 25 May 2012 06:31PM
