There is an update to this post. It can be found after the ‘Conclusion’ section.
I was recently tasked with examining a two-year old Android-based phone which required an in-depth look at Snapchat. One of the things that I found most striking (and frustrating) during this examination was the lack of a modern, in-depth analysis of the Android version of the application beyond the tcspahn.db file, which, by the way, doesn’t exist anymore, and the /cache folder, which isn’t really used anymore (as far as I can tell). I found a bunch of things that discussed decoding encrypted media files, but this information was years old (Snapchat 5.x). I own the second edition of Learning Android Forensics by Skulkin, Tyndall, and Tamma, and while this book is great, I couldn’t find where they listed the version of Snapchat they examined or the version of Android they were using; what I found during my research for this post did not really match what was written in their book. A lot of things have changed.
Googling didn’t seem to help either; I just kept unearthing the older research. The closest I got was a great blog post by John Walther that examined Snapchat 10.4.0.54 on Android Marshmallow. Some of John’s post lined up with what I was seeing, while other parts did not.
WHAT’S THE BIG DEAL?
Snapchat averages 190 million users daily, which is just under half of the U.S. population, and those 190 million people send three billion snaps (pictures/videos) daily. Personally, I have the app installed on my phone, but it rarely sees any usage. Most of the time I use it on my kid, who likes the filters that alter his voice or requires that he stick out his tongue. He is particularly fond of the recent hot dog filter.
One of the appealing things about Snapchat is that direct messages (DMs) and snaps disappear after a they’re opened. While the app can certainly be used to send silly, ephemeral pictures or videos, some people find a way to twist the app for their own nefarious purposes.
There has been plenty written in the past about how some traces of activity are actually recoverable, but, again, nothing recent. I was surprised to find that there was actually more activity-related data left behind than I thought.
Before we get started just a few things to note (as usual). First, my test data was generated using a Pixel 3 running Android 9.0 (Pie) with a patch level of February 2019. Second, the version of Snapchat I tested is 10.57.0.0, which was the most current version as of 05/22/2019. Third, while the phone was not rooted, it did have TWRP, version 3.3.0-0, installed. Extracting the data was straight forward as I had the Android SDK Platform tools installed on my laptop. I booted into TWRP and then ran the following from the command line:
adb pull /data/data/com.snapchat.android
That’s it. The pull command dropped the entire folder in the same path as where the platform tools resided.
As part of this testing, I extracted the com.snapchat.android folder five different times over a period of 8 days as I wanted to see what stuck around versus what did not. I believe it is also important to understand the volatility of the data that is provided in this app. I think understanding the volatility will help investigators in the field and examiners understand exactly how much time, if any, they have before the data they are seeking is no longer available.
I will add that I tested two tools to see what they could extract: Axiom (version 3.0) and Cellebrite (UFED 4PC 7.18 and Physical Analyzer 7.19). Both tools failed to extract (parsing not included) any Snapchat data. I am not sure if this is a symptom of these tools (I hope not) or my phone. Regardless, both tools extracted nothing.
TWO SNAPS AND…SOME CHANGE
So, what’s changed? Quite a bit as far as I can tell. The storage location of where some of the data that we typically seek has changed. There are enough changes that I will not cover every single file/folder in Snapchat. I will just focus on those things that I think may be important for examiners and/or investigators.
One thing has not changed: the timestamp format. Unless otherwise noted, all timestamps discussed are in Unix Epoch.
The first thing I noticed is that the root level has some new additions (along with some familiar faces). The folders that appear to be new are “app_textures”, “lib”, and “no_backup.” See Figure 1.
The first folder that may be of interest is one that has been of interest to forensicators and investigators since the beginning: “databases.” The first database of interest is “main.db.” This database replaces tcspahn.db as it now contains a majority of user data (again, tcspahn.db does not exist anymore). There is quite a bit in here, but I will highlight a few tables. The first table is “Feed.” See Figure 2.
This table contains the last action taken in the app. Specifically, the parties involved in that action (seen in Figure 2), what the action was, and when the action was taken (Figure 3). In Figure 4 you can even see which party did what. The column “lastReadTimestamp” is the absolute last action, and the column “lastReader” show who did that action. In this instance, I had sent a chat message from Fake Account 1 (“thisisdfir”) to Fake Account 2 (“hickdawg957”) and had taken a screenshot of the conversation using Fake Account 1. Fake Account 2 then opened the message.
Note that the timestamps are for when I originally added the friend/the friend added me. The timestamps here translate back to dates in November of 2018, which is when I originally created the accounts during the creation of my Android Nougat image.
One additional note here. Since everyone is friends with the “Team Snapchat” account, the value for that entry in the “addedTimestamp” column is a good indicator of when the account you’re examining was created.
The next table is a biggie: Messages. I will say that I had some difficulty actually capturing data in this table. The first two attempts involved sending a few messages back and forth, letting the phone sit for a 10 or so minutes, and then extracting the data. In each of those instances, absolutely NO data was left behind in this table.
In order to actually capture the data, I had to leave the phone plugged in to the laptop, send some messages, screenshot the conversation quickly, and then boot into TWRP, which all happened in under two minutes time. If Snapchat is deleting the messages from this table that quickly, they will be extremely hard to capture in the future.
Figure 8 is a screenshot of my conversation (all occurred on 05/30/2019) taken with Fake Account 1 (on the test phone) and Figure 9 shows the table entries. The messages on 05/30/2019 start on Row 6.
The columns “timestamp” and “seenTimestamp” are self-explanatory. The column “senderId” is the “id” column from the Friends table. Fake Account 1 (thisisdfir) is senderId 2 and Fake Account 2 (hickdawg957) is senderId 1. The column “feedRowId” tells you who the conversation participants are (beyond the sender). The values link back to the “id” column in the Feed table previously discussed. In this instance, the participants in the conversation are hickdawg957 and thisisdifr.
In case you missed it, Figure 8 actually has two saved messages between these two accounts from December of 2018. Information about those saved messages appear in Rows 1 and 2 in the table. Again, these are relics from previous activity and were not generated during this testing. This is an interesting find as I had completely wiped and reinstalled Android multiple times on this device since the those messages were sent, which leads me to speculate these messages may be saved server-side.
In Figure 10, the “type” column is seen. This column shows the type of message was transmitted. There are three “snap” entries here, but, based on the timestamps, these are not snaps that I sent or received during this testing.
In Figure 12, I bring up one of the messages that I recently sent.
The next table is “Snaps.” This table is volatile, to say the least. The first data extraction I performed was on 05/22/2019 around 19:00. However, I took multiple pictures and sent multiple snaps on 05/21/2019 around lunch time and the following morning on 05/22/2019. Overall, I sent eight snaps (pictures only) during this time. Figure 13. Shows what I captured during my first data extraction.
On 05/25/2019 I conducted another data extraction after having received a snap and sending two snaps. Figure 14 shows the results.
Figure 15 shows the next table, “SendToLastSnapRecipients.” This table shows the user ID of the person I last sent a snap to in the “key” column, and the time at which I sent said snap.
During the entire testing period I took a total of 13 pictures. Of those 13, I saved 10 of them to “Memories.” Memories is Snapchat’s internal gallery, separate from the phone’s Photos app. After taking a picture and creating an overlay (if desired), you can choose to save the picture, which places it in Memories. If you were to decide to save the picture to your Photos app, Snapchat will allow you to export a copy of the picture (or video).
And here is a plus for examiners/investigators: items placed in Memories are stored server-side. I tested this by signing into Fake Account 1 from an iOS device, and guess what…all of the items I placed in Memories on the Pixel 3 appeared on the iOS device.
Memories can be accessed by swiping up from the bottom of the screen. Figure 16 shows the Snapchat screen after having taken a photo but before snapping (sending) it. Pressing the area in the blue box (bottom left) saves the photo (or video) to Memories. The area in the red box (upper right) are the overlay tools.
Figure 17 shows the pictures I have in my Memories. Notice that there are only 9 pictures (not 10). More on that in a moment.
The database memories.db stores relevant information about files that have been saved to Memories. The first table of interest is “memories_entry.” This table contains an “id,” the “snap_id,” and the date the snap was created. There are two columns regarding the time: “created_time” and “latest_created_time.” In Figure 18 there is a few seconds difference between the values in some cells in the two columns, but there are also a few that are the same value. In the cells where there are differences, the differences are negligible.
There is also a column titled “is_private” (seen in Figure 19). This column refers to the My Eyes Only (MEO) feature, which I will discuss shortly. For now, just know that the value of 1 indicates “yes.”
(FOR) MY EYES ONLY
I have been seeing a lot of listserv inquires as of late regarding MEO. Cellebrite recently added support for MEO file recovery in Android as of Physical Analyzer 7.19 (iOS to follow), and, after digging around in the memories database, I can see why this would be an issue.
MEO allows a user to protect pictures or videos with a passcode; this passcode is separate from the user’s password for their Snapchat account. A user can opt to use a 4-digit passcode, or a custom alphanumeric passcode. Once a user indicates they want to place a media file in MEO, that file is moved out of the Memories area into MEO (it isn’t copied to MEO).
MEO is basically a private part of Memories. So, just like everything else in Memories, MEO items are also stored server-side. I confirmed this when I signed in to Fake Account 1 from the iOS device; the picture I saved to MEO on the Pixel 3 appeared in MEO on the iOS device. The passcode was the same, too. Snapchat says if a user forgets the passcode to MEO, they cannot help recover it. I’m not sure how true that is, but who knows.
If you recall, I placed 10 pictures in Memories, but Figure 17 only showed 9 pictures. That is because I moved one picture to MEO. Figure 20 shows my MEO gallery.
In the memories database, the table “memories_meo_confidential” contains entries about files that have been placed in MEO. See Figure 21.
This table contains a “user_id,” the hashed passcode, a “master_key,” and the initialization vector (“iv”). The “master_key” and “initialization vector” are both stored in base64. And, the passcode….well, it has been hashed using bcrypt (ugh). I will add that Cellebrite reports Physical Analyzer 7.19 does have support for accessing MEO files, and, while I did have access to 7.19, I was not able to tell if it was able to access my MEO file since it failed to extract any Snapchat data.
The “user_id” is interesting: “dummy.” I have no idea what that is referring to, and I could not find it anywhere else in the data I extracted.
The next table is “memories_media.” This table. Does have a few tidbits of interesting data: another “id,” the size of the file (“size”), and what type of file (“format”). Since all of my Memories are pictures, all of the cells show “image_jpeg.” See Figures 22 and 23.
The next table is “memories_snap.” This table has a lot of information in about my pictures, and brings together data from the other tables in this database. Figure 24 shows a column “media_id,” which corresponds to the “id” in the “memories_media” table discussed earlier. There is also a “creation_time” and “time_zone_id” column. See Figure 24.
Figure 25 shows the width and height of the pictures. Also note the column “duration.” The value is 3.0 for each picture. I would be willing to be that number could be higher or lower if the media were videos.
Figure 25 also shows the “memories_entry_id,” which corresponds to the “id” column in the “memories_entry” table. There is also a column for “has_location.” Each of the pictures I placed in Memories has location data associated with it (more on that in a moment).
Figure 26 is interesting as I have not been able to find the values in the “external_id” or “copy_from_snap_id” columns anywhere.
The data seen in Figure 27 could be very helpful in situations where an examiner/investigator thinks there may be multiple devices in play. The column “snap_create_user_agent” contains information on what version of Snapchat created the the snap, along with the Android version and, in my case, my phone model.
The column “snap_capture_time” is the time I originally took the picture and not the time I sent the snap.
Figure 28 shows information about the thumbnail associated with each entry.
Figure 29 is just like Figure 27 in its level of value. It contains latitude and longitude of the device when the picture was taken. I plotted each of these entries and I will say that the coordinates are accurate +/- 10 feet. I know the GPS capabilities of every device is different, so just be aware that your mileage may vary.
Figure 29 also has the column “overlay_size.” This is a good indication if a user has placed an overlay in the picture/video. Overlays are things that are placed in a photo/video after it has been captured. Figure 30 shows an example of an overlay (in the red box). The overlay here is caption text.
If the value in the overlay_size column is NULL that is a good indication that no overlay was created.
Figure 31 shows the “media_key” and “media_iv,” both of which are in base64. Figure 32 shows the “encrypted_media_key” and “encrypted_media_iv” values. As you can see there is only one entry that has values for these columns; that entry is the picture I placed in MEO.
The next table that may be of interest is “memories_remote_operation.” This shows all of the activity taken within Memories. In the “operation” column, you can see where I added the 10 pictures to Memories (ADD_SNAP_ENTRY_OPERATION). The 11th entry, “UPDATE_PRIVATE_ENTRY_OPERATION,” is where I moved a picture into MEO. See Figure 33.
The column “serialized_operation” stores information about the operation that was performed. The data appears to be stored in JSON format. The cell contains a lot of the same data that was seen in the “memories_snap” table. I won’t expand it here, but DB Browser for SQLite does a good job of presenting it.
Figure 34 shows a better view of the column plus the “created_timestamp” column. This is the time of when the operation in the entry was performed.
Figure 35 contains the “target_entry” column. The values in these columns refer to the “id”column in the “memories_entry” table.
To understand the next database, journal, I first have to explain some additional file structure of the com.snapchat.android folder. If you recall all the way back to Figure 1, there was a folder labeled “files.” Entering that folder reveals the folders seen in Figure 36. Figure 37 shows the contents of the “file_manager” folder.
The first folder of interest here is “media_package_thumb,” the contents of which can be seen in Figure 38.
Examining the first file here in hex finds a familiar header: 0xFF D8 FF E0…yoya. These things are actually JPEGs. So, I opened a command line in the folder, typed ren *.* *.jpg and BAM: pictures! See Figure 39.
Notice there are a few duplications here. However, there are some pictures here that were not saved to memories and were not saved anywhere else. As an example, see the picture in Figure 40.
The next folder of interest is the “memories_media” folder. See Figure 41.
There are 10 items here. These are also JPEGs. I performed the same operation here as I did in the “media_package_thumb” folder and got the results seen in Figure 42.
These are the photographs I placed in Memories, but the caption overlays are missing. The picture that is MEO is also here (the file staring with F5FC6BB…). Additionally, these are high resolution pictures.
You may be asking yourself “What happened to the caption overlays?” I’m glad you asked. They are stored in the “memories_overlay” folder. See Figure 43.
Just like the previous two folders, these are actually JPEGs. I performed the rename function, and got the results seen in Figure 44. Figure 45 shows the overlay previously seen in Figure 30.
The folder “memories_thumbnail” is the same as the others, except it contains just the files in Memories (with the overlays). For brevity’s sake, I will just say the methodology to get the pictures to render is the same as before. Just be aware that while I just have pictures in my Memories, a user could put videos in there, too, so you could have a mixture of media. If you do a mass-renaming, and a file does not render, the file extension is probably wrong, so adjust the file extension(s) accordingly.
Now that we have discussed those file folders, let’s get back to the journal database. This database keeps track of everything in the “file_manager” directory, including those things we just discussed. Figure 46 shows the top level of the database’s entries.
If I filter the “key” column using the term “package” from the “media_package_thumb” folder (the “media_package_thumb.0” files) I get the results seen in Figure 47.
The values in the “key” column are the file names for the 21 files seen in Figure 38. The values seen in the “last_update_time” column are the timestamps for when I took the pictures. This is a method by which examiners/investigators could potentially recover snaps that have been deleted.
WHAT ELSE IS THERE?
As it turns out, there are a few more, non-database artifacts left behind which are located in the “shared_prefs” folder seen in Figure 1. The contents can be seen in Figure 48.
The first file is identity_persistent_store.xml seen in Figure 49. The file contains the timestamp for when Snapchat was installed on the device (INSTALL_ON_DEVICE_TIMESTAMP), when the first logon occurred on the device (FIRST_LOGGED_IN_ON_DEVICE_TIMESTAMP), and the last user to logon to the device (LAST_LOGGED_IN_USERNAME).
Figure 50. shows the file LoginSignupStore.xml. it contains the username that is logged in.
The file user_session_shared_pref.xml has quite a bit of account data in it, and is seen in Figure 51. For starters, it contains the display name (key_display_name), the username (key_username), and the phone number associated with the account (key_phone).
The value “key_created_timestamp” is notable. This time stamp converts to November 29, 2018 at 15:13:34 (EST). Based on my notes from my Nougat image, this was around the time I established Fake Account 1, which was used in the creation of the Nougat image. This might be a good indicator of when the account was established, although, you could always get that data from serving Snapchat with legal process.
Rounding it out is the “key_user_id” (seen in the Friends table of the main database) and the email associated with the account (key_email).
Snapchat’s reputation proceeds it very well. I have been in a few situations where examiners/investigators automatically threw up their hands and gave up after having been told that potential evidence was generated/contained in Snapchat. They wouldn’t even try. I will say that while I always have (and will) try to examine anything regardless of what the general concensus is, I did share a bit of others’ skepticism about the ability to recover much data from Snapchat. However, this exercise has shown me that there is plenty of useful data left behind by Snapchat that can give a good look into its usage.
Alexis Brignoni over at Initialization Vectors noticed that I failed to address something in this post. First, thanks to him for reading and contacting me. 🙂 Second, he noticed that I did not address Cellebrite Physical Analyzer’s (v 7.19) and Axiom’s (v 3.0) ability to parse my test Snapchat data (I addressed the extraction portion only).
We both ran the test data against both tools and found both failed to parse any of the databases. Testing found that while Cellebrite found the pictures I describe in this post, it did not apply the correct MAC times to them (from the journal.db). Axiom failed to parse the databases and failed to identify any of the pictures.
This is not in any way shape or form a knock on or an attempt to single out these two tools; these are just the tools to which I happen to have access. These tools work, and I use them regularly. The vendors do a great job keeping up with the latest developments in both the apps and the operating systems. Sometimes, though, app developers will make a hard turn all of a sudden, and it does take time for the vendors to update their tools. Doing so requires R&D and quality control via testing, which can take a while depending on the complexity of the update.
However, this exercise does bring to light an important lesson in our discipline, one that bears repeating: test and know the limitations of your tools. Knowing the limitations allows you to know when you may be missing data/getting errant readings. Being able to compensate for any shortcomings and manually examine the data is a necessary skillset in our discipline.
Thank you Alexis for the catch and assist!