New msgstore – Who ‘Dis? A Look At An Updated WhatsApp On Android

6j6jb8

I decided to take a month off from extracurricular DFIR activities after the Magnet User Summit. It was nice not having to carry more than one phone around, to put down the notepad & pen, and to just shutdown at the end of the day. The good (and bad) thing about our discipline is that one can take a break whenever it’s needed and there will be plenty to do when the break is over. So, don’t ever feel bad about taking some time away and letting your brain recharge.

However, just because I was taking a break does not mean I was not paying attention. I started to notice a slight uptick in questions about WhatsApp, and a few folks on the Discord server noticed that some tools were no longer parsing messages or linking message attachments on more recent versions of the app on Android. Tool failure happens from time to time. App developers will change something, and an examiner’s tools no longer function as they had previously. When this happens we are left on our own to figure out what has changed and to come up with new ways to get the data we are after. That is why it is important to understand the underlying data so that we not only can compensate for any changes in the data, but also know when a tool gives us an errant reading.

As I tend to do when I see complaints about tools failing to do something with a previously supported app, I decided to take a look at WhatsApp on Android to see what changed. If you are interested in recent versions of WhatsApp on iOS, you will need to wait until August when Stacey Randolph and Chris Vance give their presentation at the 2022 SANS DFIR Summit.

You May Dispense With The Pleasantries

WhatsApp is ubiquitous, so I will skip the background typically seen in my posts and, as they say in Monty Python, get on with it.

For testing, I used my new Pixel 5a, which was running Android 12 with the March 2022 security update. WhatsApp version 2.22.11.82 was used and was the latest version as of the time of this post.  In order to get to this data an examiner’s extraction tool will need to have the ability to access the USERDATA/data/ directory path.

The biggest change, and arguably the most important, is how WhatsApp is storing message content. Like previous versions of WhatsApp, message content is stored in the database msgstore.db, which is found in USERDATA/data/com.whatsapp/databases/, but that is where the similiarities stop.  See Figures 1 & 2.

Figure 1
Figure 1.  msgstore.db from the public Android 12 image (WhatsApp version 2.21.20.20).

Figure 2
Figure 2.  msgstore.db from the Pixel 5a.

As can be seen, the table name that holds message content (red boxes) has changed along with column name (blue boxes). Previous versions of WhatsApp stored message content in the table messages, but this new version is storing it in the table message (singular). The interesting thing is that the version of msgstore.db from the public Android 12 image does have a table called message, but it contains no message content. The column name has changed, too, from data to text_data.

The messages table in the older version of msgstore.db could contain a lot of data depending on how WhatsApp was used. In addition to message content and associated timestamps, it could also contain location data, information about transferred files, and data about other parties in a conversation. In the new version of msgstore.db the table message still contains message content (column text_data – seen in blue box Figure 2) but only references to those other mentioned data points, information about which are contained in other tables. Information was scatterd among other tables in the older version, too, but it did make for some duplication of data as the same data was found in multiple tables. I suspect Meta simply de-duplicated the data. Regardless, most things examiners would be interested in are no longer contained in a single table. The other tables that are relevant for message content are chat (Figure 3) and jid (Figure 4). The former will contain information about any group chats and the latter will contain information about contacts/chat participants.

Figure 3
Figure 3.  chat table.

Figure 4
Figure 4.  jid table.

Other tables of interest include message_location (contains GPS locations when location is shared or received), message_media (information about transferred files), message_thumbnail (contains thumbnails of transferred media stored as BLOBs), message_view (replay of the message table), and receipt_user (tracks when messages are received and read by other chat participants). Some of this data is seen in the message table, and some is not.  See Figures 5 through 9, respectively.

Figure 5
Figure 5.  message_location table.

Figure 6
Figure 6.  message_media table.

Figure 7
Figure 7.  message_thumbnail table.

Figure 8
Figure 8.  message_view table.

Figure 9
Figure 9.  receipt_user table.

Circling back, while the table jid does have the phone number/WhatsApp identifiers (JIDs) of other chat participants which the phone owner communicated with, it does not have their names. Just like the older version of WhatsApp, user names are contained in a separate database, wa.db. The table wa_contacts in wa.db links a user’s JID with the phone number and user name. See Figure 10.

Figure 10
Figure 10.  wa_contacts from wa.db.

So, to summarize, message content and associated data is spread across multiple tables and across two databases. In order to run a SQL query that returns data in a meaningful way, an examiner will need to attach wa.db to msgstore.db. There are two ways an examiner can approach this. The first is by using the ATTACH DATABASE function. It would look like so:

ATTACH DATABASE ‘%PATH_TO_wa.db%’ AS “whatsapp_db”;

This line should be the first line in the query so the rest of the query will function correctly. After this line, an examiner can use SELECT followed by the rest of the query.

The second way involves a native function of DB Browser for SQLite, and one I did not know about until I started working on this post (I swear I used to be an investigator). DB Browser for SQLite has an “Attach Database” button that can be used to attach wa.db to msgstore.db In order to accomplish this, first open msgstore.db, and then use the Attach Database button to attach wa.db. See Figures 11 and 12.

Figure 11
Figure 11.  The Attach Database button.

Figure 12
Figure 12.  Naming the attached database and confirmation.

The SQL query is below. Note that I have included the ATTACH DATABASE function in it. If you are using the Attach Database button in DB Browser for SQLite, simply remove the line.  Note this query returns timestamps in UTC.

ATTACH DATABASE ‘%PATH_TO_wa.db%’ AS “whatsapp_db”;
SELECT
datetime(message.timestamp/1000,’unixepoch’) AS “Message Time”,
CASE
WHEN datetime(message.received_timestamp/1000,’unixepoch’)=”1970-01-01 00:00:00″ THEN “N/A”
ELSE
datetime(message.received_timestamp/1000,’unixepoch’)
END AS “Time Message Received”,
wa_contacts.wa_name AS “Other Participant WA User Name”,
CASE
WHEN message.from_me=0 THEN “Incoming”
WHEN message.from_me=1 THEN “Outgoing”
END AS “Message Direction”,
CASE
WHEN message.message_type=0 THEN “Text”
WHEN message.message_type=1 THEN “Picture”
WHEN message.message_type=2 THEN “Audio”
WHEN message.message_type=3 THEN “Video”
WHEN message.message_type=5 THEN “Static Location”
WHEN message.message_type=7 THEN “System Message”
WHEN message.message_type=9 THEN “Document”
WHEN message.message_type=16 THEN “Live Location”
ELSE message.message_type
END AS “Message Type”,
message.text_data AS “Message”,
message_media.file_path AS “Local Path to Media”,
message_media.file_size AS “Media File Size”,
message_location.latitude AS “Shared Latitude/Starting Latitude (Live Location)”,
message_location.longitude AS “Shared Longitude/Starting Longitude (Live Location)”,
message_location.live_location_share_duration AS “Duration Live Location Shared (Seconds)”,
message_location.live_location_final_latitude AS “Final Live Latitude”,
message_location.live_location_final_longitude AS “Final Live Longitude”,
datetime(message_location.live_location_final_timestamp/1000,’unixepoch’) AS “Final Location Timestamp”
FROM
message
JOIN chat ON chat._id=message.chat_row_id
JOIN jid ON jid._id=chat.jid_row_id
LEFT JOIN message_media ON message_media.message_row_id=message._id
LEFT JOIN message_location ON message_location.message_row_id=message._id
JOIN wa_contacts ON wa_contacts.jid=jid.raw_string
WHERE message.recipient_count=0
ORDER BY “Message Time” ASC

The query will return all one-to-one party messages along with any file transfer and location sharing information and sort them chronologically oldest to newest. See Figure 13.  Note that there are many more columns returned in the results than what is shown below.

Figure 13
Figure 13.  Query results for 1:1 messaging.

Group messages work in very much the same way, but the query is slightly different as an examiner would also need the group/conversation name along with the party sending the message(s) within the group. The previously mentioned table chat contains this information. As with the previous query, this one also requires wa.db be attached to msgstore.db. The query below includes the ATTACH function, but, it can be deleted if the Attach Database button in DB Browser for SQLite is used.

ATTACH DATABASE ‘%PATH_TO_wa.db%’ AS “whatsapp_db”;
SELECT
datetime(message.timestamp/1000,’unixepoch’) AS “Message Time”,
CASE
WHEN datetime(message.received_timestamp/1000,’unixepoch’)=”1970-01-01 00:00:00″ THEN “N/A”
ELSE
datetime(message.received_timestamp/1000,’unixepoch’)
END AS “Time Message Received”,
chat.subject AS “Conversation Name”,
CASE
WHEN message.from_me=1 THEN “Self”
ELSE
wa_contacts.wa_name
END AS “Sending Party”,
CASE
WHEN message.from_me=0 THEN “Incoming”
WHEN message.from_me=1 THEN “Outgoing”
END AS “Message Direction”,
CASE
WHEN message.message_type=0 THEN “Text”
WHEN message.message_type=1 THEN “Picture”
WHEN message.message_type=2 THEN “Audio”
WHEN message.message_type=3 THEN “Video”
WHEN message.message_type=5 THEN “Static Location”
WHEN message.message_type=7 THEN “System Message”
WHEN message.message_type=9 THEN “Document”
WHEN message.message_type=16 THEN “Live Location”
ELSE message.message_type
END AS “Message Type”,
message.text_data AS “Message”,
message_media.file_path AS “Local Path to Media”,
message_media.file_size AS “Media File Size”,
message_location.latitude AS “Shared Latitude/Starting Latitude (Live Location)”,
message_location.longitude AS “Shared Longitude/Starting Longitude (Live Location)”,
message_location.live_location_share_duration AS “Duration Live Location Shared (Seconds)”,
message_location.live_location_final_latitude AS “Final Live Latitude”,
message_location.live_location_final_longitude AS “Final Live Longitude”,
datetime(message_location.live_location_final_timestamp/1000,’unixepoch’) AS “Final Location Timestamp”
FROM
message
JOIN chat ON chat._id=message.chat_row_id
LEFT JOIN jid ON jid._id=message.sender_jid_row_id
LEFT JOIN message_media ON message_media.message_row_id=message._id
LEFT JOIN message_location ON message_location.message_row_id=message._id
LEFT JOIN wa_contacts ON wa_contacts.jid=jid.raw_string
WHERE message.recipient_count>=1
ORDER BY “Message Time” ASC

See Figure 14 for the query results.  As with the previous results, many more columns are returned than what is shown.

Figure 14
Figure 14.  Query results for group messaging.

The query returns timestamps (UTC) along with the group/conversation name and the party that sent the message. The rest of the results mirror what was seen in the one-to-one query results.

Speaking of groups, wa.db keeps track of what user creates a group in the table wa_group_admin_settings. See Figure 15.

Figure 15
Figure 15.  wa_group_admin_settings table.

Pairing the group creator to the group and when the group was created requires another SQL query. The rest of the relevant information such as when a group was created and the group name are kept in another table in msgstore.db named chat_view. See Figure 16.

Figure 16
Figure 16.  chat_view table.

Another query will pull things together. Note that this query will return the JID (the WhatsApp identifier) along with the WhatsApp User Name. The reason this is so is because neither database keeps the WhatsApp user name (friendly name) of the account on the device being examined, so, in cases when the device being examined is the creator of the group, a NULL value is returned. The query is below followed by Figure 17 which contains the results.

ATTACH DATABASE ‘%PATH_TO_wa.db%’ AS “whatsapp_db”;
SELECT
datetime(chat_view.created_timestamp/1000,’unixepoch’) AS “Group Creation Time”,
chat_view.subject AS “Group Name”,
wa_group_admin_settings.creator_jid AS “Creator JID”,
wa_contacts.wa_name AS “Creator WA User Name”
FROM
chat_view
LEFT JOIN wa_group_admin_settings ON wa_group_admin_settings.jid=chat_view.raw_string_jid
JOIN jid ON jid.raw_string=wa_group_admin_settings.creator_jid
LEFT JOIN wa_contacts ON wa_contacts.jid=jid.raw_string
ORDER BY “Group Creation Time” ASC

Figure 17
Figure 17.  Query results for group information.

Call Logs

The change in the schema of msgstore.db did not have much of an affect on how call logs are stored in WhatsApp. A majority of the data needed is stored in the table call_log. See Figure 18.

Figure 18
Figure 18.  call_log table.

Note that call direction and type of call (e.g. audio or video) are described in this table in the from_me and video_call columns, respectively. For the former, a value of “1” indicates the call originates from the device being examined, and a value of “1” in the latter column indicates a video call.  Also note call duration is described in seconds.

The rest of information is contained in the jid table and the wa_contacts table on wa.db. Another query will pull things together, which is seen below. In addition to 1:1 calls (returned with a NULL in the Group column), this query will also pull calls made within groups along with the caller.

ATTACH DATABASE ‘%PATH_TO_wa.db%’ AS “whatsapp_db”;
SELECT
datetime(call_log.timestamp/1000,’unixepoch’) AS “Call Time”,
chat.subject AS “Group”,
CASE
WHEN call_log.from_me=0 THEN “Incoming”
WHEN call_log.from_me=1 THEN “Outgoing”
END AS “Call Direction”,
CASE
WHEN call_log.from_me=1 THEN “Self”
ELSE wa_contacts.wa_name
END AS “Caller”,
CASE
WHEN call_log.video_call=0 THEN “Audio”
WHEN call_log.video_call=1 THEN “Video”
END AS “Call Type”,
call_log.duration AS “Call Duration (Seconds)”
FROM
call_log
LEFT JOIN jid ON jid._id=call_log.jid_row_id
JOIN wa_contacts ON wa_contacts.jid=jid.raw_string
LEFT JOIN chat ON chat.jid_row_id=call_log.group_jid_row_id
ORDER BY “Call Time” ASC

See Figure 19 for the results.

Figure 19
Figure 19.  Query results for call logs.

Settings

WhatsApp keeps track of settings in the file com.whatsapp_preferences_light.xml which is stored in the ./shared_prefs/ folder. The settings highlighted below are some of the more common ones. There are other settings documented in this file, so keep that in mind if addional setting information is needed.  

WhatsApp displayed name (“push name”) (Figure 20)
Status message (Figure 21)
Phone number (Figure 22)
Read receipt setting status (enabled/disabled) (Figure 23)
JID (WhatsApp identifier) (Figure 24)
WhatsApp version (Figure 25)

Figure 20. WhatsApp displayed name.
Figure 21. Status message.
Figure 22. Phone number.
Figure 23. Read receipt setting status.
Figure 24. JID (WhatsApp identifier).
Figure 25. WhatsApp version.

The other thing this file tracks is the media auto-download settings, which control WhatsApp’s behavior when it receives a picture, audio file, video file, or document file. These settings are found in the Settings > Storage and data section. See Figure 26 and 27.

Figure 26. Media auto-download settings.

Figure 27. Media auto-download setting options.

The default settings for this feature is that only pictures are downloaded over mobile data, all media (pictures, audio, video, & documents) are downloaded over Wi-Fi, and nothing is downloaded when the phone is roaming. The settings are described in “masks” within com.whatsapp_preferences_light.xml. Figure 28 shows the XML tag autodownload_wifi_mask with an associated value of “15” (red box).

Figure 28. autodownload_wifi_mask.

There are two additional XML tags in the file: autodownload_cellular_mask and autodownload_roaming_mask. All three tags describe the auto-download setting of their respective namesake. Figure 27 shows that a user can choose various combinations of settings for each category (cellular, Wi-Fi, roaming). I have observed the following values for the various combinations:

0 = None
1 = Photos
2 = Audio
3 = Photos, Audio
4 = Video
5 = Photos, Video
6 = Audio, Video
7 = Photos, Audio, Video
8 = Documents
9 = Photos, Documents
10 = Audio, Documents
11 = Photos, Audio, Documents
12 = Video, Documents
13 = Photos, Videos, Documents
14 = Audio, Video, Documents
15 = All

These values apply to any of the XML tags. Knowing these values could help examiners understand if a file was automatically downloaded to a device or if it specifically required user action.

Deletions, Add-Ons, & Changes

WhatsApp has a lot going on, and there are plenty of things to explore. However, there are a few things I wanted to mention in this post. The first is message deletions. To test this, I deleted a picture that had been sent to the test phone. See Figure 29. Note that I selected the option to delete the picture from the phone, too (Figure 30).

Figure 29. Poof. It’s gone.
Figure 30. Deleting media.

I waited approximately 12 hours and then pulled the com.whatsapp folder. Figures 31 shows the results of the 1:1 message query with the message with the picture, and Figure 32 shows the results after the picture was deleted.

Figure 31. Picture message entry.
Figure 32. Picture message entry no longer present.

I was unable to find any traces of the image file in the databases, and running a raw keyword search across the database itself and the associated -wal file did not reveal the presence of the file. This may be a result of the amount of time I waited to pull the WhatsApp data. Examiners should remember that even though there may be no entries for a specific message, raw keyword searching the database and associated -wal file may find some results.

Message reactions in WhatsApp are considered “add-ons”. See Figure 33. Note that the time seen in the upper left hand corner of the screenshot is the time (UTC -0400) at which I gave the message a thumbs up reaction.

Figure 33. Message rection AKA message add on.

The value in the column message_add_on_flag for this message entry in the message table changed from “0” to “1.” See the blue box and blue arrow in Figure 34.

Figure 34. Message entry with message_add_on_flags value of “1.”

Information about the add-on is kept in the table message_add_on. See Figure 35.

Figure 35. Message add-on entry.

The value in the column parent_message_row_id (red box) corresponds to the value in column _id in the message table. The timestamp seen in the blue box in Figure 35 corresponds to the time the reaction was applied to the message, and the “1” value in the from_me column indicates my test account was the one who applied the reaction. If a second person reacts to the same message, the message_add_on flag in the message table keeps the “1” value, but an additional entry is added to the message_add_on table for the same message. See Figure 36.

Figure 36. A second reaction to the same message.

Reactions to messages are stored in the table message_add_on_reaction. See Figure 37.

Figure 37. Message reactions.

Changes in group names can be detected. I changed the name of one of the groups on the Pixel 5a. See the highlighted area in Figure 38.

Figure 38. Changed group name.

The query for Group messages can give some indication that something has changed. See the red box in Figure 39.

Figure 39. A system message with some content.

The message entry highlighted in the red box indicates a system message was received and it contains the new name of the group. Scrolling up in the query results finds an earlier entry for the group. See the red box in Figure 40.

Figure 40. Earlier system message for the group.

While the Conversation Name column in the results conains the new group name, the system message that contained the original group name is still present. Chronologically, it is the second message (a system messge) in the group. If the group name is changed a third time, the same method can be used to track back to the original name of the group. This is a great way to determine original group name and that the name has been changed.

Interestingly enough, the changes are also seen in the table message_system_value_change with the associated message row identifier (from the message table). See Figure 41.

Figure 41. Changes.

Back In The Game

Recent changes to WhatsApp have caused a few issues for mobile device examiners and the forensic tools they rely upon. The changes, while not drastic, are just enough to throw off some popular forensic tools. Fortunately, some quick SQL queries can remedy the situation.

Changes in mobile device forensic artifacts and the resultant tool failures happen frequently because things change often in our discipline, and it does take tools vendors some time to catch up. That’s just the nature of the business. When examiners understand the underlaying data and the associated structures, it helps them overcome these issues and continue on with their examination.

6 thoughts on “New msgstore – Who ‘Dis? A Look At An Updated WhatsApp On Android

  1. How did you pulled the com.whatsapp folder, is your Pixel 5a rooted?

  2. Great question. I looked in my public Android 12 image, which was running a version slightly newer than that (2.21.20), and those tags are missing from it as well. I suspect (without testing) that could be because the default settings are not changed that the applicable XML tags are not present.

  3. Helped me A LOT
    any thougts where the autodownload config is in whatspp version 2.21.4.22?
    cant find those 3 tags (roaming, wifi and cellular)

  4. Do you know how to send messages throw de new msgstore.db with sqlite???

  5. Brilliant analysis! Many thanks!
    Help me a lot in my project.

    Quick question
    I noticed msg table is updated after few minutes
    Do you know why msgs are not instantly saved there?
    Any way to force data save ?

Leave a Reply