String & Metadata Analysis

Posted by: Eng. Donya Bino
February 03, 2026

In cybersecurity, string analysis is the process of extracting human-readable text (strings) from a binary file, a malware sample, or network traffic and examining it for clues such as URLs, IP addresses, file paths, command formats, and error messages. These clues can help you determine whether there was malicious intent.
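As a rough illustration of the idea, the sketch below pulls runs of printable ASCII characters out of raw bytes and then looks for URL-like and IPv4-like text. The function names, the regular expressions, the four-character minimum, and the sample bytes are all assumptions made for this example; real tools handle more encodings and edge cases.

```python
import re

# Runs of four or more printable ASCII bytes (an assumed threshold for this sketch).
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")
# Deliberately simple patterns; they will miss or over-match some real-world cases.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_strings(data: bytes) -> list[str]:
    """Return every run of at least four printable ASCII characters."""
    return [m.group().decode("ascii") for m in PRINTABLE_RUN.finditer(data)]


def find_indicators(strings: list[str]) -> dict[str, list[str]]:
    """Collect URL-like and IPv4-like substrings from the extracted strings."""
    urls, ips = [], []
    for s in strings:
        urls.extend(URL_PATTERN.findall(s))
        ips.extend(IPV4_PATTERN.findall(s))
    return {"urls": urls, "ips": ips}


# Hypothetical bytes standing in for a binary file (reserved test domain and IP).
sample = b"\x00\x01MZ\x90\x00http://example.test/gate.php\x00\x02\x03connect 203.0.113.7\x00ab\x00"
strings_found = extract_strings(sample)
indicators = find_indicators(strings_found)
print(strings_found)
print(indicators)
```

Short fragments such as "MZ" and "ab" are dropped because they fall under the minimum length, which keeps most random byte noise out of the results.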

Metadata analysis focuses on the "data about data" attached to files and other digital artifacts, such as creation date, author, access history, file size, and geographic location. It lets you uncover hidden information without having to read the full content of the file.
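A minimal sketch of this, assuming only the basic filesystem metadata that Python's standard library exposes (size and timestamps, not author or GPS fields), might look like the following. The helper name and the temporary demo file are hypothetical.

```python
import os
import tempfile
from datetime import datetime, timezone


def file_metadata(path: str) -> dict:
    """Read basic filesystem metadata without reading the file's content."""
    info = os.stat(path)
    return {
        "size_bytes": info.st_size,
        "modified_utc": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc),
        "accessed_utc": datetime.fromtimestamp(info.st_atime, tz=timezone.utc),
    }


# Demo on a throwaway file with made-up content.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(b"quarterly report draft")
    demo_path = tmp.name

meta = file_metadata(demo_path)
print(meta["size_bytes"], meta["modified_utc"].isoformat())
os.remove(demo_path)
```

Richer metadata such as author names or camera details lives inside the file format itself and needs a format-aware tool to read.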

Both methods are cornerstones of digital forensics, malware reverse engineering, and threat hunting. They provide the basis for constructing a timeline, tracing the origin of evidence, and identifying anomalies.

Importance of String and Metadata Analysis in the Real World:
For businesses, these methods help expose insider threats and anyone who has made unauthorised changes to documents or files. For security, they help identify spoofed emails and phishing attempts. For governance, they support compliance by tracking how data is handled.

Both play a role in analyzing attacker activity. Strings are often how indicators of compromise (IOCs), such as C2 server addresses, are identified. Metadata that is useful in a data breach investigation includes the timestamps of unauthorized access to a system.

If string and metadata analysis are not performed, traces of malicious activity can go undetected, leaving a significant data leak or an attack unrecognized.

Tools for Analyzing Strings and Metadata
The following free or low-cost tools are suitable for anyone from novice to expert.
1. Strings (included with Linux and macOS; available for Windows from Sysinternals). Extracts printable strings from binary files. You can use this tool to find indicators of compromise in suspicious files.
Example command (Linux): strings suspicious.exe | grep -i "http" # Lists strings containing "http", such as URLs.

2. ExifTool (Free, Any OS). Reads and writes metadata (data stored with an image, video, or PDF file) for many file types. You can use this tool to check the EXIF data in a digital image, for example to see whether a photo has GPS data attached or which camera was used to take it.
Example command: exiftool photo.jpg
# Returns creation date/time, camera make/model, location, etc., when present in the file.

3. Wireshark (Free, Any OS). Captures and analyzes network traffic (packet data). You can use this tool to search for HTTP packets in a capture.
Example: the display filter http contains "password" shows HTTP packets containing that text, which may reveal leaked credentials.

4. Binwalk (Free, Linux or macOS). Examines binary files, such as firmware images, for embedded files and data. You can use this tool to extract hidden data from firmware images.
Example command: binwalk -e firmware.bin
# Extracts the embedded files it identifies in the specified firmware image.

5. YARA (Free, Any OS). Scans files for patterns or string matches. Its purpose is to identify and classify malware based on known strings.
Example rule:
rule SampleMalware {
    strings:
        $s1 = "malicious_command" ascii
    condition:
        $s1
}
How to run: yara rule.yar suspect.file
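To make the logic of the SampleMalware rule above concrete, here is a small Python sketch of the same idea: report which named ASCII strings occur in a file's bytes. This is not the YARA engine or its API. The function name, the dictionary layout, and the sample byte strings are assumptions for illustration only.

```python
def match_rule(data: bytes, rule_strings: dict[str, str]) -> list[str]:
    """Return the identifiers of rule strings that occur in data as ASCII bytes."""
    return [
        name
        for name, text in rule_strings.items()
        if text.encode("ascii") in data
    ]


# Mirrors the single string of the SampleMalware rule.
sample_rule = {"$s1": "malicious_command"}

# Made-up byte sequences standing in for file contents.
clean = b"\x00hello world\x00"
suspect = b"\x7fELF\x00run malicious_command --now\x00"

clean_hits = match_rule(clean, sample_rule)
suspect_hits = match_rule(suspect, sample_rule)
print(clean_hits)
print(suspect_hits)
```

A real rule engine adds conditions, wildcards, and other encodings on top of this basic substring test.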

Concrete Evidence with Real Data
The following examples are based on public cybersecurity reports and investigations:
1. Phishing Investigation Using Email Header Metadata
According to a 2025 Fidelis Security report, investigators evaluated email metadata (sender name, email address, timestamp, sending IP address, and routing path) and determined that the sending IP address was located outside the United States. The received headers showed that the reception timestamps did not match the time zone of the sender's IP address, and an 'X-Mailer' string showed that the message was sent with a bulk mailer tool. They therefore blocked all email from that domain to prevent credential theft from users.

2. Insider Threats Revealed by File Metadata
A CrowdStrike investigation examined how often files were accessed without authorization (for instance, how often a file was modified at 3:00 a.m.) and checked for user ID mismatches. Timestamps recorded in the $MFT of the NTFS volume also indicated that the files may have been copied to an off-site location.
The investigation confirmed the identity of the suspect, who was stealing trade secrets from his employer.

3. Using String Extraction for Malware Inspection
According to Veeam, analysts in a 2025 ransomware case used the Strings tool to extract strings that included file names (klospad.pdb and keme132.dll), along with references to file locations and commands tied to the CTBLocker ransomware. The C2 URL strings found in the code could then be used to build YARA rules for detecting the malware in the future.

4. Document Metadata Used in Espionage
In a 2014 case involving fake documents, metadata in documents released to the public indicated that the authors were Russian and used Cyrillic language settings, showing that the documents did not originate in Ukraine. For instance, ExifTool output showed the "last modified by" fields and keyboard language codes for the documents in question. The same kind of metadata analysis can be applied to similar fake documents used in phishing scams.

5. Cloud Instance Metadata Exposed in a Data Breach
In 2019, Capital One (as described by Vectra AI) suffered a breach in which an attacker used SSRF to query the cloud instance metadata service (IMDS). The exposed metadata included IAM role information and temporary AWS credentials. About 106 million customer records were ultimately affected. The metadata logs showed repeated anomalous queries against the metadata service, revealing how and when the exploitation occurred.
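The header checks described in the first case above can be sketched with Python's standard email module. Everything in the sample message is hypothetical (reserved test domains and IP ranges, an invented X-Mailer value), and the "bulk" keyword check is only a stand-in for whatever heuristics a real investigation would use.

```python
import re
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical raw message for illustration only.
RAW_EMAIL = """\
Received: from mail.example.test (mail.example.test [203.0.113.45])
\tby mx.example.org; Tue, 04 Mar 2025 03:12:09 +0000
From: "IT Support" <support@example.test>
To: user@example.org
Subject: Password reset required
Date: Tue, 04 Mar 2025 11:12:05 +0800
X-Mailer: BulkSender 2.1

Please reset your password.
"""


def summarize_headers(raw: str) -> dict:
    """Pull out the header fields an analyst would typically review first."""
    msg = message_from_string(raw)
    received = msg.get_all("Received", [])
    # Each server prepends its own Received header, so the last one is the earliest hop.
    first_hop = received[-1] if received else ""
    ip_match = re.search(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]", first_hop)
    # Assumes the Date header carries an explicit UTC offset, as in the sample.
    sent = parsedate_to_datetime(msg["Date"])
    x_mailer = msg["X-Mailer"] or ""
    return {
        "from": msg["From"],
        "sending_ip": ip_match.group(1) if ip_match else None,
        "sent_utc_offset_hours": sent.utcoffset().total_seconds() / 3600,
        "x_mailer": x_mailer,
        "bulk_mailer_hint": "bulk" in x_mailer.lower(),
    }


summary = summarize_headers(RAW_EMAIL)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Comparing the UTC offset in the Date header with the region of the sending IP address would need a geolocation lookup, which is left out of this sketch.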

Things you can do to safeguard your data
1. Download ExifTool and test it on your photos and documents to see how much information is recorded in the metadata of your images.
2. To check suspicious EXE files, run Strings on them and search the output for strings such as "http", "password", or "cmd".
3. Use Wireshark to capture traffic and analyze suspicious activity on your network. Once traffic is captured, right-click a packet and select Follow > HTTP Stream to view the contents of the conversation.
4. Delete the metadata from files before you share them with others (exiftool -all= file.jpg). Always use applications with a strong privacy focus.
5. Create YARA rules to identify files containing certain string patterns.
6. If you believe you have been compromised or hacked, isolate the device, run your antivirus application, and look through the system logs (Windows Event Viewer / Linux journalctl).

Key Takeaways
Analyzing strings and metadata is a valuable way to discover hidden information in files, in network traffic, and in other digital evidence left behind. The tools highlighted in this article, ExifTool, Strings, and Wireshark, are all free and can be used by newcomers to forensic analysis to examine information recovered from malware, phishing attacks, and breaches.

Real-world cases show these techniques being used to identify insider threats, counterfeit documents, and ransomware-related indicators of compromise (IOCs).

By putting in some practice on your own files now, you can develop your skills and gain an appreciation for the capabilities of each piece of software; it may take just one scan of a file to give you a new perspective on what that file may hold!
