GitHub Repo: https://github.com/ChrisTruncer/Just-Metadata
For some time now, I’ve been working on a tool that aggregates data about IP addresses from publicly available sources. Three separate events prompted this project. First, I began noticing a large number of IP addresses attempting to brute force their way into my mail server. Second, a large number of systems/IPs scanned my web server for vulnerable web applications (Tomcat, phpMyAdmin, etc.). Finally, ATD sometimes receives spam email that contains malware. Justin Warner (@sixdub), ATD’s resident reverse engineer, investigated one of the malware samples in a spam message we received and was able to extract the IP addresses of the callback domain.
I wanted to see if there was anything I could learn about the systems/IPs targeting my server and the malware callback domains we were seeing. Specifically, I wanted to collect the following:
- IP Whois Information
- Geographical Information
- Shodan information (Ports, keys, certificates, etc.)
- Various Threat Feeds
After a couple of conversations with Justin, I decided to write a tool to do just that. Justin and I brainstormed functionality that would be useful, and the type of information we would want to gather. However, simply gathering the information isn’t necessarily enough to provide any value. The analysis of the available data is where I can get something useful. Are the systems that are scanning me owned by the same person/company/etc.? Are they located in the same country? To answer these questions, I wrote Just-Metadata, which I am happy to release today.
Let’s walk through some of the features, and how Just-Metadata works.
To start using Just-Metadata, create a text file containing a list of IP addresses (each on a new line). To get the IPs into Just-Metadata, you’ll use the load command and provide the path to the file containing the IP addresses, similar to either of the following:
load ips.txt – If ips.txt is in the same directory as Just-Metadata
load /home/SonofFlynn/iplist.txt – Full path to file is also accepted
You should soon see a message indicating that X number of IPs have been loaded.
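For anyone preparing an IP list, here is a minimal sketch of reading such a file in Python. It is not Just-Metadata's own loader: the function name load_ips is hypothetical, and validating each line and dropping duplicates are my assumptions about what a sensible loader might do.

```python
import ipaddress
from pathlib import Path


def load_ips(path):
    """Read one IP address per line, skipping blanks, duplicates, and invalid entries."""
    loaded = []
    seen = set()
    for line in Path(path).read_text().splitlines():
        candidate = line.strip()
        if not candidate:
            continue  # ignore empty lines
        try:
            # ip_address() raises ValueError for anything that is not IPv4/IPv6
            ip = str(ipaddress.ip_address(candidate))
        except ValueError:
            print(f"Skipping invalid entry: {candidate!r}")
            continue
        if ip not in seen:  # assumption: duplicates are only counted once
            seen.add(ip)
            loaded.append(ip)
    return loaded
```

Calling load_ips("ips.txt") on a file like the one described above would return the list of valid addresses in file order, which makes it easy to sanity-check a list before loading it into the tool.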
Now that the IPs we want to investigate have been loaded into the framework, we can begin the information gathering process. Just-Metadata can connect to and gather information from a variety of different sources. To see the different sources that Just-Metadata grabs from, simply run the command “list gather”, and you should see something similar to the following:
The information left of the “=>” is the module name, and the information to the right gives a description of what it gathers. To actually collect information from each source, you would use the “gather” command. As an example, to grab information from Shodan about the loaded IPs, I would run “gather shodan”. I would then see Just-Metadata querying Shodan (or the selected source) for information. NOTE: Shodan is the only source within Just-Metadata that requires an API key. Be sure to place your API key into the shodan module. To do this, open the shodan module (located at Just-Metadata/module/intelgathering/get_shodan.py) and add your API key on line 16, for the self.api_key variable. All other modules work without any requirement other than an internet connection.
I will typically follow this process for all available intelligence gathering sources within Just-Metadata. One item to note is that some sources are rate-limited, so certain intelgathering modules can take time to complete depending upon the number of IPs that need to be investigated. Once all of the different intelgathering sources have finished gathering information, the meat of the tool, the analysis modules, can be put to use.
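To illustrate why rate-limited sources slow things down, here is a rough sketch of a gathering loop that pauses between requests. This is not how Just-Metadata's modules are written: gather, lookup, and delay_seconds are hypothetical names, and the lookup function stands in for whatever call a given source requires.

```python
import time


def gather(ips, lookup, delay_seconds=1.0, sleep=time.sleep):
    """Call lookup(ip) for each IP, pausing between requests to respect a rate limit."""
    results = {}
    for index, ip in enumerate(ips):
        if index > 0:
            sleep(delay_seconds)  # wait before every request except the first
        try:
            results[ip] = lookup(ip)
        except Exception as error:
            # Assumption: one failed lookup should not abort the whole run.
            results[ip] = {"error": str(error)}
    return results
```

With a one-second delay, 500 IPs take a little over eight minutes for a single source, which is why the larger lists can take a while.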
Just-Metadata can perform automated analysis against the gathered data. The goal of these modules is to analyze patterns across all of the gathered data and to find and display meaningful results for the user. This is what I consider to be the most valuable part of the tool. Collecting data is important, but extrapolating meaning from it is where the actual good stuff lies. Analysis modules have full access to all data collected by the framework, and should be used to identify patterns, or anything else that helps surface useful information in an otherwise large dataset. To start off, you can list all of the analysis modules with the “list analysis” command. When running this, you should see something similar to the following:
To use any of the available analysis modules, just use the “analyze” command. For example, if you want to view the top X cities, countries, timezones, ISPs, etc. that are included within the loaded IP addresses, you can use the “geoinfo” module. To run this module, just type “analyze geoinfo”. You will be prompted for the number of results you want back per category. I usually choose 10, but feel free to adjust as you see fit. Once you provide the number of results per category, you should see something similar to the following (amongst additional data):
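Independently of Just-Metadata's own output, the idea behind a top-X summary can be sketched in a few lines of Python. The record layout, the field names, and the top_values helper are all hypothetical, and the addresses come from the reserved documentation ranges rather than real data.

```python
from collections import Counter


def top_values(records, field, limit=10):
    """Return the `limit` most common values of `field` across all IP records."""
    counts = Counter(
        record[field] for record in records.values() if record.get(field)
    )
    return counts.most_common(limit)


# Hypothetical gathered data keyed by IP (documentation-range addresses).
records = {
    "192.0.2.1": {"country": "US", "isp": "ExampleNet"},
    "192.0.2.2": {"country": "US", "isp": "OtherNet"},
    "198.51.100.7": {"country": "DE", "isp": "ExampleNet"},
    "203.0.113.9": {"country": "US"},  # missing fields are simply skipped
}

for field in ("country", "isp"):
    print(field, top_values(records, field, limit=2))
```

Counting per field like this is what makes clusters stand out: if most of the IPs scanning you share one ISP or country, it shows up at the top of the list.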
Another module that I find useful is the “keys” module. This module parses the data from Shodan and looks for SSH keys or HTTPS certificates shared across the IPs loaded into Just-Metadata. The keys module will ask for the number of keys you want displayed, and will show any SSH keys, HTTPS certificates, etc. that are shared across any of the loaded IP addresses. When running “analyze keys” you might see something similar to the following:
In this limited number of IPs, there weren’t any systems that shared the same SSH keys; however, three different IPs share an HTTPS certificate.
Another analysis option is the “feedhits” module. This will compare all of the IPs loaded into the framework with a variety of different threat intel feeds that are available on the internet. If an IP is in any of the threat intel feeds, the module will highlight it and call it out. When comparing the loaded IPs to different threat intel feeds, these were some of my results:
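The kind of grouping behind a result like this can be sketched as follows. This is only an illustration under my own assumptions, not the keys module itself: shared_items is a hypothetical helper, and the fingerprints are made-up placeholder strings rather than anything pulled from Shodan.

```python
from collections import defaultdict


def shared_items(items_by_ip, limit=None):
    """Map each key or certificate fingerprint to the IPs presenting it,
    keeping only the ones seen on more than one IP."""
    ips_by_item = defaultdict(set)
    for ip, items in items_by_ip.items():
        for item in items:
            ips_by_item[item].add(ip)
    shared = [
        (item, sorted(ips)) for item, ips in ips_by_item.items() if len(ips) > 1
    ]
    # Most widely shared first; ties broken by fingerprint for stable output.
    shared.sort(key=lambda pair: (-len(pair[1]), pair[0]))
    return shared[:limit]


# Placeholder fingerprints keyed by documentation-range IPs.
observed = {
    "192.0.2.1": ["ssh:aaaa", "cert:1111"],
    "192.0.2.2": ["ssh:bbbb", "cert:1111"],
    "198.51.100.7": ["cert:1111"],
    "203.0.113.9": ["ssh:cccc"],
}

for fingerprint, ips in shared_items(observed):
    print(fingerprint, "shared by", ", ".join(ips))
```

Inverting the mapping from "IP to keys" into "key to IPs" is the whole trick: any fingerprint that ends up with more than one IP suggests the systems may be related.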
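Independently of the tool's own output, the comparison itself boils down to set intersection. Here is a minimal sketch, assuming each feed has already been downloaded and reduced to a collection of IP strings; the feed names and the feed_hits helper are hypothetical.

```python
def feed_hits(loaded_ips, feeds):
    """Return {feed_name: sorted list of loaded IPs that appear in that feed}."""
    loaded = set(loaded_ips)
    hits = {}
    for name, feed_ips in feeds.items():
        matches = loaded & set(feed_ips)  # IPs present in both collections
        if matches:
            hits[name] = sorted(matches)
    return hits


# Hypothetical feeds, using documentation-range addresses.
feeds = {
    "example-bruteforce-feed": ["192.0.2.1", "192.0.2.50"],
    "example-malware-feed": ["203.0.113.200"],
}
loaded_ips = ["192.0.2.1", "198.51.100.7"]

for name, ips in feed_hits(loaded_ips, feeds).items():
    print(f"{name}: {', '.join(ips)}")
```

Feeds with no matches are left out of the result, so an empty dictionary means none of the loaded IPs appeared in any feed.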
If at any point you would like to see all the information gathered about a single IP address, you can do that by using the “ip_info” command along with the IP address. So the command may look like “ip_info 22.214.171.124” and your output might be similar to the following:
Another feature that I wanted in this tool is the ability to save the current state, and then reload it for analysis later. Just-Metadata can do this with the “save” and “import” functions. To save your current state (after you’ve gathered any data you wish to have saved along with the state), simply type “save” at the command prompt. You should see something similar to:
You can now safely exit Just-Metadata without losing any of the data you’ve gathered.
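For a sense of what saving and reloading state involves, here is a small round-trip sketch. It is not Just-Metadata's actual implementation: using JSON as the on-disk format is my assumption, save_state and import_state are hypothetical names, and the timestamped filename pattern is only a guess modeled on state file names such as metadata06082015_172032.state.

```python
import json
import time
from pathlib import Path


def save_state(records, directory="."):
    """Write gathered data to a timestamped .state file and return its path."""
    # Assumed pattern: metadata<month><day><year>_<hour><minute><second>.state
    name = time.strftime("metadata%m%d%Y_%H%M%S.state")
    path = Path(directory) / name
    path.write_text(json.dumps(records, indent=2))
    return path


def import_state(path):
    """Load previously saved data back into memory."""
    return json.loads(Path(path).read_text())
```

A quick check of the round trip would be to call save_state on a dictionary of gathered data, pass the returned path to import_state, and confirm the two dictionaries compare equal.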
To import any state that’s been saved to disk, just use the import command, along with the path to a Just-Metadata state file. Your command could be something like “import metadata06082015_172032.state” and once run, should look similar to:
I believe that this covers most of the functionality in Just-Metadata. If there’s anything that’s missing that would be helpful to have explained, let me know and I’ll be sure to add to this post. I’m available at @christruncer on Twitter or in #veil on Freenode!