Create a GIF from an IP Image History

Shodan keeps a full history of all the information that has been gathered on an IP address. With the API, you can retrieve that history, and we're going to use it to create a tool that outputs timelapse GIFs. Here is an example of what we're going to produce:

Requirements

The following Python packages are required for this tutorial:

arrow
shodan

The arrow package is used to parse the timestamp field of the banner into a Python datetime object.

In addition to the above Python packages, you also need to have the ImageMagick software installed. If you're working on Ubuntu or another distro using apt you can run the following command:

$ sudo apt-get install imagemagick

This will provide us with the convert command which is needed to merge several images into an animated GIF.

There are a few key Shodan methods and parameters that make the script work:

shodan.helpers.iterate_files() allows us to iterate over all the banners in a list of Shodan data files.
Shodan.host() is used to look up information about an IP, including its history when the history=True parameter is set.
shodan.helpers.get_ip() and shodan.helpers.get_screenshot() make the code work across IPv4 and IPv6 and make it more robust to future banner changes.
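
To get a feel for what iterating over a data file involves, here is a minimal sketch that uses only the standard library. It assumes the data file is gzip-compressed with one JSON banner per line; iterate_banners() and write_sample_file() are hypothetical names for illustration and are not the library's implementation.

```python
import gzip
import json
import os
import tempfile


def iterate_banners(filenames):
    """Yield one banner dict per non-empty line of each gzip-compressed file."""
    for filename in filenames:
        with gzip.open(filename, 'rt', encoding='utf-8') as fin:
            for line in fin:
                line = line.strip()
                if line:
                    yield json.loads(line)


def write_sample_file(path, banners):
    """Write banners as gzip-compressed, newline-delimited JSON (test data only)."""
    with gzip.open(path, 'wt', encoding='utf-8') as fout:
        for banner in banners:
            fout.write(json.dumps(banner) + '\n')


# Build a tiny sample file and read it back
sample_path = os.path.join(tempfile.mkdtemp(), 'sample.json.gz')
write_sample_file(sample_path, [
    {'ip_str': '198.51.100.7', 'port': 80},
    {'ip_str': '198.51.100.8', 'port': 554},
])
loaded = list(iterate_banners([sample_path]))
print(len(loaded), loaded[0]['ip_str'])
```

In the actual script we let shodan.helpers.iterate_files() do this work for us.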

Step 1: Setting up the basics

Let's start off by importing all of the modules we'll need and initializing the Shodan API object, which we use to communicate with the API. Save the following content in a file called gifcreator.py:

import arrow
import base64  # Used later on to decode the screenshot data
import os
import shodan
import shodan.helpers as helpers
import sys

# Settings
API_KEY = '' # Enter your API key here

# The user has to provide at least 1 Shodan data file
if len(sys.argv) < 2:
    print('Usage: {} <shodan-data.json.gz> ...'.format(sys.argv[0]))
    sys.exit(1)

# GIFs are stored in the local "data" directory; create it if it doesn't exist yet
os.makedirs('data', exist_ok=True)

# Setup the Shodan API object
api = shodan.Shodan(API_KEY)

We've imported the Python packages that are needed, created an area for various program settings and are performing simple input validation to make the script more user-friendly. Try to run the script to make sure there aren't any errors:

$ python gifcreator.py testdata

If the program raises an error make sure you didn't introduce additional whitespace or other artifacts. The script should run without exceptions and not show anything on the terminal.

Step 2: Looking Up Historical IP Data

Now that we have the basic skeleton of the script established, let's get into the actual image collection. The script accepts a list of Shodan data files which may contain services with screenshots. We need to iterate through those data files and check whether there are any images; if an IP has an image, we want to get all historical images for that IP.

First, we will use the shodan.helpers.iterate_files() method to help us iterate over the Shodan data files. It takes care of uncompressing and deserializing the banners so we can immediately work with a Python object:

# Loop over all of the Shodan data files the user provided
for banner in helpers.iterate_files(sys.argv[1:]):
    # See whether the current banner has a screenshot; if it does, look up
    # more information about this IP
    has_screenshot = helpers.get_screenshot(banner)
    if has_screenshot:
        ip = helpers.get_ip(banner)
        print('Looking up {}'.format(ip))
        host = api.host(ip, history=True)

        ## Step 3 here

We're using the shodan.helpers.get_screenshot() method to extract the image data from the banner. If the banner doesn't have a screenshot, it returns None. The shodan.helpers.get_ip() method returns the IPv4 or IPv6 address of the host. The final and most important part of the new code is the call at the end: shodan.Shodan.host(). This method looks up information in Shodan for a specific IP address (it accepts both IPv4 and IPv6) and returns an object with all the recently-seen ports and services. If you set the optional history=True parameter, the method returns not just the recently-seen information but all the historical data as well. The historical data is what lets us create a timelapse.
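
To illustrate why the helpers make the script more robust, here is a rough sketch of the kind of lookups they save us from writing by hand. The function names extract_ip() and extract_screenshot() are hypothetical, and the banner layout (an ip_str field, an optional ipv6 field, and a screenshot stored either at the top level or under opts) is an assumption made for this example rather than a statement about the library's internals.

```python
def extract_ip(banner):
    """Return the IPv6 address if the banner has one, otherwise the IPv4 string."""
    if 'ipv6' in banner:
        return banner['ipv6']
    return banner['ip_str']


def extract_screenshot(banner):
    """Return the screenshot dict, or None when the banner carries no image."""
    if banner.get('screenshot'):
        return banner['screenshot']
    # Assumed older layout: the screenshot is nested under "opts"
    return banner.get('opts', {}).get('screenshot')


# Made-up banners covering the cases we care about
ipv4_banner = {'ip_str': '198.51.100.7', 'screenshot': {'data': 'aGVsbG8='}}
ipv6_banner = {'ipv6': '2001:db8::1', 'opts': {'screenshot': {'data': 'd29ybGQ='}}}
plain_banner = {'ip_str': '198.51.100.9', 'opts': {}}

for sample in (ipv4_banner, ipv6_banner, plain_banner):
    print(extract_ip(sample), extract_screenshot(sample) is not None)
```

Calling the library helpers instead of reading these fields directly means the script doesn't need to change if the banner layout does.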

Step 3: Extracting Images

Once we have the historical data we need to look through it and grab all the available images for our timelapse GIF. And for it to act as a timelapse we need to sort the images based on the time of day that the banner was collected:

# Store all the historical screenshots for this IP
screenshots = []
for tmp_banner in host['data']:
    # Try to extract the image from the banner data
    screenshot = helpers.get_screenshot(tmp_banner)
    if screenshot:
        # Sort the images by the time they were collected so the GIF will loop
        # based on the local time regardless of which day the banner was taken.
        timestamp = arrow.get(tmp_banner['timestamp']).time()
        sort_key = timestamp.hour

        # Add the screenshot to the list of screenshots which we'll use to create the timelapse
        screenshots.append((
            sort_key,
            screenshot['data']
        ))

The Shodan.host() method returns an object where the data property contains a list of all the banners, including the historical ones. The above code loops through those banners, grabs any possible screenshot using the shodan.helpers.get_screenshot() method and then stores it in a list. Each item in the list is a tuple with 2 values:

sort_key: the key that will be used to sort the screenshots by time of day; we're using the hour that the banner was collected but you could also create a compound key using both the hour and minute.
data: the actual screenshot data (base64-encoded)

We're using the arrow Python library to parse the timestamp string on the banner into a time object which we then use as the sorting key.
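
If you want the compound key mentioned above, a sketch could look like the following. To keep the example free of third-party dependencies it swaps in the standard library's datetime.fromisoformat() for arrow, and it assumes the timestamps are ISO 8601 strings; time_of_day_key() is a hypothetical helper name.

```python
from datetime import datetime


def time_of_day_key(timestamp):
    """Return an (hour, minute) tuple for an ISO 8601 timestamp string."""
    parsed = datetime.fromisoformat(timestamp)
    return (parsed.hour, parsed.minute)


# Made-up timestamps taken on different days
timestamps = [
    '2021-03-02T18:45:10.123456',
    '2021-03-01T06:15:00.000000',
    '2021-03-03T06:05:30.500000',
]

# The date is ignored, so the banners line up by time of day only
ordered = sorted(timestamps, key=time_of_day_key)
for item in ordered:
    print(item)
```

Because Python compares tuples element by element, two banners collected in the same hour are ordered by their minute.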

Step 4: Creating the GIF

The final step is to dump all the images into individual files and then use the ImageMagick convert command to merge them into a GIF. First, let's dump the screenshots into JPG files:

# Only generate a GIF if there have been 3 or more screenshots in the history
if len(screenshots) >= 3:
    # screenshots is a list where each item is a tuple of:
    # (sort key, screenshot in base64 encoding)
    # 
    # Let's sort that list based on the sort key and then use Python's enumerate
    # to generate sequential numbers for the temporary image filenames
    for (i, screenshot) in enumerate(sorted(screenshots, key=lambda x: x[0], reverse=True)):
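        # Create a temporary image file; the screenshot is base64-encoded, so
        # decode it to raw bytes first (needs "import base64" at the top of the script)
        # TODO: don't assume that all images are "jpg", use the mimetype instead
        with open('/tmp/gif-image-{}.jpg'.format(i), 'wb') as fout:
            fout.write(base64.b64decode(screenshot[1]))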

enumerate() is a built-in Python function that accepts any iterable and yields (index, value) tuples, where index is the position of the value in the sequence; it's an easy way to create a sequential number. We're also passing sorted() a key function and reverse=True so the screenshots are sorted by our sort_key in reverse order.

For an IP with 3 screenshots the above code would generate the following files:

/tmp/gif-image-0.jpg
/tmp/gif-image-1.jpg
/tmp/gif-image-2.jpg
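
One thing to keep in mind is that shell wildcards expand in lexicographic order, so with ten or more frames gif-image-10.jpg would be picked up before gif-image-2.jpg. The following sketch shows one possible way around that by zero-padding the frame number. The write_frames() helper and the use of a temporary directory are assumptions for this example and not part of the script above.

```python
import base64
import os
import tempfile


def write_frames(screenshots, directory):
    """Decode (sort_key, base64 data) tuples into zero-padded JPG files.

    Returns the file paths in playback order.
    """
    paths = []
    ordered = sorted(screenshots, key=lambda item: item[0], reverse=True)
    for i, (_, data) in enumerate(ordered):
        # {:03d} pads the index so lexicographic and numeric order agree
        path = os.path.join(directory, 'gif-image-{:03d}.jpg'.format(i))
        with open(path, 'wb') as fout:
            fout.write(base64.b64decode(data))
        paths.append(path)
    return paths


# Placeholder bytes stand in for real image data
sample = [
    (6, base64.b64encode(b'morning').decode('ascii')),
    (18, base64.b64encode(b'evening').decode('ascii')),
    (12, base64.b64encode(b'noon').decode('ascii')),
]
frame_dir = tempfile.mkdtemp()
frames = write_frames(sample, frame_dir)
print([os.path.basename(path) for path in frames])
```

Three digits leave room for up to 1000 frames, which should be plenty for a timelapse.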

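Now that we have the images it's time to generate the actual GIF. The speed of the GIF depends on the -delay parameter, which takes the form ticksxticks-per-second: 5x10 shows each frame for 5 ticks at 10 ticks per second, or half a second: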

    # Create the actual GIF using the ImageMagick "convert" command
    # The resulting GIFs are stored in the local data/ directory
    os.system('convert -layers OptimizePlus -delay 5x10 /tmp/gif-image-*.jpg -loop 0 +dither -colors 256 -depth 8 data/{}.gif'.format(ip))

    # Clean up the temporary files
    os.system('rm -f /tmp/gif-image-*.jpg')

    # Show a progress indicator
    print('GIF created for {}'.format(ip))

All the hard work has been done - the above code simply calls the convert command to merge all the images into a timelapse GIF and then deletes the temporary files. The resulting GIF is stored in the local data/ directory which we created at the start of the script.
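
If you would rather not build a shell command string, the same convert call can be sketched with the subprocess module. This is an alternative under stated assumptions rather than what the script above does: build_convert_command() and create_gif() are hypothetical helpers, and the frames are expanded with glob in Python instead of by the shell.

```python
import glob
import subprocess


def build_convert_command(frame_paths, output_path):
    """Return the argument list for the ImageMagick call used in this tutorial."""
    return (['convert', '-layers', 'OptimizePlus', '-delay', '5x10']
            + list(frame_paths)
            + ['-loop', '0', '+dither', '-colors', '256', '-depth', '8', output_path])


def create_gif(frame_pattern, output_path):
    """Merge the frames matching frame_pattern into output_path.

    Returns False when no frames match, True once convert has finished.
    """
    frame_paths = sorted(glob.glob(frame_pattern))
    if not frame_paths:
        return False
    # check=True raises CalledProcessError if convert exits with a non-zero status
    subprocess.run(build_convert_command(frame_paths, output_path), check=True)
    return True


# Inspect the command without running ImageMagick
command = build_convert_command(
    ['/tmp/gif-image-0.jpg', '/tmp/gif-image-1.jpg'], 'data/198.51.100.7.gif')
print(' '.join(command))
```

Passing the arguments as a list avoids shell quoting issues, and check=True surfaces a failed conversion as an exception instead of silently continuing.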

Putting Everything Together

Here is what the complete script looks like:

#!/usr/bin/env python
# gifcreator.py
#
# Dependencies:
# - arrow
# - shodan
#
# Installation:
# pip install arrow shodan
# sudo apt-get install imagemagick
#
# Usage:
# 1. Download a json.gz file using the website or the Shodan command-line tool (https://cli.shodan.io).
#    For example:
#        shodan download --limit=100 screenshots.json.gz has_screenshot:true
# 2. Run the tool on the file:
#        python gifcreator.py screenshots.json.gz

import arrow
import base64
import os
import shodan
import shodan.helpers as helpers
import sys


# Settings
API_KEY = ''

# The user has to provide at least 1 Shodan data file
if len(sys.argv) < 2:
    print('Usage: {} <shodan-data.json.gz> ...'.format(sys.argv[0]))
    sys.exit(1)

# GIFs are stored in the local "data" directory; create it if it doesn't exist yet
os.makedirs('data', exist_ok=True)

# Setup the Shodan API object
api = shodan.Shodan(API_KEY)

# Loop over all of the Shodan data files the user provided
for banner in helpers.iterate_files(sys.argv[1:]):
    # See whether the current banner has a screenshot; if it does, look up
    # more information about this IP
    has_screenshot = helpers.get_screenshot(banner)
    if has_screenshot:
        ip = helpers.get_ip(banner)
        print('Looking up {}'.format(ip))
        host = api.host(ip, history=True)

        # Store all the historical screenshots for this IP
        screenshots = []
        for tmp_banner in host['data']:
            # Try to extract the image from the banner data
            screenshot = helpers.get_screenshot(tmp_banner)
            if screenshot:
                # Sort the images by the time they were collected so the GIF will loop
                # based on the local time regardless of which day the banner was taken.
                timestamp = arrow.get(tmp_banner['timestamp']).time()
                sort_key = timestamp.hour

                # Add the screenshot to the list of screenshots which we'll use to create the timelapse
                screenshots.append((
                    sort_key,
                    screenshot['data']
                ))

        # Extract the screenshots and turn them into a GIF if we've got more than a few images
        if len(screenshots) >= 3:
            # screenshots is a list where each item is a tuple of:
            # (sort key, screenshot in base64 encoding)
            #
            # Let's sort that list based on the sort key and then use Python's enumerate
            # to generate sequential numbers for the temporary image filenames
            for (i, screenshot) in enumerate(sorted(screenshots, key=lambda x: x[0], reverse=True)):
                # Create a temporary image file; the screenshot is base64-encoded,
                # so decode it to raw bytes first
                # TODO: don't assume that all images are "jpg", use the mimetype instead
                with open('/tmp/gif-image-{}.jpg'.format(i), 'wb') as fout:
                    fout.write(base64.b64decode(screenshot[1]))
            
            # Create the actual GIF using the  ImageMagick "convert" command
            # The resulting GIFs are stored in the local data/ directory
            os.system('convert -layers OptimizePlus -delay 5x10 /tmp/gif-image-*.jpg -loop 0 +dither -colors 256 -depth 8 data/{}.gif'.format(ip))

            # Clean up the temporary files
            os.system('rm -f /tmp/gif-image-*.jpg')

            # Show a progress indicator
            print('GIF created for {}'.format(ip))

Let's give it a shot and see how it runs:

The full code is also available on GitHub: https://gist.github.com/achillean/7190d53841865a8c9978f59863669857