Tuesday, May 15, 2018

welcome back...to war! "Not that I'm currently cruising for jobs with British intelligence or anything, but I happened upon (via Hacker News) this current coding challenge posted to the MI5 careers page...."
Prerequisites: Assuming you've already downloaded and installed Python, you should do two things. One: spend 10 minutes doing this "Hello, World" Python for non-programmers tutorial. Two: spend another five minutes doing this tutorial on using Python modules.

0.0) Install Pillow

The active version of PIL is actually known as Pillow, so this is what we need to install. You should do this with the Python package manager pip, which is covered in the second prerequisite tutorial above. Just:
pip install pillow
Now, create a new Python script in whatever text editor you like. I'm using Sublime Text, which is great. I called my script metaread.py.

1.0) Create an Image object

First thing we're going to do is actually bring in the Pillow module we installed, which is the first line below. Next, we need to create an object representation of our MI5 image, puzzle.png. This exposes the image and all of the things we can do with it via the Pillow module to our Python script. To see some more of these capabilities, check out Hack This: Edit an Image in Python.
from PIL import Image
image = Image.open("puzzle.png")
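To get a feel for what the Image object exposes, here's a minimal sketch. It builds a tiny stand-in PNG in a temporary folder so it runs without any downloads; the file name demo.png and the helper describe_image are my own hypothetical names, not part of the MI5 puzzle. Swap in puzzle.png to inspect the real thing.

```python
import os
import tempfile

from PIL import Image


def describe_image(path):
    """Open an image and return a few basic properties (hypothetical helper)."""
    with Image.open(path) as image:
        return {"format": image.format, "size": image.size, "mode": image.mode}


# Build a small stand-in image so the example is self-contained.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "demo.png")
Image.new("RGB", (64, 32), color=(0, 0, 255)).save(demo_path)

summary = describe_image(demo_path)
print(summary)
```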

2.0) Extract the Exif data

Not all image formats contain Exif data. Mostly just JPGs. Which is fine because that's most pictures. MI5's image is actually a .PNG file, which we'll have to handle somewhat differently. Let's do a quick JPG though.
There's really nothing to it. I create the image object as above, then call the _getexif() method on it. In return, I get a dictionary data structure full of metadata (or None, if the image has no Exif data at all).
The dictionary consists of tag-value pairs, which we can extract and view using a for-loop, like this. Note that I had to import some extra stuff at the top:
from PIL import Image
from PIL.ExifTags import TAGS, GPSTAGS
image = Image.open("gpsample.jpg")
print(image)
# _getexif() returns a dictionary mapping numeric tag IDs to values
info = image._getexif()
for tag, value in info.items():
    # Swap the numeric tag ID for a readable name where one is known
    key = TAGS.get(tag, tag)
    # str(key) because unknown tags stay as integers
    print(str(key) + " " + str(value))
So, that just outputs all of the Exif data contained within a given image as a series of entries. It's hardly guaranteed to be the same for every image. I had to search online for a sample image containing GPS metadata because I got tired of scanning through everything on my computer trying to find an example (though it wouldn't be too hard to write a script that could comb through a folder of images and automatically pull out those that do include it). In any case, you can find the same image here.
A sampling of the output:
GPSInfo {0: '\x00\x00\x02\x02', 1: u'S', 2: ((33, 1), (51, 1), (2191, 100)), 3: u'E', 4: ((151, 1), (13, 1), (1173, 100)), 5: '\x00', 6: (0, 1)}
ISOSpeedRatings 100
ResolutionUnit 2
WhiteBalance 0
GainControl 0
BrightnessValue (100, 10)
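Those GPSInfo values are stored as (numerator, denominator) pairs for degrees, minutes, and seconds, which isn't much use for pasting into a map. Here's a rough sketch of turning them into decimal degrees. The dictionary is copied by hand from the sample output above, and to_decimal_degrees is a hypothetical helper of my own; other Pillow versions may hand these values back in a different form, so treat the tuple layout as an assumption.

```python
# GPS entries copied from the sample output above, keyed by numeric GPS tag ID.
gps_info = {
    1: "S",                               # GPSLatitudeRef
    2: ((33, 1), (51, 1), (2191, 100)),   # GPSLatitude
    3: "E",                               # GPSLongitudeRef
    4: ((151, 1), (13, 1), (1173, 100)),  # GPSLongitude
}


def to_decimal_degrees(dms, ref):
    """Convert three (numerator, denominator) pairs (deg, min, sec) to decimal degrees."""
    degrees, minutes, seconds = (num / den for num, den in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    # Southern and western hemispheres are negative by convention.
    return -decimal if ref in ("S", "W") else decimal


latitude = to_decimal_degrees(gps_info[2], gps_info[1])
longitude = to_decimal_degrees(gps_info[4], gps_info[3])
print(round(latitude, 6), round(longitude, 6))
```

For the sample image this works out to roughly 33.856 degrees south, 151.220 degrees east.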

2.1) Extract non-Exif data

Again, PNGs generally don't come with Exif data.
Don't panic. Just because it's not in Exif format doesn't mean that puzzle.png's metadata is all that more difficult to access.
It so happens that when an image is loaded per step 1.0, Pillow will automatically load up a dictionary, image.info, with whatever metadata it can identify. We can barf it all out to the screen with a simple print statement:
print(image.info)
Or we can loop through it, much as in 2.0. The keys here are already plain names, so there's no need to look them up in TAGS:
for key, value in image.info.items():
    print(str(key) + " " + str(value))
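If you'd like to see the info dictionary pick up a PNG text entry without relying on the puzzle file, here's a small round-trip sketch. It writes a stand-in PNG with one made-up Comment entry using Pillow's PngInfo class, then reopens it. The file name and the text are my own inventions, and this only shows the simple case; it says nothing about why puzzle.png behaves differently.

```python
import os
import tempfile

from PIL import Image
from PIL.PngImagePlugin import PngInfo

# Write a stand-in PNG carrying one text entry (file name and text are made up).
metadata = PngInfo()
metadata.add_text("Comment", "hello from a text chunk")

demo_path = os.path.join(tempfile.mkdtemp(), "demo_text.png")
Image.new("RGB", (8, 8)).save(demo_path, pnginfo=metadata)

# Reopen the file: the text entry shows up in the info dictionary.
with Image.open(demo_path) as image:
    png_info = dict(image.info)

for key, value in png_info.items():
    print(str(key) + " " + str(value))
```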
Problem solved?
So, at this point I need to confess that this .info attribute is not actually returning all of the metadata from puzzle.png, and I don't quite know why. In addition to regular old Photoshop and the ExifRead Python tool mentioned above, I also tried four different online metadata extraction tools and only one was able to return a complete listing: Jeffrey Friedl's Image Metadata Viewer. Said viewer is based on a command-line tool called ExifTool, which I downloaded and ran. It too worked.
But I promised Python and Python we shall write. It's actually pretty easy to run a command-line program from within Python, but you'll still have to download the actual command line program, which is available here. Now, we can run this script on our image file, and the ExifTool will output the result via Python to the screen. Try it.
import os
os.system('exiftool -h puzzle.png')
See the clue?
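os.system just dumps ExifTool's output to the screen. If you'd rather get that output back into Python as a string, something like the following sketch should work, assuming exiftool is installed and on your PATH. The run_exiftool helper is a hypothetical name of my own, and I've left off the -h flag so the output is plain text rather than HTML.

```python
import shutil
import subprocess


def run_exiftool(image_path):
    """Run ExifTool on a file and return its text output, or None if it is unavailable."""
    if shutil.which("exiftool") is None:
        return None  # ExifTool is not installed or not on the PATH
    result = subprocess.run(
        ["exiftool", image_path],
        capture_output=True,
        text=True,
    )
    return result.stdout


output = run_exiftool("puzzle.png")
if output is None:
    print("exiftool not found; install it first")
else:
    for line in output.splitlines():
        # Plain-text ExifTool output is one "Tag Name : value" pair per line.
        print(line)
```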
I don't know why it was so difficult to pull metadata from this file. It may have something to do with how metadata in PNG files is laid out. Within the file, metadata is kept in data structures called chunks. Chunks are given weird four-letter coded names that define, among other things, whether they should be considered "critical" or not. Critical chunks include actual image data, bit depth, and color palette. Non-critical (ancillary) chunks offer histograms, gamma values, default background colors, and, finally, text. There are three different types of text chunks (tEXt, zTXt, and iTXt), all with a standard dictionary entry format. Each text entry has a name or title, and then some associated text. They can be user-defined, but there are some text field types that come predefined, such as "Comment," which in our MI5 file contains this:
https://motherboard.vice.com/en_us/article/aekn58/hack-this-extra-image-metadata-using-python
What secrets are your JPGs hiding?
MOTHERBOARD.VICE.COM
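For the curious, the PNG chunk layout described above can be walked with nothing but the standard library. This is a minimal sketch, not a full PNG parser: it builds a 1x1 stand-in image by hand (with a made-up Comment entry, not the real clue), lists each chunk, and reads only plain tEXt entries, ignoring the compressed zTXt and international iTXt variants. All function names here are my own.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def make_chunk(chunk_type, data):
    """Build one PNG chunk: length, type, data, CRC-32 of type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def iter_chunks(png_bytes):
    """Yield (type, data) for every chunk in a PNG byte string."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    offset = 8
    while offset < len(png_bytes):
        (length,) = struct.unpack(">I", png_bytes[offset:offset + 4])
        chunk_type = png_bytes[offset + 4:offset + 8]
        data = png_bytes[offset + 8:offset + 8 + length]
        yield chunk_type, data
        offset += 12 + length  # 4 length + 4 type + data + 4 CRC


def read_text_chunks(png_bytes):
    """Collect uncompressed tEXt entries as a {keyword: text} dictionary."""
    entries = {}
    for chunk_type, data in iter_chunks(png_bytes):
        if chunk_type == b"tEXt":
            keyword, _, text = data.partition(b"\x00")
            entries[keyword.decode("latin-1")] = text.decode("latin-1")
    return entries


# Hand-built 1x1 greyscale PNG with a made-up Comment entry, for illustration only.
ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
demo_png = (
    PNG_SIGNATURE
    + make_chunk(b"IHDR", ihdr)
    + make_chunk(b"tEXt", b"Comment\x00not the real clue")
    + make_chunk(b"IDAT", zlib.compress(b"\x00\x00"))
    + make_chunk(b"IEND", b"")
)

for chunk_type, data in iter_chunks(demo_png):
    # An uppercase first letter marks a critical chunk; lowercase means ancillary.
    kind = "critical" if chunk_type[:1].isupper() else "ancillary"
    print(chunk_type.decode("ascii"), len(data), kind)

print(read_text_chunks(demo_png))
```

To try it on a real file, read the bytes with open("puzzle.png", "rb").read() and pass them to read_text_chunks.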
