Jan 8, 2007

The Secret Project

My secret project is finally ready to be revealed! Behold, this xkcd comic translated into a Google Maps map! When I first read that comic and the associated blag post, I knew the idea had a lot of potential for additional geekiness. Oliver had been messing with the Google Maps API the previous few days for an unrelated project, so turning the comic into a map immediately sprang to mind.

The details of exactly what the map represents are explained in the blag post, but the short version is that this is a map of the IPv4 address space. Each IP address is treated as an integer from 0 to 4,294,967,295 and mapped onto a 16th-order Hilbert curve, which fills a 65,536 x 65,536 grid. The advantage of using Hilbert curves is that any aligned block of 4^n addresses maps to a square and any aligned block of 2^n addresses maps to a rectangle.
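
For the curious, here is a minimal sketch of that mapping. It is not the code behind the map; it uses the standard iterative distance-to-coordinates conversion for Hilbert curves, and the names d2xy, ip_to_xy, and block_extent are made up for this example. The orientation it produces is an assumption too and may not match the map's.

```python
import ipaddress


def d2xy(order, d):
    """Convert distance d along a Hilbert curve of the given order to (x, y)."""
    side = 1 << order
    x = y = 0
    s = 1
    while s < side:
        rx = 1 & (d // 2)
        ry = 1 & (d ^ rx)
        if ry == 0:
            # Rotate (and possibly flip) the quadrant built so far.
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x
        x += s * rx
        y += s * ry
        d //= 4
        s *= 2
    return x, y


def ip_to_xy(address, order=16):
    """Place a dotted-quad IPv4 address on a Hilbert curve of the given order.

    At order 16 every address gets its own cell; at lower orders the low
    bits are dropped, so each cell covers a block of addresses.
    """
    value = int(ipaddress.IPv4Address(address))
    return d2xy(order, value >> (32 - 2 * order))


def block_extent(start, size, order):
    """Return (width, height) of the cells covered by a run of curve positions."""
    xs, ys = zip(*(d2xy(order, d) for d in range(start, start + size)))
    return max(xs) - min(xs) + 1, max(ys) - min(ys) + 1
```

Calling block_extent(32, 16, 4) on a small 4th-order curve gives a 4 x 4 square for an aligned block of 16 cells, and an aligned block of 8 cells comes out as a 2 x 4 rectangle one way round or the other.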

To build the maps, I wrote a Python program that reads an input file describing everything that should show up on the map and then builds the images. Each record in the file contains a name, an importance, a range of zoom levels to appear at, a background color, and one or more address blocks in CIDR notation. So, for instance, a record might look like "HP,2,2,,company,,," or "Multicast,-1,-1,,metaspace,,". The file can also include color aliases, so that I can write "registrar" instead of "#f5eecc".
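
Two pieces of that record handling are easy to sketch with the standard library: turning a CIDR block into a run of integer addresses, and resolving a color alias. This is only an illustration, not the program's actual parser. The function names and the example block 10.0.0.0/8 are made up for the sketch; the "registrar" alias is the one mentioned above.

```python
import ipaddress

# Alias table; only the "registrar" entry comes from the post itself.
COLOR_ALIASES = {"registrar": "#f5eecc"}


def cidr_to_range(block):
    """Return (first_address_as_int, number_of_addresses) for a CIDR block."""
    net = ipaddress.ip_network(block)
    return int(net.network_address), net.num_addresses


def resolve_color(name, aliases=COLOR_ALIASES):
    """Turn an alias or a literal "#rrggbb" string into an (r, g, b) tuple."""
    hex_color = aliases.get(name, name).lstrip("#")
    return tuple(int(hex_color[i:i + 2], 16) for i in (0, 2, 4))


start, count = cidr_to_range("10.0.0.0/8")  # a /8 covers 2**24 addresses
print(start, count, resolve_color("registrar"))
```

Because a CIDR block is always aligned to its own size, the resulting run of addresses is exactly the kind of aligned block that lands on a square or rectangle of the curve.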

The program uses the Python Imaging Library to do the drawing. For each zoom level, it begins with a 16k x 16k image and draws all the blocks on that, then resizes it to the proper resolution and tiles it into 256 x 256 images for use by Google Maps. I originally wanted to use a 65k x 65k original image, because then each address maps to a unique pixel, but in 24-bit color that comes out to 12 GB of raw image data. Not entirely surprisingly, trying to allocate that much memory on a 32-bit system causes Python to crash. On the machine I run it on, it takes about half an hour to render all five zoom levels, with about 90% of that time spent downsizing the most zoomed-in level.
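
The memory arithmetic and the tiling step can be sketched in a few lines of plain Python. This is a back-of-the-envelope illustration rather than the rendering code: raw_bytes and tile_boxes are hypothetical helper names, and the boxes use the (left, upper, right, lower) convention that PIL's Image.crop accepts.

```python
def raw_bytes(side, bytes_per_pixel=3):
    """Size in bytes of an uncompressed square image (24-bit color by default)."""
    return side * side * bytes_per_pixel


def tile_boxes(side, tile=256):
    """Yield (left, upper, right, lower) crop boxes covering a square image."""
    for top in range(0, side, tile):
        for left in range(0, side, tile):
            yield (left, top, left + tile, top + tile)


GIB = 2 ** 30
print(raw_bytes(65536) / GIB)   # the 65k x 65k image, in GiB
print(raw_bytes(16384) / GIB)   # the 16k x 16k image, in GiB
print(sum(1 for _ in tile_boxes(1024)))  # tiles in a 1024 x 1024 image
```

The 16k x 16k canvas needs a sixteenth of the memory of the 65k x 65k one, which is what brings it within reach of a 32-bit process.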

The data is obtained through a combination of DNS and the various whois databases, the same as the original. In looking up the address information for the organizations I put on the map, I discovered a lot of interesting bits of trivia. For instance, neither Google nor Yahoo! is assigned a block that's /16 or larger. On the other hand, despite being assigned a /8 block, IBM has many other sizable address blocks scattered all over the IPv4 address space.

Suggestions for other landmarks to include or other improvements are welcome.