SPE '02 -- Image Mosaics -- the Process and Program

the Process and Program

I got off to a slow start, spending the first week trying to figure out OpenGL and my advisor's header files. As practice, I modified the COS 126 Mandelbrot assignment to produce the image directly, instead of through postscript output.

At the same time, I read through online material about other mosaic programs. I already had a basic notion of what I needed to do, so I wrote a function for dividing a loaded target image into square tiles, subdivide(). The next logical action was to calculate R, G, and B values for each tile. In my first algorithm (for an explanation of first vs. second, see Results), I used get_pixel() from the Image header to store the average R, G, and B values for each tile in a two-dimensional stack through pixelate(). This stack utilized an array occupying 3 bytes per tile. My second algorithm discarded the stack and simply created a two-dimensional array using (16 * 3 bytes) per tile.

the Image Library

At about this time I discovered the "cheat" facet of Range's program. I decided I did not want to use luminosity adjustment or overlay to make the overall image produced by the mosaic more obvious to the viewer. I wanted it to be straightforward and direct. The best way to do this it seemed, was to divide the target into a large number of tiles. In my reading I discovered that many (if not most) accurate mosaics consisted of at least 1000 tile images, so I added that restriction to subdivide(). I looked around online for image libraries. Further reading impressed the importance of quantity of tile images as well as detail. I settled on a library of 3602 LP and CD covers, downloaded from a webpage promoting the MAZAIKA mosaic-making software. These I converted, using perl script and the "convert" program provided by my advisor, from .jpg to 24x24 .ppm files.

I wrote another short piece of perl script to print the pathname for every single tile image to a .txt file. This .txt file was used in the library program, the output of which is used by the mosaic making program. The library program loaded each tile image, calculated its average R, G, and B values, and printed these to another .txt file as well as the image's pathname.

The mosaic program reads in the .txt file outputted by the library program and loads RGB information and image filename into an array, by set_array(). As you can imagine, this is a large array. Using my first algorithm (see Results), this array was 3 unsigned chars + 1 string of 35 characters per tile image. That's ((3 * 1 byte) + (35 * 1 byte)) * (3602 images) or 136876 bytes. The second algorithm resulted in an array allocation of 298966 bytes. (Again, for an explanation, see results.)

After all this information processing, it came time to actually compare tiles and tile images. The simplest method would be to find absolute differences between R, G, and B values and choose the smallest difference (or color distance). To consolidate into one color distance, the tried and true Euclidean method involving least squares is a likely option. However, human sight is prone to "imperfections" not easily describable by such simple math. Therefore, the colors in a tile image chosen by Euclid's formula would not necessarily appear to the human eye as the closest match for a color in a target image. The modification used by many mosaic programs (and my own) is described by the CompuPhase Colour Metric.

To store these comparisons, I set up another array, through compare(), of structs I called MATCH, which stored the "number" of the tile in question, the tile image's filename, and the Quality of Fit (QoF) as determined by the adjusted Euclidean formula. This array is even larger than the library array, being ((((1 * 4 bytes) + (1 * 8 bytes) + (35 * 1 byte)) * 3602 images) * num_tiles) or (169294 bytes * num_tiles), where num_tiles is the number of tiles (greater than or equal to 1000) determined by subdivide() earlier in the program. A modified version of Quicksort then sorts every 3602 elements of this array by QoF.

The function choose_best() then cut this array down to "num_tiles" elements. The first version of choose_best() left them as MATCH types, whereas the second strictly left strings (the images' filenames). Then the mosaic was created and displayed by lay_tiles(), which would call get_pixel() and create a two dimensional stack for each tile image and then pop the stack for each color necessary to set_pixel().

Return to "Title" page

Code Snippets | Sources | email me