COS 323 - Computing for the Physical and Social Sciences
Spring 2005
Overview:
In this assignment, you will implement a simple version of "image mosaicing", which takes two images captured from the same point but in different directions and produces a single image with the two "stitched together". This requires reading in the original images, constructing a function that measures how similar the images are when one is shifted by a given amount, optimizing that function to determine how far to shift one image relative to the other, and finally producing the output image. We restrict the transformations applied to the images to translation (sliding) by an integer number of pixels. You should use Matlab for this assignment.
Matlab is available on several OIT machines (look here for details). You should be able to run it remotely from any Unix workstation, and it is available in many clusters on campus.
Alternatively, you might be able to install it on a campus machine - look here for details.
Read through any or all of the following for a basic introduction to Matlab
Images in the computer are represented as rectangular arrays of "pixels", each of which has a color and an intensity expressed as the amounts of red, green, and blue that are combined there. A common representation uses a single byte (8 bits) for each color channel, meaning that each of R, G, and B can range from 0 to 255. Matlab can load images stored in many file types, such as JPG. Once a color image is converted to grayscale, it can be treated as a matrix and all of Matlab's operations can be applied to it.
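As an illustration of the representation described above, here is a minimal sketch that builds a tiny synthetic RGB image and reduces it to a grayscale matrix. The array contents and variable names are made up for the example, and the weights are the luminance weights documented for rgb2gray (written out by hand here so the sketch does not depend on that function being installed).

```matlab
% Tiny synthetic RGB image: 2 rows x 3 columns x 3 channels, 8 bits per channel.
% With a real file you would instead write something like:
%   rgb = imread('photo.jpg');    % hypothetical file name
rgb = zeros(2, 3, 3, 'uint8');
rgb(:, :, 1) = 200;   % red channel
rgb(:, :, 2) = 100;   % green channel
rgb(:, :, 3) = 50;    % blue channel

% Convert to double before doing arithmetic: uint8 math saturates at 0 and 255.
r = double(rgb(:, :, 1));
g = double(rgb(:, :, 2));
b = double(rgb(:, :, 3));

% Weighted sum of the three channels gives one intensity per pixel.
gray = 0.2989 * r + 0.5870 * g + 0.1140 * b;

% gray is now an ordinary 2 x 3 matrix of doubles.
[nrows, ncols] = size(gray);
```

Working in double rather than uint8 matters later in the assignment, since differences of pixel values can be negative.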
Work through the following tasks using an image of your choice. You do not need to submit any results, but make sure you are comfortable doing the following.
Note: if your version of Matlab doesn't have the rgb2gray function, download rgb2gray.m. Place this in your working directory, and it should be auto-loaded by Matlab.
matrix2 = matrix1(row_min:row_max,col_min:col_max);
Indices in Matlab are 1-based (not 0-based as in C).
[var1, var2] = func(x)
Hint #3: In Matlab, the number of rows is the first dimension and the number of columns is the second.
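The two syntax hints above can be tried on any small matrix. The following sketch uses magic(6) as a stand-in for an image; the variable names are only for illustration.

```matlab
A = magic(6);                  % 6 x 6 test matrix standing in for an image

% Multiple return values: rows come first, columns second.
[nrows, ncols] = size(A);

% Sub-matrix of rows 2..4 and columns 3..6. Ranges are inclusive and 1-based.
B = A(2:4, 3:6);

% Another function with two outputs: the largest value and its linear index.
[bigval, idx] = max(A(:));
```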
If you get stuck on any of these, feel free to ask for help, either by emailing smr@princeton.edu or by asking a colleague. (Just to be clear, working together to learn Matlab is encouraged, but collaboration on the rest of the assignment is not allowed.)
Ultimately, we will be doing an optimization to find out how much one image should be moved relative to the other. To do this, we will define an objective function that measures the mean squared difference of pixel values between image 1 and the translated version of image 2, in the region where the images overlap. Minimizing this function will find the translation that makes the images as similar as possible.
Write a Matlab function that takes as input:
Test your algorithm on some real and synthetic images, passing in candidate translations by hand and making sure the results are reasonable.
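One possible shape for such an objective, together with a synthetic check, is sketched below. It is not the required interface: it assumes both images have the same size, and it assumes the sign convention that pixel (r, c) of image 1 is compared with pixel (r - dy, c - dx) of image 2. The names rows_in, sqdiff, and msd are hypothetical, and anonymous functions are used only so the sketch runs as a plain script.

```matlab
% Indices along one dimension of image 1 that still overlap image 2
% after a shift of d pixels (n is the image size along that dimension).
rows_in = @(n, d) max(1, 1 + d) : min(n, n + d);

% Squared differences over the overlap region only.
sqdiff = @(a, b, dx, dy) ...
    (a(rows_in(size(a, 1), dy), rows_in(size(a, 2), dx)) - ...
     b(rows_in(size(a, 1), dy) - dy, rows_in(size(a, 2), dx) - dx)) .^ 2;

% Mean squared difference: a single number for a candidate (dx, dy).
msd = @(a, b, dx, dy) mean(mean(sqdiff(a, b, dx, dy)));

% Synthetic test: cut two windows out of one random "scene", so the true
% translation is known in advance (3 columns, 2 rows).
scene = rand(40, 50);
img1 = scene(1:30, 1:40);
img2 = scene(3:32, 4:43);      % img2(r, c) equals img1(r + 2, c + 3)

cost_right = msd(img1, img2, 3, 2);   % correct shift: difference vanishes
cost_wrong = msd(img1, img2, 0, 0);   % wrong shift: clearly positive
```

Because the mean is taken only over the overlap, the value stays comparable between shifts that overlap by different amounts; a plain sum would favor shifts with tiny overlaps.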
To start out, assume that the images differ only by translation in x. Implement a 1-dimensional optimization based on Golden Section search. Compared to the basic algorithm, there are two wrinkles to be aware of:
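As a reference point, the basic Golden Section algorithm, before either wrinkle is handled, might be sketched as follows. The test function f, the bracket [a, b], and the tolerance are made up for illustration; the sketch works on a continuous variable and does not attempt the adaptations the assignment asks for.

```matlab
f = @(x) (x - 2).^2 + 1;       % hypothetical unimodal function, minimum at x = 2
a = -5;  b = 10;               % initial bracket containing the minimum
phi = (sqrt(5) - 1) / 2;       % golden ratio conjugate, about 0.618
tol = 1e-6;

% Two interior points, placed symmetrically inside the bracket.
x1 = b - phi * (b - a);  f1 = f(x1);
x2 = a + phi * (b - a);  f2 = f(x2);

while (b - a) > tol
    if f1 < f2
        % Minimum lies in [a, x2]: drop the right piece, old x1 becomes new x2.
        b = x2;  x2 = x1;  f2 = f1;
        x1 = b - phi * (b - a);  f1 = f(x1);
    else
        % Minimum lies in [x1, b]: drop the left piece, old x2 becomes new x1.
        a = x1;  x1 = x2;  f1 = f2;
        x2 = a + phi * (b - a);  f2 = f(x2);
    end
end

xmin = (a + b) / 2;
```

The point of the golden ratio spacing is that one of the two interior points can be reused after every shrink, so each iteration costs only one new function evaluation.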
Now, assume that the images may be translated in both x and y. You will find the optimal alignment between them using the "Taxi Cab" or successive relaxation method. In this method, you first do a 1D minimization in x while holding y constant, then minimize in y while holding x constant, and repeat until convergence. While this is not a great method in general (it is particularly prone to "zig-zagging" along valleys), it has the advantage of being simple to implement (you can reuse your 1D minimizer from above) and will work well for images with primarily horizontal and vertical features. For extra credit, read up on Powell's method in Numerical Recipes, and implement it (this will also require the ability to do 1D optimization in a "diagonal" direction that isn't just x or y).
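The alternation itself can be sketched on a smooth toy function. This is only an illustration of the control flow: the objective f, the bounds, and the tolerance are invented, and Matlab's built-in fminbnd stands in for the 1D minimizer, whereas the assignment expects you to reuse your own Golden Section routine on the image objective.

```matlab
% Hypothetical 2D objective with a tilted valley; minimum near (3.733, -2.933).
f = @(x, y) (x - 3).^2 + (y + 2).^2 + 0.5 * x .* y;

x = 0;  y = 0;                 % starting guess
lo = -10;  hi = 10;            % search bounds for each 1D minimization
tol = 1e-3;

for iter = 1:100
    xold = x;  yold = y;
    x = fminbnd(@(t) f(t, y), lo, hi);   % minimize in x with y held constant
    y = fminbnd(@(t) f(x, t), lo, hi);   % minimize in y with x held constant
    if abs(x - xold) + abs(y - yold) < tol
        break;                           % neither coordinate moved: converged
    end
end
```

The cross term 0.5*x*y is what makes several sweeps necessary; without it, one pass in x and one in y would land on the minimum directly.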
Once you know the optimal alignment between image 1 and image 2, write a function that produces an output image and writes it out (using imwrite). Assuming the initial images are both of size (sx,sy) and the optimal translation was (dx,dy), the output image will have dimensions (sx+|dx|,sy+|dy|). The pixels should be taken as follows:
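The bookkeeping for the output canvas can be sketched as below, under several loudly stated assumptions: images are stored as rows by columns, so the array is allocated as (rows + |dy|) by (columns + |dx|); pixel (r, c) of image 1 lines up with pixel (r - dy, c - dx) of image 2; and image 1 simply overwrites image 2 where they overlap, which is an arbitrary choice made for this sketch and not the assignment's pixel rule. The constant-valued images are placeholders.

```matlab
img1 = ones(4, 5);             % hypothetical image 1: 4 rows x 5 columns of ones
img2 = 2 * ones(4, 5);         % hypothetical image 2: same size, all twos
dx = 2;  dy = -1;              % assumed optimal translation (columns, rows)

[nr, nc] = size(img1);
out = zeros(nr + abs(dy), nc + abs(dx));   % uncovered corners stay 0

% Offset of each image's top-left corner inside the canvas. Exactly one
% image is pushed away from the origin along each axis.
r1 = max(0, -dy);  c1 = max(0, -dx);       % image 1
r2 = max(0,  dy);  c2 = max(0,  dx);       % image 2

out(r2 + (1:nr), c2 + (1:nc)) = img2;
out(r1 + (1:nr), c1 + (1:nc)) = img1;      % assumption: image 1 wins in overlap

% For real 0..255 data the result could then be saved with, for example:
%   imwrite(uint8(out), 'mosaic.png');     % hypothetical file name
```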
Several test images will be provided here in the near future, or you are welcome to use your own (hint: be careful not to tilt the camera between images, and images made with telephoto lenses will probably align better). Meanwhile, here are a few pictures to get you started. These were created by extracting sections from a larger image, so they should align almost perfectly:
This assignment is due Thursday, February 24 at 11:59 PM. Please see the general notes on submitting your assignments, as well as the late policy and the collaboration policy.
Please submit: