COS 323 - Computing for the Physical and Social Sciences

Spring 2006

Course home Outline and lecture notes Assignments


Assignment 1

Due Tuesday, Feb. 28

Overview:

In this assignment, you will implement a simple version of "image mosaicing", which reads two images taken from the same point in different directions and produces a single image with the two "stiched together". This will require reading in the original images, constructing a function that measures how similar the images are when shifted by a given amount, optimizing the function to determine how much to shift one image relative to the other, and finally producing the output image. We will restrict the transformations that can be applied to images to be just translation (sliding) and rotation. You should use Matlab for this assignment.


0. Getting familiar with Matlab

Matlab is available on several OIT machines (look here for details). You should be able run it remotely from any Unix workstation, and it is available in many clusters on campus.

Alternatively, you might be able to install it on a campus matchine - look here for details.

Read through any or all of the following for a basic introduction to Matlab

Images in the computer are represented as rectangular arrays of "pixels", each of which has a color and an intensity represented in terms of the amount of red, green, and blue that are combined there. A common representation uses a single byte (8 bits) for each color channel, meaning that each of R, G, and B can range from 0 to 255. Matlab has the capability of loading images available in many file-types such as JPG. Once a color image is converted to grayscale, it can be treated as a matrix and all of Matlab's operations can be applied to it.

Work through the following tasks using an image of your choice. You do not need to submit any results, but make sure you are comfortable doing the following.

  1. Read an image into a variable.
    Hint #1: "help imread"
    Hint #2: Use single quotes around the filename.
    Hint #3: Ending a command with a semicolon supresses printing the result.

  2. Display the image.
    Hint: "help image"

  3. Adjust the axes so that the aspect ratio is 1:1 (i.e., "square pixels").
    Hint: "help axis"

  4. Convert the image to grayscale.
    Hint: "help rgb2gray" Note: if your version of Matlab doesn't have the rgb2gray function, download rgb2gray.m. Place this in your working directory, and it should be auto-loaded by Matlab.
  5. Convert the grayscale image to floating point.
    Hint: "help double"

  6. Display the floating-point image, using a grayscale colormap.
    Hint #1: "colormap(gray)"
    Hint #2: "help imagesc"

  7. Plot the intensities along one row of the image.
    Hint #1: Extracting a part of a matrix is done by
    	matrix2 = matrix1(row_min:row_max,col_min:col_max);
    
    Indices in Matlab are 1-based (not 0-based as in C).
    row_max or col_max may be "end" to indicate the last element.
    Just a ":" is equivalent to "1:end".
    Hint #2: "help plot"

  8. Store the width and height of the image in variables "width" and "height".
    Hint #1: "help size"
    Hint #2: Functions in Matlab may return multiple values. You can get at the values using the notation
    	[var1, var2] = func(x)
    
    Hint #3: In Matlab, the number of rows is the first dimension and the number of columns is the second.

  9. Write a pair of nested "for" loops to set a grid of every 10th pixel horizontally and every 20th pixel vertically to 0.
    Hint #1: "help for"
    Hint #2: "start:increment:stop"

  10. Create a function "maxrow" that takes a matrix and a row index and returns the brightest pixel in the given row. Store the function in a file called "maxrow.m" so that Matlab loads it automatically when you call the function.
    Hint #1: "help function"
    Hint #2: Matlab has many built-in functions that operate on entire vectors or matrices. There might be one to compute the maximum...

  11. Write the modified image back to a new file.
    Hint #1: "help uint8" to convert the image back to integer
    Hint #2: "help imwrite"

If you get stuck on any of these, feel free to ask for help, either by emailing cos323@cs.princeton.edu or by asking a colleague. (Just to be clear, working together to learn Matlab is encouraged, but collaboration on the rest of the assignment is not allowed.)


1. Image Difference

Ultimately, we will be doing an optimization to find out how much one image should be moved relative to the other. To do this, we will define an objective function that measures the mean squared difference of pixel values between image 1 and the transformed version of image 2, in the region where the images overlap. Minimizing this function will find the transformation that makes the images as similar as possible. We will consider only non-reflective rigid body transformations. These can be written as a rotation about a point, followed by a translation. For convenience, we will consider rotations about the center of the image.

Write a Matlab function that takes as input:

and produces as output: To avoid unneccesary work, we will use the Matlab function imtransform to perform image transformations. This function requires a tform structure to define the transformation to be performed. We include getTransform.m, which produces a tform corresponding to a rotation about the center of an image followed by vertical and horizontal translations. Since the dimensions of the image may change, imtransform optionally takes in the X and Y bounds of the new image. Here, we wish to crop the resulting image to the size of the original. This is done in imtransformSimple.m, which should be sufficient for this part. Those parts of the new image that are unknown are filled with black (0).

We wish to only consider those areas where the two images intersect. Finding these regions from scratch is tedious, so it may be helpful, for example to pass a white image (see ones) through the same transform. It may be convenient, here and in further parts, to access a matrix in Matlab through logical indexing: we can do operations such as, A(A > .5) = 1.

Finally, we must sum up the difference between images. There are many ways to implement this, such as using a couple of forloops, but Matlab is most efficient when it is operating on entire vectors or matrices at a time. Therefore, the suggested implementation is:

  1. Mask the region of the first image that overlaps with the second with a logical matrix that has 1s in overlap areas.
  2. Subtract the two images, scaling by the mask.
  3. Square each element (using the .* operation)
  4. Add up the elements in the resulting matrix (read up on the sum function)
  5. Divide by the number of elements (be careful--this is smaller than the size of the first image)

Test your alogrithm on some real and synthetic images, passing in candidate transformations by hand and making sure the results are reasonable. Make sure that the error is near zero for the appropriate transformations on provided synthetic images.


2. 1D Optimization

To start out, assume that the images differ only by translation in x. Implement a 1-dimensional optimization based on Golden Section search. Keep in mind that you will later use this function to optimize along y translation and rotation as well, so structure the code accordingly. Compared to the basic algorithm, there are two wrinkles to be aware of:

  1. You will need some way of establishing a bound initially - be careful that your procedure does not cause the images to not overlap. Consider the meaning of the initial interval in terms of the optimization.
  2. You will need to select an appropriate accuracy at which to stop--choose reasonably.

It may be helpful, in debugging this, create a plot of the x vs error, and trace the points your algorithm looks at.


3. 3D Optimization

Now, assume that the images may be translated in both x and y, and rotated. You will try to find the optimal alignment between them using the "Taxi Cab" or successive relaxation method. In this method, you first do a 1D minimization in x while holding y and rotation constant, then minimize in y, then in angle of rotation, and repeat until convergence. While this is not a particularly great method in general (it is particularly prone to ``ziz-zagging'' along valleys), it has the advantage of being simple to implement (you can reuse your 1D minimizer from above) and will work well for images with primarily horizontal and vertical features.


4. Output

Once you know the optimal alignment between image 1 and image 2, write a function that produces an output image and writes it out (using imwrite). The resulting image must include entirely the two input images. It does not have to be the tightest fit, but for finding the tightest fit, the findBounds function may be helpful. You can look at the code in imtransformSimple to see how to adjust image axes. The pixels in the output image should be taken as follows:


Images

Here are a few synthetic images. Your code should work on these.

Several test images will be provided here in the near future, or you are welcome to use your own (hint: images made with telephoto lenses will probably align better). Meanwhile, here are a few pictures to get you started. These were created by extracting sections from a larger image, so they should align almost perfectly:

Several pictures below were taken quickly with a camera, so they are more interesting cases. The following image pairs should align decently:

The following pair of images are difficult to align with our method:

An output mosaic may look like this:


Submitting

This assignment is due Tuesday, February 28 at 11:59 PM. Please see the general notes on submitting your assignments, as well as the late policy and the collaboration policy.

Please submit:


Last update 29-Dec-2010 11:59:13
smr at princeton edu