COS 429: Computer Vision, Fall 2013
Assignment 3: Stereo Correspondence

Part 1: Thought Exercise

Please derive a formula for the sensitivity of the depth computed for a 3D point seen in a rectified pair of cameras with disparity "d" when the baseline is "B" and the focal length is "f". Specifically, how does the depth estimate change if the disparity changes by one pixel? Please use your answer to discuss which types of scenes and camera setups are most appropriate for stereo reconstruction.

Part 2: Programming Exercise

Your goal for this part of the assignment is to write a MATLAB program to compute pixel disparities between a stereo pair of images. The input will be a rectified pair of images (e.g., left.png and right.png), and the output is a matrix (shown in disparity.png) whose values indicate the disparity of the stereo correspondence for each pixel in the left image.

left.png right.png disparity.png

The stereo correspondence problem can be formulated as a minimization of the following energy function:

where data_term(y,x,d(y,x)) represents the cost of assigning disparity d(y,x) at pixel (y,x), and smoothness_term(d(y,x), d(ny,nx)) represents the cost of assigning disparities d(y,x) and d(ny,nx) at neighboring pixels. Intuitively, the first term measures the cost of aligning pixel (y,x) in the left image with pixel (y,x-d) in the right, and the second term measures the cost of introducing discontinuities in the disparity field.

Your task is to write a function "computedisparity" that estimates the disparities minimizing this energy function for a given stereo pair of images:

disparity = computedisparity(input_image_l, input_image_r, data_term_algorithm, smoothness_term_algorithm, optimization_algorithm)

where "input_image_l" and "input_image_r" are the left and right input images, respectively, each represented by a matrix with double values between 0 and 255, and the output is a new image providing the computed disparity for each pixel of the left image. The last three input arguments are strings indicating which algorithms to use during the computation (as described below). For this assignment, you can assume that the input images have been rectified (i.e., epipolar lines are horizontal) and that the disparity in any input image pair is at most 60 pixels (i.e., max_disparity = 60).

This task can be broken down into three steps:

Step A: Compute the data term
data_term = compute_data_term(input_image_l, input_image_r, max_disparity, data_term_algorithm, ... other parameters ...)
should produce a matrix, "data_term," with dimensionality height x width x (max_disparity+1), where data_term(y, x, d) indicates the cost of assigning disparity d to pixel (y, x). The "data_term_algorithm" parameter indicates the method to be used to compute the data term, which can be any of the following:

'default' or 'L1': computes the data_term(y,x,d) by first finding the average difference between the luminances of pixels within a square neighborhood with radius "window_radius" (0 means just a single pixel) centered at (y,x) in input_image_l and the same neighborhood centered at (y,x-d) in input_image_r. Then, it clamps the values so that they do not exceed "max_data_term_value" (to clamp the effect of very large luminance differences) and scales the result by a weight "lambda" (to balance the influence of the data term versus the smoothness term). For example, if "window_radius" is zero, then data_term(y,x,d) = lambda * min( | input_image_l(y,x) - input_image_r(y,x-d) |, max_data_term_value). We suggest that you use max_data_term_value = 10 and lambda = 0.04.

'awesome': a method of your own design

Step B: Compute the smoothness term
smoothness_term = compute_smoothness_term(max_disparity, smoothness_term_algorithm, ... other parameters ...)
should produce a matrix, "smoothness_term," with dimensions (max_disparity+1) x (max_disparity+1), where smoothness_term(d1, d2) indicates the cost of assigning disparities d1 and d2 to neighboring pixels. The "smoothness_term_algorithm" parameter indicates the method to be used to compute the smoothness term, which can be any of the following:

'default' or 'L1': computes smoothness_term(y,x,d) by first taking the difference between d1 and d2, and then clamping it by "max_smoothness_term_value" (to limit the influence of very large disparity differences). Specifically, smoothness_term(d1,d2) = min( | d1 - d2 |, max_smoothness_term_value). We suggest that you use max_smoothness_term_value = 1.7.

'awesome': a method of your own design

Step C: Minimize the energy
disparity = optimize_energy(data_term, smoothness_term, optimization_algorithm, ... other parameters ...)
should produce a matrix, "disparity," with dimenstionality height x width, where disparity(y,x) stores the disparity d at every pixel (y,x) in input_image_l that minimize the energy function. The "optimization_algorithm" parameter indicates the method to be used for the optimization, which can be any of the following:

'default' or 'dynamic_programming': minimize E(disparity) separately for each horizontal scanline by implementing the dynamic programming algorithm described in class.

'graph_cut': minimize E(disparity) using the graph cut algorithm described by Boykov et al. as implemented in the GCMex library provided in cos429_assignment3.zip. You can find more details on how to use this matlab wrapper from here.

'awesome': a method of your own design

Part 3: Results and Analysis

Please run your program on all of the stereo pairs of test images in the "input" subdirectory of cos429_assignment3.zip using both the 'dynamic_programming' and 'graph_cut' algorithms.

Please include results of these runs in a third section of your writeup titled "Results and Analysis." In addition to showing disparities computed with both algorithms, please provide a short discussion of how well each algorithm worked and what are its typical failure modes. You do not need to evaluate results for different parameter settings, though you may wish to do so in support of your comparisons.

To facilitate your evaluations and comparisons, we provide 'ground truth' results for each of the test images and an evaluation metric to measure how well the disparities computed by your program match the ground truth. The evaluation metric is the root mean squared error (square root of the mean of the squared differences) of the disparities all pixels, excluding the left-most pixels where no correspondences are possible.

Part 4: Experimentation

The previous part of the assignment describes a pipeline for computing stereo correspondences (disparities). However, there are multiple possible implementations for each of the steps. For this part of the assignment, we would like you to experiment with different design choices and evaluate how well different algorithms work.

Specifically, please implement an algorithm of your own choice to improve at least one of the steps of the pipeline -- i.e., create an 'awesome' algorithm for at least one step. The "Experimentation" section of your writeup should include a description of your modification, an explanation of why you chose it, and an analysis of how and why your 'awesome' algorithm improves or hurts the results compared to the otherwise best combination of algorithms. Please use both qualitative evaluations (side-by-side comparisons of disparities) and quantitative analysis (root mean square errors) of your results for all the test input image pairs to support your analysis.

Getting Started

You should start from cos429_assignment3.zip, which provides the directory structure to follow in your submission, a template for your writeup (writeup/writeup.html), a set of test input images (input/*.png), corresponding ground truth disparity matrices (groundtruth/*.mat), a simple template for the MATLAB functions you must implement (code/compute*.m), a MATLAB library for computing graph cuts (GCMex/*), and a script to ease creating results for your writeup (code/runme.m).

You should fill in implementations of the required algorithms in the provided MATLAB files (and possibly create new ones), execute runme.m to produce your results, and complete your writeup.

Submitting Your Solution

Please submit your solution via the dropbox link here.

Your submission should include a single file named "assignment3.zip" with the following structure:

A subdirectory named "code" containing your source code. You should not change the API of the MATLAB functions provided, but you may add new MATLAB files as you see fit.
A subdirectory named "groundtruth" containing all groundtruth files in ".mat" format (this is provided).
A subdirectory named "input" containing all test input files in ".jpg" format (include both your new examples and the test images provided).
A subdirectory named "disparity" containing all disparity images produced by your program (should be produced automatically by runme.m).
A subdirectory named "error" containing all error images and plots produced by your program (should be produced automatically by runme.m).
A subdirectory named "writeup" containing a file named "writeup.html" containing separate sections titled: (you can start from the template provided in the .zip file):
1. Thought exercise: your answer to the question in Part 1.
2. Programming exercise: a brief statement of what works and does not work in your implementation of Part 2.
3. Results and Analysis: display and discussion of qualitative results (images) and quantitative results (root mean squared errors) produced by your program for different inputs, algorithmic modifications and (optionally) different parameter settings.
4. Experimentation: description of what 'awesome' algorithmic option you chose to implement, why you chose it, and how it affected the results.
5. Conclusion: summary of your findings.
6. Notes: any other comments, description of any assistance received and/or any discussions had with other students.

Please follow the general policies for submitting assignments, including the late policy and collaboration policy.

COS 429: Computer Vision, Fall 2013 Assignment 3: Stereo Correspondence