COS 429: Computer Vision, Fall 2013 |
Your goal for this part of the assignment is to write a MATLAB program to compute pixel disparities between a stereo pair of images. The input will be a rectified pair of images (e.g., left.png and right.png), and the output is a matrix (shown in disparity.png) whose values indicate the disparity of the stereo correspondence for each pixel in the left image.
left.png right.png disparity.png
The stereo correspondence problem can be formulated as a minimization of the following energy function:
where data_term(y,x,d(y,x)) represents the cost of assigning disparity d(y,x) at pixel (y,x), and smoothness_term(d(y,x), d(ny,nx)) represents the cost of assigning disparities d(y,x) and d(ny,nx) at neighboring pixels. Intuitively, the first term measures the cost of aligning pixel (y,x) in the left image with pixel (y,x-d) in the right, and the second term measures the cost of introducing discontinuities in the disparity field.
Your task is to write a function "computedisparity" that estimates the disparities minimizing this energy function for a given stereo pair of images:
disparity = computedisparity(input_image_l, input_image_r, data_term_algorithm, smoothness_term_algorithm, optimization_algorithm)where "input_image_l" and "input_image_r" are the left and right input images, respectively, each represented by a matrix with double values between 0 and 255, and the output is a new image providing the computed disparity for each pixel of the left image. The last three input arguments are strings indicating which algorithms to use during the computation (as described below). For this assignment, you can assume that the input images have been rectified (i.e., epipolar lines are horizontal) and that the disparity in any input image pair is at most 60 pixels (i.e., max_disparity = 60).
This task can be broken down into three steps:
Step A: Compute the data term
data_term = compute_data_term(input_image_l, input_image_r, max_disparity, data_term_algorithm, ... other parameters ...)
should produce a matrix, "data_term," with dimensionality height x width x (max_disparity+1), where data_term(y, x, d) indicates the cost of assigning disparity d to pixel (y, x). The "data_term_algorithm" parameter indicates the method to be used to compute the data term, which can be any of the following:
- 'default' or 'L1': computes the data_term(y,x,d) by first finding the average difference between the luminances of pixels within a square neighborhood with radius "window_radius" (0 means just a single pixel) centered at (y,x) in input_image_l and the same neighborhood centered at (y,x-d) in input_image_r. Then, it clamps the values so that they do not exceed "max_data_term_value" (to clamp the effect of very large luminance differences) and scales the result by a weight "lambda" (to balance the influence of the data term versus the smoothness term). For example, if "window_radius" is zero, then data_term(y,x,d) = lambda * min( | input_image_l(y,x) - input_image_r(y,x-d) |, max_data_term_value). We suggest that you use max_data_term_value = 10 and lambda = 0.04.
- 'awesome': a method of your own design
Step B: Compute the smoothness term
smoothness_term = compute_smoothness_term(max_disparity, smoothness_term_algorithm, ... other parameters ...)
should produce a matrix, "smoothness_term," with dimensions (max_disparity+1) x (max_disparity+1), where smoothness_term(d1, d2) indicates the cost of assigning disparities d1 and d2 to neighboring pixels. The "smoothness_term_algorithm" parameter indicates the method to be used to compute the smoothness term, which can be any of the following:
- 'default' or 'L1': computes smoothness_term(y,x,d) by first taking the difference between d1 and d2, and then clamping it by "max_smoothness_term_value" (to limit the influence of very large disparity differences). Specifically, smoothness_term(d1,d2) = min( | d1 - d2 |, max_smoothness_term_value). We suggest that you use max_smoothness_term_value = 1.7.
- 'awesome': a method of your own design
Step C: Minimize the energy
disparity = optimize_energy(data_term, smoothness_term, optimization_algorithm, ... other parameters ...)
should produce a matrix, "disparity," with dimenstionality height x width, where disparity(y,x) stores the disparity d at every pixel (y,x) in input_image_l that minimize the energy function. The "optimization_algorithm" parameter indicates the method to be used for the optimization, which can be any of the following:
- 'default' or 'dynamic_programming': minimize E(disparity) separately for each horizontal scanline by implementing the dynamic programming algorithm described in class.
- 'graph_cut': minimize E(disparity) using the graph cut algorithm described by Boykov et al. as implemented in the GCMex library provided in cos429_assignment3.zip. You can find more details on how to use this matlab wrapper from here.
- 'awesome': a method of your own design
Please run your program on all of the stereo pairs of test images in the "input" subdirectory of cos429_assignment3.zip using both the 'dynamic_programming' and 'graph_cut' algorithms.
Please include results of these runs in a third section of your writeup titled "Results and Analysis." In addition to showing disparities computed with both algorithms, please provide a short discussion of how well each algorithm worked and what are its typical failure modes. You do not need to evaluate results for different parameter settings, though you may wish to do so in support of your comparisons.
To facilitate your evaluations and comparisons, we provide 'ground truth' results for each of the test images and an evaluation metric to measure how well the disparities computed by your program match the ground truth. The evaluation metric is the root mean squared error (square root of the mean of the squared differences) of the disparities all pixels, excluding the left-most pixels where no correspondences are possible.
The previous part of the assignment describes a pipeline for computing stereo correspondences (disparities). However, there are multiple possible implementations for each of the steps. For this part of the assignment, we would like you to experiment with different design choices and evaluate how well different algorithms work.
Specifically, please implement an algorithm of your own choice to improve at least one of the steps of the pipeline -- i.e., create an 'awesome' algorithm for at least one step. The "Experimentation" section of your writeup should include a description of your modification, an explanation of why you chose it, and an analysis of how and why your 'awesome' algorithm improves or hurts the results compared to the otherwise best combination of algorithms. Please use both qualitative evaluations (side-by-side comparisons of disparities) and quantitative analysis (root mean square errors) of your results for all the test input image pairs to support your analysis.
You should fill in implementations of the required algorithms in the provided MATLAB files (and possibly create new ones), execute runme.m to produce your results, and complete your writeup.
Please submit your solution via the dropbox link here.
Your submission should include a single file named "assignment3.zip" with the following structure:
code
" containing your source code.
You should not change the API of the MATLAB functions provided, but you may
add new MATLAB files as you see fit.
groundtruth
" containing all
groundtruth files in ".mat" format (this is provided).
input
" containing all test
input files in ".jpg" format (include both your new examples and
the test images provided).
disparity
" containing all disparity
images produced by your program (should be produced automatically by runme.m).
error
" containing all error images
and plots produced by your program (should be produced automatically by runme.m).
writeup
" containing a file
named "writeup.html
" containing separate sections titled:
(you can start from the template provided in the .zip file):
Please follow the general policies for submitting assignments, including the late policy and collaboration policy.