COS 429: Computer Vision, Fall 2013
Your goal for this part of the assignment is to write a MATLAB program to create an image mosaic out of two overlapping input images. For example, the two images shown on the left below have been "stitched" into the panorama shown on the right.
[Figure: input images Input1 and Input2, and the output panorama]
Creating a panoramic image requires mapping one image plane to the other. Since in general we do not know how to relate the position and orientation of the two camera views, we will use image feature techniques discussed in class to recover the underlying mapping. First, we will identify salient feature points in both images. Then, we will find correspondences between those feature points. Next, we will compute a transformation that maps corresponding feature points onto one another. Finally, once we have the transformation, we can warp one image onto the other and compose the two images to generate the final result.
These steps can be coded in MATLAB with the following functions (detailed descriptions of input and output variables appear in the code skeleton provided in the cos429_assignment2.zip file). Please implement at least the ones marked in bold:
Step A: Feature detection
features = detectfeatures(input_image, max_features, algorithm)
produces a 4xF matrix representing the 2D locations, scales, and orientations of salient feature points in the input image, where F is the number of features detected (F <= max_features) and algorithm can be any of the following:
- 'random': return features at random positions, scales, and orientations (provided).
- 'sift': return features detected by SIFT (provided).
- 'harris': return the strongest features computed with the Harris corner detector. For each returned corner feature, the scale should be set based on the window size (and possibly downsampling factor of the input image) and the orientation should align with the eigenvector associated with the largest eigenvalue of the covariance matrix computed for the feature.
- 'awesome': return features computed with your own algorithm.
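To make the 'harris' option concrete, here is a NumPy sketch (illustrative only, not the required MATLAB implementation): the structure tensor is accumulated with a simple box filter, the response is the usual det(M) - k*trace(M)^2, and the orientation comes from the dominant eigenvector of the 2x2 tensor as the handout asks. The names `harris_response`, `harris_orientation`, `k`, and `win` are choices made here.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def harris_response(img, k=0.04, win=3):
    """Per-pixel Harris response R = det(M) - k*trace(M)^2, where M is the
    structure tensor accumulated over a win x win window."""
    Iy, Ix = np.gradient(img.astype(float))   # image gradients

    def box(a):
        # 'same'-size box filter: mean over each win x win neighborhood.
        pad = win // 2
        ap = np.pad(a, pad, mode='edge')
        return sliding_window_view(ap, (win, win)).mean(axis=(2, 3))

    Sxx, Syy, Sxy = box(Ix * Ix), box(Iy * Iy), box(Ix * Iy)
    det = Sxx * Syy - Sxy ** 2    # det(M)
    tr = Sxx + Syy                # trace(M)
    return det - k * tr ** 2

def harris_orientation(Sxx, Syy, Sxy):
    # Angle of the eigenvector for the largest eigenvalue of the symmetric
    # tensor M = [[Sxx, Sxy], [Sxy, Syy]] (the orientation the handout asks for).
    return 0.5 * np.arctan2(2 * Sxy, Sxx - Syy)
```

Corners score positive (both eigenvalues large), edges score negative, and flat regions score near zero, which is why thresholding the response and keeping the strongest local maxima yields corner features.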
Step B: Feature description
descriptors = computedescriptors(input_image, features, algorithm) produces a KxF matrix with one descriptor per feature, where F = size(features, 2) and K is the length of the descriptor, which differs between algorithms (e.g., K=128 for SIFT). algorithm can be any of the following:
- 'random': return a KxF matrix of random values (provided).
- 'sift': return a 128xF matrix containing the SIFT descriptor for each feature as computed by SIFT (provided).
- 'window': return a (k*k)xF matrix containing the luminance sampled on a kxk rectangular grid of locations centered at the feature position after the neighborhood has been scaled and rotated according to the scale and orientation of the feature (e.g., k=7).
- 'awesome': return descriptors computed with your own algorithm.
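A possible shape for the 'window' descriptor, sketched in NumPy: sample the luminance on a k x k grid whose axes are scaled and rotated by the feature's scale and orientation. The nearest-neighbor sampling, edge clamping, and zero-mean/unit-norm normalization are choices made for this sketch, not mandated by the handout.

```python
import numpy as np

def window_descriptor(img, x, y, scale, angle, k=7):
    """Sample luminance on a k x k grid centered at (x, y), with the grid
    scaled by `scale` and rotated by `angle` (radians), so the descriptor
    adapts to the feature's scale and orientation. Returns a length k*k
    vector, normalized to zero mean and unit norm."""
    c, s = np.cos(angle), np.sin(angle)
    offs = (np.arange(k) - k // 2) * scale        # grid offsets in feature frame
    dx, dy = np.meshgrid(offs, offs)
    # Rotate grid into image coordinates; clamp out-of-bounds samples to the edge.
    xs = np.clip(np.rint(x + c * dx - s * dy).astype(int), 0, img.shape[1] - 1)
    ys = np.clip(np.rint(y + s * dx + c * dy).astype(int), 0, img.shape[0] - 1)
    d = img[ys, xs].astype(float).ravel()
    d -= d.mean()                                 # invariance to brightness offset
    n = np.linalg.norm(d)
    return d / n if n > 0 else d
```

The normalization makes the descriptor insensitive to additive and multiplicative lighting changes, which matters when the two photos were exposed differently.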
Step C: Feature matching
matches = findmatches(features1, descriptors1, features2, descriptors2, max_matches, algorithm) produces a 2xM matrix of integers representing the indices of the features in features1 and features2 that provide the best pairwise matches, where M is the number of detected matches (M <= max_matches), and algorithm is one of the following:
- 'random': return a 2xM matrix of random values (provided).
- 'mutual': return all matches between features i1 and i2 where the L2 distance between descriptors i1 and i2 is less than the L2 distance between descriptor i1 and any other in descriptors2 AND also less than the L2 distance between descriptor i2 and any other in descriptors1 (note: this is not required).
- 'ratio': return all matches between features i1 and i2 where the L2 distance between descriptors i1 and i2 is less than a constant match_ratio times the L2 distance between descriptor i1 and its second-best match in descriptors2 (e.g., match_ratio=0.6).
- 'awesome': return matches computed with your own algorithm.
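The 'ratio' test can be sketched as follows (NumPy, 0-based indices for illustration; the real findmatches returns 1-based MATLAB indices in a 2xM matrix). As in the handout, descriptors are stored as columns of a KxF matrix.

```python
import numpy as np

def ratio_matches(desc1, desc2, match_ratio=0.6):
    """Lowe-style ratio test: keep (i1, i2) when the best L2 distance from
    descriptor i1 to the columns of desc2 is less than match_ratio times
    the second-best distance. desc1: K x F1, desc2: K x F2 (F2 >= 2).
    Returns a list of (i1, i2) index pairs."""
    matches = []
    for i1 in range(desc1.shape[1]):
        d = np.linalg.norm(desc2 - desc1[:, i1:i1 + 1], axis=0)  # distances to all of desc2
        order = np.argsort(d)
        best, second = order[0], order[1]
        if d[best] < match_ratio * d[second]:
            matches.append((i1, best))
    return matches
```

The ratio test rejects ambiguous features (repeated texture, for example), since such features have a second-best match nearly as close as the best one.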
Step D: Feature correspondence
correspondences = findcorrespondences(features1, descriptors1, features2, descriptors2, matches, algorithm) produces a 2xC matrix of integers representing the indices of the features from features1 and features2 that provide the best set of correspondences consistent with a homography transformation among the provided matches, where C is the number of correspondences found, and algorithm is one of the following:
- 'random': return a 2xC matrix of random values (provided).
- 'RANSAC': return the best set of inlier feature correspondences found with the RANSAC algorithm. Choose the number of RANSAC iterations carefully to ensure that the best homography is likely to be found.
- 'awesome': return correspondences computed with your own algorithm.
Step E: Homography estimation
transform = computetransform(features1, features2, correspondences, algorithm, groundtruth_filename) produces a 3x3 matrix representing the homography that best aligns the provided correspondences.
- 'groundtruth': return the best homography transformation computed from the groundtruth data (provided)
- 'cp2tform': return the homography transformation computed from the given feature correspondences with 'cp2tform' (provided)
Step F: Image composition
output_image = compositeimage(input_image1, input_image2, transformation, algorithm) warps input_image1 by the given transformation and then composites (merges) it with input_image2 using one of these algorithms:
- 'simple': Returns a composite image where input_image1 is warped by the transformation and then copied over input_image2 (provided).
- 'overlay': Returns a composite image useful for visualizing errors in the mosaic. Gray indicates areas where the two images are similar after warping and compositing, and green/magenta indicates areas where they are different (provided).
- 'awesome': return the image composited with your own algorithm.
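The 'simple' compositor is provided, but the underlying inverse-warp idea looks roughly like this NumPy sketch (nearest-neighbor sampling, and the output canvas is fixed to img2's size for brevity; `composite_simple` is a name chosen here):

```python
import numpy as np

def composite_simple(img1, img2, H):
    """Inverse-warp img1 by homography H (mapping img1 coords -> img2 coords)
    and copy the warped pixels over img2. For each output pixel we pull the
    source location back through H^-1; a real implementation would also grow
    the output canvas so no warped pixels are clipped."""
    out = img2.astype(float).copy()
    Hinv = np.linalg.inv(H)
    ys, xs = np.mgrid[0:out.shape[0], 0:out.shape[1]]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = Hinv @ pts                             # homogeneous source coordinates
    sx = np.rint(src[0] / src[2]).astype(int)
    sy = np.rint(src[1] / src[2]).astype(int)
    ok = (sx >= 0) & (sx < img1.shape[1]) & (sy >= 0) & (sy < img1.shape[0])
    out[ok.reshape(out.shape)] = img1[sy[ok], sx[ok]]
    return out
```

Inverse warping (looping over output pixels) avoids the holes that forward warping leaves when the transformation stretches the image.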
Implementations for many of these steps are provided for you (see notes in parentheses). Your main tasks are to implement the 'harris' algorithm for Step A, the 'window' algorithm for Step B, the 'ratio' algorithm for Step C, and the 'RANSAC' algorithm for Step D. Implementing the 'mutual' algorithm for Step C is optional.
The previous part of the assignment describes a pipeline for image mosaicing. There are multiple possible implementations for each of the steps. For this part of the assignment, we would like you to experiment with different design choices and evaluate how well different algorithms work.
Specifically, please implement an algorithm of your own choice to improve at least ONE of the steps -- i.e., create an 'awesome' algorithm for any step (except E). The "Experimentation" section of your writeup should include a description of your modification, an explanation of why you chose it, and an analysis of how and why your 'awesome' algorithm improves or hurts the results.
Please think carefully about how to design the experiment to test whether your modification improves the results. At the very least, you should show images and quantitative evaluations comparing results on a small set of images with your 'awesome' algorithm against other options for the same step, using the same combinations of options for the other steps.
You should execute your program using runme.m on a variety of input test image pairs to investigate how well it works with different combinations of algorithms and under different input conditions.
First, please show outputs of your program for all of the images in the "input" subdirectory of cos429_assignment2.zip and at least one pair of images taken with your own camera using the 'sift', 'sift', 'ratio', 'RANSAC', 'cp2tform', and 'simple' options. Note that some of these inputs are HARD, and so you should not expect to get perfect results for all of them.
Second, please compute and compare results for a small set of images (of your choosing) with the four possible combinations of using the 'sift' and 'harris' options for feature detection and the 'sift' and 'window' options for feature description (along with 'ratio', 'RANSAC', 'cp2tform', and 'simple'). For each of the four combinations, please provide overlay images and quantitative evaluations to compare the results in your writeup -- these comparisons can inform your answer to the thought exercise in Part 1.
Please include these results in a third section of your writeup titled "Results and Analysis." In addition to showing images, please provide a short discussion of how well your program works. Overall, the goal of this section is to answer questions like: When does your program succeed? When does it fail? Which step(s) are failing when the program fails? What are the key attributes of input images that affect its success? What parameter settings affect its success? You do not have to answer all of these questions, but you should discuss at least one characteristic of the input image pairs and/or parameters that affects the quality of your results, demonstrated by results for a set of input pairs spanning cases where the algorithm does and doesn't work well.
To facilitate your evaluations and comparisons, we provide 'ground truth' results for each of the test images and an evaluation metric to measure how well the homography computed by your program matches the ground truth. The evaluation metric is the average distance between the position to which pixels in input_image1 are mapped by your program versus the position to which they are mapped by the ground truth homography (see computeerror.m). You can create ground truth data for your own examples using runme_generategroundtruth.m.
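The evaluation metric amounts to the following (a NumPy sketch of the idea behind computeerror.m; the grid step and exact sampling in the provided code may differ):

```python
import numpy as np

def homography_error(H_est, H_gt, width, height, step=10):
    """Average Euclidean distance between where pixels of input_image1 are
    sent by the estimated homography vs. the ground-truth homography,
    sampled on a regular grid over the image."""
    xs, ys = np.meshgrid(np.arange(0, width, step), np.arange(0, height, step))
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)]).astype(float)

    def project(H):
        q = np.asarray(H, float) @ pts
        return q[:2] / q[2]          # dehomogenize

    return np.linalg.norm(project(H_est) - project(H_gt), axis=0).mean()
```

Because the error is averaged over pixel positions rather than computed on the matrix entries, it directly measures the visible misalignment in the mosaic.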
You should edit the MATLAB files (and possibly create new ones) to implement the algorithms, download new test images into the input subdirectory, execute runme.m to produce your results, and complete your writeup.
Please submit your solution via the dropbox link here.
Your submission should include a single file named "assignment2.zip" with the following structure:
- code: containing your source code. You should not change the API of the MATLAB functions provided, but you may add new MATLAB files as you see fit.
- groundtruth: containing all groundtruth files in ".mat" format (this is provided).
- input: containing all test input files in ".jpg" format (include both your new examples and the test images provided).
- features: containing images overlaid by marks for features created with detectfeatures.m (should be produced automatically by runme.m).
- matches: containing images overlaid by lines between matches found with findmatches.m (should be produced automatically by runme.m).
- correspondences: containing images overlaid by lines between correspondences found with findcorrespondences.m (should be produced automatically by runme.m).
- output: containing all images produced by compositeimage.m (should be produced automatically by runme.m).
- writeup: containing a file named "writeup.html" with separate sections as described above (you can start from the template provided in the .zip file).
Please follow the general policies for submitting assignments, including the late policy and collaboration policy.