Programming Assignment 2: RGBD Image Composition
Due on Sun, Nov 6 at 11:59PM
Overview
In this assignment you will implement an algorithm for composing multiple
images into a single texture. The input to your algorithm will be: 1) a set
of images, each with estimated camera parameters, and 2) a set of polygons
(actually, usually just one).
The output will be a diffuse albedo texture for the polygon computed by
composing the input images.
The following is a list of
features that you may implement -- the features in bold face are
required. You may choose to implement any two of the others at your discretion.
In addition to implementing these features, you should submit images
generated by your program for each input scene to the quality bake-off. The winner will get
extra credit and a note on the course web page.
Image selection:
- Image selection per texel: choose k best images for each texel based on "quality of observation," considering image sharpness, viewing angle, viewing distance, etc.
- Image selection per surface: choose k best images for each surface, considering both surface coverage and quality of observation. See Section 4 of [jeon16] for an example.
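To make the "quality of observation" idea concrete, here is a minimal C++ sketch of a per-texel scoring function. The function name, the particular weighting terms, and the exponents are our own illustrative assumptions -- they are not part of the provided skeleton, and you should tune your own combination of angle, distance, and sharpness terms.

```cpp
#include <cmath>

// Hypothetical per-texel observation quality score (illustrative sketch).
// Combines viewing angle, viewing distance, and image sharpness; frontal,
// close, sharp observations score higher. All terms are assumptions.
double ObservationQuality(double cos_angle,  // dot(surface normal, -view dir), in [0,1]
                          double distance,   // camera-to-texel distance (meters)
                          double sharpness)  // e.g., local image gradient magnitude
{
    if (cos_angle <= 0.0) return 0.0;           // surface faces away from the camera
    double angle_term = cos_angle * cos_angle;  // penalize grazing views
    double dist_term  = 1.0 / (1.0 + distance); // prefer closer observations
    return angle_term * dist_term * sharpness;
}
```

A per-texel selection step would then rank the candidate images by this score and keep the top k.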
Texture color estimation:
- Weighted Mean: estimate the RGB color of each texel by computing the weighted mean of pixels in selected images, where weights are based on the "quality of observation."
- Weighted Median: estimate the RGB color of each texel by computing the weighted median of pixels in selected images. See Section 4 of [maier15] for an example.
- Graph Cut Image Composition: compose a small set of selected images into an RGB texture using graph cuts along sharp edges to hide seams. See Section 4 of [kwatra03] for an example.
- Poisson Image Composition: solve for the RGB texture by minimizing differences with gradients in images (solve a Poisson equation). See [perez03] for details.
- Texture Synthesis: synthesize the texture using a quilting texture synthesis algorithm like [darabi12]. Note that this feature counts as two for this assignment -- i.e., if you implement this feature, you do not need to do a second one.
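The weighted mean and weighted median estimators can be sketched per color channel as below. This is illustrative code, not part of the skeleton: the `Sample` type and the half-of-total-weight median rule are our own choices.

```cpp
#include <algorithm>
#include <vector>

struct Sample { double value; double weight; };  // one pixel observation of a texel

// Weighted mean of per-texel observations (one color channel).
double WeightedMean(const std::vector<Sample>& samples)
{
    double sum = 0.0, wsum = 0.0;
    for (const Sample& s : samples) { sum += s.weight * s.value; wsum += s.weight; }
    return (wsum > 0.0) ? sum / wsum : 0.0;
}

// Weighted median: smallest value whose cumulative weight reaches
// half the total weight. More robust to outlier observations
// (e.g., specular highlights) than the weighted mean.
double WeightedMedian(std::vector<Sample> samples)
{
    std::sort(samples.begin(), samples.end(),
              [](const Sample& a, const Sample& b) { return a.value < b.value; });
    double total = 0.0;
    for (const Sample& s : samples) total += s.weight;
    double cum = 0.0;
    for (const Sample& s : samples) {
        cum += s.weight;
        if (cum >= 0.5 * total) return s.value;
    }
    return samples.empty() ? 0.0 : samples.back().value;
}
```

Note how a low-weight outlier (e.g., a bright highlight observed in one image) shifts the mean but not the median.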
Surface creation:
- Rectangle detection: create rectangular surfaces automatically by detecting planes in the input point clouds using a RANSAC algorithm.
- Surface reconstruction: create a set of triangular surfaces by implementing any surface reconstruction algorithm discussed in class. You will also have to compute texture coordinates for each triangle and provide the mapping from texture to surface to demonstrate this feature.
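For rectangle detection, the core of a RANSAC plane fit is small enough to sketch in full. This is a self-contained illustration with its own `Vec3` type; the skeleton's geometry classes, the iteration count, and the inlier threshold will differ in your implementation (and a real version would refit the plane to all inliers afterward).

```cpp
#include <cmath>
#include <cstdlib>
#include <vector>

struct Vec3 { double x, y, z; };

static Vec3 Sub(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 Cross(const Vec3& a, const Vec3& b) {
    return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z, a.x*b.y - a.y*b.x};
}
static double Dot(const Vec3& a, const Vec3& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Plane in Hessian normal form: Dot(n, p) + d = 0, with |n| = 1.
struct Plane { Vec3 n; double d; };

// Minimal RANSAC: fit a plane to 3 random points, keep the one with most inliers.
Plane RansacPlane(const std::vector<Vec3>& pts, int iters, double inlier_dist)
{
    Plane best = {{0.0, 0.0, 1.0}, 0.0};
    if (pts.size() < 3) return best;
    size_t best_inliers = 0;
    for (int i = 0; i < iters; i++) {
        const Vec3& a = pts[std::rand() % pts.size()];
        const Vec3& b = pts[std::rand() % pts.size()];
        const Vec3& c = pts[std::rand() % pts.size()];
        Vec3 n = Cross(Sub(b, a), Sub(c, a));
        double len = std::sqrt(Dot(n, n));
        if (len < 1e-12) continue;               // degenerate (collinear) sample
        n = {n.x / len, n.y / len, n.z / len};
        double d = -Dot(n, a);
        size_t inliers = 0;
        for (const Vec3& p : pts)
            if (std::fabs(Dot(n, p) + d) < inlier_dist) inliers++;
        if (inliers > best_inliers) { best_inliers = inliers; best = {n, d}; }
    }
    return best;
}
```

After finding a dominant plane, you would project its inliers onto the plane and fit a bounding rectangle, then remove the inliers and repeat to detect further planes.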
Surface refinement:
- Surface placement optimization: solve for better surface transformations (surface to world) by dense alignment of colors in textures.
- Displacement Estimation: create and solve for a displacement map for each surface to optimize alignments of images mapped with displacements.
Image refinement:
- Camera pose optimization: solve for better camera transformations (camera to world) by dense alignment of colors in textures.
- Image warping: solve for a grid-based warp of each image plane to optimize alignment of colors in the textures. See [narayan15], [zhou14], and [zollhofer15] for examples. Note that this feature counts as two for this assignment.
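For the grid-based warp, the basic machinery is a coarse grid of 2D offsets covering the image plane, evaluated at any pixel by bilinear interpolation of the four surrounding grid vertices; the optimization then solves for the grid offsets that best align colors. Here is a sketch of the evaluation step only -- the grid layout and normalized coordinates are our own assumptions, not an API from the cited papers or the skeleton.

```cpp
#include <vector>

struct Offset { double dx, dy; };  // 2D warp offset stored at a grid vertex

// Bilinearly interpolate a (gw x gh) grid of offsets at normalized image
// coordinates (u, v) in [0,1]^2. Each pixel's warp blends the four
// surrounding grid vertices; the optimization would solve for the offsets.
Offset EvaluateWarp(const std::vector<Offset>& grid, int gw, int gh,
                    double u, double v)
{
    double gx = u * (gw - 1), gy = v * (gh - 1);
    int x0 = (int)gx, y0 = (int)gy;
    if (x0 >= gw - 1) x0 = gw - 2;   // clamp so (x0+1, y0+1) stays in range
    if (y0 >= gh - 1) y0 = gh - 2;
    double fx = gx - x0, fy = gy - y0;
    const Offset& a = grid[y0 * gw + x0];
    const Offset& b = grid[y0 * gw + x0 + 1];
    const Offset& c = grid[(y0 + 1) * gw + x0];
    const Offset& d = grid[(y0 + 1) * gw + x0 + 1];
    return {
        (1 - fy) * ((1 - fx) * a.dx + fx * b.dx) + fy * ((1 - fx) * c.dx + fx * d.dx),
        (1 - fy) * ((1 - fx) * a.dy + fx * b.dy) + fy * ((1 - fx) * c.dy + fx * d.dy)
    };
}
```

Because the warp is linear in the grid offsets, the photometric alignment objective stays easy to differentiate with respect to the unknowns.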
Other:
- View-dependence: create and solve for multiple textures on each surface, each associated with a range of viewpoints.
To get started, you can use the code in (cos526_assn2.zip). This C++ code provides
the basic infrastructure for reading scenes, mapping images to polygons,
etc. It also provides a simple program (texture) for
viewing image configurations and making surface texture images.
You will probably need to augment this
program with command-line arguments of your own to turn specific
features on and off and/or to provide parameters for specific
applications.
The skeleton code is able to read image configurations in a simple file format. This format was created to
provide the inputs required by this assignment -- a set of rectangles,
a set of images, and estimated camera parameters. We provide several
example input scenes in that format in the "scans" subdirectory of the
assignment web page. The raw datasets are quite large (tens to hundreds of megabytes), so, in addition
to the raw datasets for individual scans, we provide zip files
containing a regular subsampling of images for all datasets together.
We suggest that you start with the smallest zip file first (every100.zip)
and move to the larger datasets only after your algorithms are working.
What to Submit
You should submit to CS dropbox one zip file named
programming_assignment2.zip
with the following internal directory structure:
cos526_assn2/
writeup.html
(your writeup, see the description below)
input/
(all the input data for the examples in your writeup)
output/
(all the output images for the examples in your writeup)
bakeoff/
(all images submitted for the bakeoff)
src/
(the complete source code after "make clean")
writeup.html
should be an HTML document demonstrating
the effects of the features you have implemented. There should be one
"section" per feature with a brief description of what you
implemented and some images showing your results; describe the
command/process used to create each result in its caption.
Wherever possible, you should show
results for at least two sets of inputs.
The src
directory should have all code required to
compile and link your program (including the files provided with the
assignment), along with a Makefile to rebuild the code.
Please DO NOT submit the input images as part of your zip file.
Other images should be in JPEG format to save space. Also, to
further save space, please remove binaries and backup files from the
src directory (i.e., run make clean) before submitting.
Please see the course's webpage with
submission instructions
for more details.
Useful resources
- Texture synthesis:
- [darabi12] Soheil Darabi, Eli Shechtman, Connelly Barnes, Dan B Goldman, Pradeep Sen, "Image Melding: Combining Inconsistent Images using Patch-based Synthesis," SIGGRAPH 2012.
- [wexler07] Y. Wexler, E. Shechtman, M. Irani, "Space-Time Video Completion," CVPR 2004.
- [efros01] Alexei Efros and Bill Freeman, "Image Quilting for Texture Synthesis and Transfer," SIGGRAPH 2001.
- [lefebvre10] Sylvain Lefebvre, Samuel Hornus, and Anass Lasram, "By-Example Synthesis of Architectural Textures," SIGGRAPH 2010.
- [kwatra03] Vivek Kwatra, Arno Schödl, Irfan Essa, Greg Turk, Aaron Bobick, "Graphcut Textures: Image and Video Synthesis using Graph Cuts," SIGGRAPH 2003.
- [perez03] Patrick Pérez, Michel Gangnet, Andrew Blake, "Poisson Image Editing," SIGGRAPH 2003.
- Color optimization:
- [zhou14] Zhou, Qian-Yi and Koltun, Vladlen, "Color map optimization for 3D reconstruction with consumer depth cameras," ACM Transactions on Graphics, 2014.
- [narayan15] Narayan, Karthik S and Abbeel, Pieter, "Optimized color models for high-quality 3D scanning," Intelligent Robots and Systems (IROS), 2015.
- [zollhofer15] Zollhöfer, Michael and Dai, Angela and Innmann, Matthias and Wu, Chenglei and Stamminger, Marc and Theobalt, Christian and Nießner, Matthias, "Shading-based refinement on volumetric signed distance functions," ACM Transactions on Graphics, 2015.
- Other:
- [maier15] Maier, Robert and Stückler, Jörg and Cremers, Daniel, "Super-resolution keyframe fusion for 3D modeling with high-quality textures," IEEE Conference on 3D Vision (3DV), 2015.
- [turner15] Turner, Eric and Cheng, Peter and Zakhor, Avideh, "Fast, automated, scalable generation of textured 3D models of indoor environments," IEEE Journal of Selected Topics in Signal Processing, 2015.
- [dai16] Dai, Angela and Nießner, Matthias and Zollhöfer, Michael and Izadi, Shahram and Theobalt, Christian, "BundleFusion: Real-time Globally Consistent 3D Reconstruction using On-the-fly Surface Re-integration," arXiv preprint arXiv:1604.01093, 2016.
- [jeon16] Jeon, Junho and Jung, Yeongyu and Kim, Haejoon and Lee, Seungyong, "Texture map generation for 3D reconstructed scenes," The Visual Computer, 2016.
- [wasemuller16] Wasenmüller, Oliver and Meyer, Marcel and Stricker, Didier, "CoRBS: Comprehensive RGB-D benchmark for SLAM using Kinect v2," Winter Conference on Applications of Computer Vision (WACV), 2016.
- [engel16] Engel, Jakob and Usenko, Vladyslav and Cremers, Daniel, "A photometrically calibrated benchmark for monocular visual odometry", arXiv preprint arXiv:1607.02555, 2016.