Programming Assignment 3: Shape2Pose
Due on Wed Nov 30 at 11:59PM
Overview
In this assignment you will implement an algorithm for predicting the
pose a human might assume when interacting with a given object. The
input to your algorithm will be: 1) a 3D surface mesh, 2) an
articulated body, and (optionally) 3) a set of mesh-pose pairs to use
as training examples. The output will be a set of predicted poses
(represented by a set of joint pivot angles) that the
articulated figure might take when interacting with the surface.
Ideally, the predicted poses (red) output by your program match the
provided ground truth poses with small error.
The following is a list of
features that you may implement -- the features in bold face are
required. You may choose to implement any of the others at your discretion, but they are not required.
Templates for all functions are in render.cpp unless otherwise noted.
In addition to implementing these features, you should submit images
generated by your program to the art contest. The winner will get
extra credit and a note on the course web page.
Analyzing the Mesh:
- Compute local surface properties: Rewrite CreateMeshProperties to compute a set of local properties for each vertex of the mesh that will be helpful for discriminating which joints should be in contact with it. Your property set can include "angle between the normal vector and the up direction," which is already implemented for you (note: you can assume that the positive Z axis is "up" and that the ground is at Z=0). It should also include at least 3 other properties of your own choice, amongst ones we discussed in class or ones of your own design. For each property, please explain in your writeup why you chose it and why you think it should discriminate mesh vertices in contact with different joint types. Please include an image of each property in your writeup (obtained by hitting 'v' to display properties and then using the up/down arrow keys to select which property is shown and left/right arrow keys to adjust the range of displayed values).
- Detect global planar symmetries: Rewrite CreateMeshSymmetries to compute a (set of) symmetry planes for the mesh. The symmetry planes should have the property that reflecting the mesh across the plane maps the surface onto itself. In implementing this feature, you can leverage the fact that all perfect complete symmetries contain the R3Mesh::Centroid and have a normal aligned with one of the R3Mesh::PrincipleAxes. Testing for mapping of the surface onto itself can be achieved by reflecting vertices and checking for distance to the closest surface point with the R3MeshSearchTree::ClosestPoint. Note that not all meshes are globally symmetric (e.g., the car interiors). Your function should not add any symmetry planes in those cases. Your writeup should include images of symmetry planes detected by your algorithm (which you can create by hitting 's' during interactive viewing or by including "-show_mesh_symmetries" on the command line).
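The reflect-and-test idea behind symmetry detection can be sketched as follows. This is a minimal, self-contained illustration under stated assumptions, not the required implementation: `Vec3`, `Plane`, `Reflect`, `NearestVertexDistance`, and `IsSymmetryPlane` are hypothetical names that do not come from the support code, and a brute-force nearest-vertex search stands in for a closest-surface-point query such as R3MeshSearchTree::ClosestPoint.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

// Hypothetical stand-in types; the assignment's own geometry classes are not used here.
struct Vec3 { double x, y, z; };

static Vec3 operator-(const Vec3& a, const Vec3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
static Vec3 operator*(double s, const Vec3& a) { return {s * a.x, s * a.y, s * a.z}; }
static double Dot(const Vec3& a, const Vec3& b) { return a.x * b.x + a.y * b.y + a.z * b.z; }

// Plane through `point` with unit-length `normal`.
struct Plane { Vec3 point; Vec3 normal; };

// Mirror p across the plane: subtract twice its signed distance along the normal.
Vec3 Reflect(const Vec3& p, const Plane& plane) {
  double signed_distance = Dot(p - plane.point, plane.normal);
  return p - (2.0 * signed_distance) * plane.normal;
}

// Brute-force O(n) distance from q to the nearest vertex.
// Assumption: this approximates a closest-surface-point query.
double NearestVertexDistance(const std::vector<Vec3>& vertices, const Vec3& q) {
  double best = std::numeric_limits<double>::infinity();
  for (const Vec3& v : vertices) {
    Vec3 d = v - q;
    best = std::min(best, std::sqrt(Dot(d, d)));
  }
  return best;
}

// A plane is accepted only if every reflected vertex lands within
// `tolerance` of some vertex of the same mesh.
bool IsSymmetryPlane(const std::vector<Vec3>& vertices, const Plane& plane, double tolerance) {
  assert(tolerance >= 0.0);
  for (const Vec3& v : vertices) {
    if (NearestVertexDistance(vertices, Reflect(v, plane)) > tolerance) return false;
  }
  return true;
}
```

In this sketch, candidate planes would pass through the mesh centroid with normals along each principal axis, and a plane would be kept only if the test passes. Because the test compares against vertices rather than the surface, a mesh whose tessellation is not itself symmetric would likely need a looser tolerance or a true surface query.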
Evaluating the Cost of a Pose:
- Contact distance cost: Implement a function called ContactDistanceCost that returns the square root of the sum of squared distances between the positions of a pose's joints and their corresponding contact points on the mesh. This cost term penalizes poses where end effector joints are far from their contact points on the mesh surface.
- Contact compatibility cost: Implement a function called ContactCompatibilityCost that returns the sum of values computed for each mesh vertex based on how joint(s) of the given pose are in contact with it. The function should consider how likely the mesh properties at the vertex are at contacts with the particular joint type and/or it should penalize cases where mesh vertices are not contacted by any joint. You can implement a "hand-tuned" function (e.g., vertices with ZNormal=-1.0 should not be in contact with feet). You may also learn a function for each joint type based on statistics of mesh properties observed in contacts of a training set using any stats/ML toolkit you like.
- Pivot angle cost: Implement a function called PivotAngleCost that computes a cost for each pivot angle of a human pose. The value of the function should be small for typical human poses, larger for less likely ones, and RN_INFINITY for infeasible ones (e.g., a pivot angle is less than the AFAngle::Minimum()). You can start with a "hand-tuned" function. However, you should ultimately base your function on statistics of pivot angles observed in a training set. You can split the given input data into separate training and test sets (for each object category separately), and then gather a distribution of observed pivot angles for each joint by analyzing the training set. Your function could then be based on the probability distribution of angles observed (remembering that lower costs represent higher probabilities).
- Joint symmetry cost: Implement a function called JointSymmetryCost that penalizes poses where pairs of symmetric joints (e.g., right_shoulder and left_shoulder) have positions inconsistent with other pairs of symmetric joints (e.g., the vector between the shoulders is in direction (1,0,0) and the vector between the hips is in direction (0,-1,0)). Pairs of symmetric joints can be accessed with AFBody::Symmetries.
- Mesh symmetry cost: Implement a function called MeshSymmetryCost that penalizes poses where pairs of symmetric joints have AFPose::JointPosition(s) that are not symmetric with respect to the AFScene::MeshSymmetryPlane(s).
- Mesh intersection cost: Implement a function called MeshIntersectionCost that penalizes intersections of a pose with the surface of the mesh.
- Inertia cost: Implement a function called InertiaCost that penalizes larger changes in the pose parameters (root position, root orientation, and pivot angles) versus the previous parameter values. This function will be useful for iterative pose refinement (described later).
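Two of the cost terms above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the expected solution: the signatures, the `Vec3` and `AngleHistogram` types, and the add-one smoothing are hypothetical choices, and `std::numeric_limits<double>::infinity()` stands in for RN_INFINITY. The pivot angle term uses the negative log of a histogram probability, so that frequently observed angles get low cost.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for a 3D position.
struct Vec3 { double x, y, z; };

// Square root of the sum of squared joint-to-contact distances.
// joints[i] is assumed to correspond to contacts[i].
double ContactDistanceCost(const std::vector<Vec3>& joints, const std::vector<Vec3>& contacts) {
  assert(joints.size() == contacts.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < joints.size(); ++i) {
    double dx = joints[i].x - contacts[i].x;
    double dy = joints[i].y - contacts[i].y;
    double dz = joints[i].z - contacts[i].z;
    sum += dx * dx + dy * dy + dz * dz;
  }
  return std::sqrt(sum);
}

// Counts of training-set angles for one pivot, binned uniformly
// over the feasible range [min_angle, max_angle].
struct AngleHistogram {
  double min_angle;
  double max_angle;
  std::vector<double> counts;
};

// Negative log probability of the bin containing `angle`;
// infinite for angles outside the feasible range.
double PivotAngleCost(const AngleHistogram& h, double angle) {
  if (angle < h.min_angle || angle > h.max_angle) {
    return std::numeric_limits<double>::infinity();
  }
  std::size_t nbins = h.counts.size();
  assert(nbins > 0 && h.max_angle > h.min_angle);
  double t = (angle - h.min_angle) / (h.max_angle - h.min_angle);
  std::size_t bin = std::min(static_cast<std::size_t>(t * nbins), nbins - 1);
  double total = 0.0;
  for (double c : h.counts) total += c;
  // Add-one smoothing (an assumption): feasible but unobserved bins
  // get a large finite cost instead of an infinite one.
  double probability = (h.counts[bin] + 1.0) / (total + static_cast<double>(nbins));
  return -std::log(probability);
}
```

The overall pose cost would then be some weighted combination of such terms; the weights are another design choice to tune against the test set.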
Searching for Poses with Minimal Cost:
- Refine predicted poses: Implement an iterative algorithm that optimizes a given set of poses with a goal of minimizing the cost function. The function should take small iterative steps in a direction of lower cost until a local minimum is found.
- Search with fixed pivot angles: Implement an algorithm that searches for the pose(s) with least cost using a discrete set of candidate poses (e.g., selected from the ground truth of other inputs). For each candidate pose, search for the root position and orientation that minimizes the overall cost without changing the pivot angles. Keep only the pose(s) with least overall cost.
- Search contacts, infer poses: Implement an algorithm that searches for the pose(s) with least cost by generating candidate sets of joint-mesh contacts. For each set of contacts, solve for the pose that minimizes overall cost while maintaining those contacts. Keep only the pose(s) with least overall cost.
- Search jointly: Implement an algorithm that searches for contacts and poses together, possibly by alternating between selecting one contact, then one pivot angle, and so on.
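One simple way to take small steps toward lower cost is coordinate-wise local search, sketched below. This is only one possible scheme (gradient descent with finite differences would be another) and is not prescribed by the assignment. It assumes the pose has been flattened into a vector of parameters (root position, root orientation, pivot angles); `Params`, `CostFunction`, and `RefinePose` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical flattened pose parameters and cost callback.
using Params = std::vector<double>;
using CostFunction = std::function<double(const Params&)>;

// Try +step and -step on each parameter in turn and keep any change that
// lowers the cost. When a full sweep finds no improvement, halve the step.
// Stop when the step falls below min_step or max_iterations is reached.
Params RefinePose(Params params, const CostFunction& cost,
                  double step, double min_step, int max_iterations) {
  assert(step > 0.0 && min_step > 0.0);
  double best = cost(params);
  for (int iter = 0; iter < max_iterations && step >= min_step; ++iter) {
    bool improved = false;
    for (std::size_t i = 0; i < params.size(); ++i) {
      const double deltas[2] = {step, -step};
      for (double delta : deltas) {
        Params candidate = params;
        candidate[i] += delta;
        double c = cost(candidate);
        if (c < best) {  // An infinite (infeasible) cost never passes this test.
          best = c;
          params = candidate;
          improved = true;
        }
      }
    }
    if (!improved) step *= 0.5;
  }
  return params;
}
```

The same routine could serve the fixed-pivot search by passing only the root position and orientation as parameters and holding the pivot angles constant inside the cost callback. Like any local method, it returns a local minimum that depends on the starting pose.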
To get started, you can use the code in cos526_assn3.zip. This C++ code provides
the basic infrastructure for reading scenes, mapping images to polygons,
etc. It also provides a simple program (shape2pose) for
viewing image configurations and making surface texture images.
You will probably need to augment this
program to include command line arguments of your own to turn on and
off specific features and/or provide parameters for specific
applications.
The skeleton code is able to read meshes, mesh properties, body
skeletons, and body poses in ASCII file formats provided in the input
directory. You should not be concerned with the file formats, since
the support code handles all the reading and writing of them.
However, if you want to learn more about the raw data, please refer to
Vladimir Kim's web page here.
The input directory of the zip file provides test data for
several object categories (bikes, chairs, etc.) and dozens of examples.
For each input example E of category C, there is a mesh in input/meshes/C/E.off,
a ground truth pose in input/ground_truth_poses/C/E.pose, and example
predictions made by Vladimir (Vova) Kim in input/vovas_predicted_poses/C/E.pose.
The scripts directory of the zip file provides simple BASH scripts
to run your program in batch mode on all the provided examples. "makeposes"
will run shape2pose with parameters to create estimated poses and write them
to output/C/E.pose. "makeimages" will read the predicted poses and write
images of them to output/C/E.jpg (popping up a window for every example).
"makeerrors" will read the log files produced by "makeposes" and concatenate
the errors into a single table in output/errors.txt.
What to Submit
You should submit one zip file named
programming_assignment3.zip
with the following internal directory structure
to CS dropbox (note that you do not need to include the provided input data):
cos526_assn3/
  writeup.html (your writeup, see the description below)
  output/ (all the output images for the examples in your writeup)
  art/ (all images submitted for the art contest)
  src/ (the complete source code after "make clean")
writeup.html should be an HTML document demonstrating
the effects of the features you have implemented. There should be one
"section" per feature with a brief description of what you
implemented and some images showing your results with a description of
the command/process used to create the results in the caption.
Wherever possible, you should show numerical results for a test set of inputs,
demonstrating the difference made by your algorithmic choice as compared
to a simpler alternative.
The src directory should have all code required to
compile and link your program (including the files provided with the
assignment), along with a Makefile to rebuild the code.
Please DO NOT submit the provided input data as part of your zip file.
Output images should be in JPEG format to save space. Also, to
further save space, please remove binaries and backup files from the
src directory (i.e., run make clean) before submitting.
Please see the course's webpage with
submission instructions
for more details.
Useful resources
- Papers:
- [kim14] Shape2Pose: Human-Centric Shape Analysis, Vladimir Kim, Siddhartha Chaudhuri, Leonidas Guibas, and Thomas Funkhouser, SIGGRAPH 2014.
- Shape2Pose Code and Data:
- Project Web Page = Web page with links to slides, supplemental material, etc.
- Code = original code written by Vladimir Kim for the Shape2Pose paper. You are welcome to look at this code, but do not copy from it.
- Data = original data provided by Vladimir Kim for the Shape2Pose paper. This web page has more data than provided in the input folder of your assignment, plus it has documentation about the file formats.
- Our Software Infrastructure:
- GAPS = Github repository for the code infrastructure, including source code for many example programs like msh2prp.