COS 429 - Computer Vision
Fall 2005
Note that Thursday Nov. 24 and Friday Nov. 25 will not be counted as late days for this assignment. Submitting on Saturday will count as two days late.
(Note: the previous part (a) was a bad question, so it has been removed from the assignment.)
For this question, you'll evaluate the use of the RGB vs. HSV color spaces for skin color detection. To do this, start by looking back at your database of faces from Assignment 2. For each image, crop it tightly around the face and convert it to HSV using Matlab's rgb2hsv function (assume that the image gamma is 2.2, so you need to convert RGB to the 0..1 range, then raise the values to the power 2.2 to linearize them). Assemble all the pixels from all the images into 6 vectors, one for each of R, G, B, H, S, and V (hint: reshape). Plot the histogram for each of the 6 color channels (hint: use the hist command, but you probably want more bins than the default of 10). Comment on what you see, and on the implications for skin color detection and tracking. Please include the histogram plots in your writeup.
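Here is a minimal sketch of this step, assuming the cropped face images are named face01.jpg, face02.jpg, etc. and that nfaces holds the number of images (both names are hypothetical):

    R = []; G = []; B = []; H = []; S = []; V = [];
    for i = 1:nfaces
        rgb = double(imread(sprintf('face%02d.jpg', i))) / 255;  % scale to 0..1 (assuming 8-bit images)
        rgb = rgb .^ 2.2;                      % undo the assumed gamma of 2.2
        hsv = rgb2hsv(rgb);
        R = [R; reshape(rgb(:,:,1), [], 1)];   % flatten each channel into a column vector
        G = [G; reshape(rgb(:,:,2), [], 1)];
        B = [B; reshape(rgb(:,:,3), [], 1)];
        H = [H; reshape(hsv(:,:,1), [], 1)];
        S = [S; reshape(hsv(:,:,2), [], 1)];
        V = [V; reshape(hsv(:,:,3), [], 1)];
    end
    channels = {R, G, B, H, S, V};
    names = {'R', 'G', 'B', 'H', 'S', 'V'};
    for c = 1:6
        subplot(2, 3, c);
        hist(channels{c}, 64);                 % 64 bins rather than the default 10
        title(names{c});
    end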
The remainder of the assignment deals with implementing a blob tracker for objects in video sequences. The first cue you will investigate is color histograms. Start by reading in the video frames, e.g.
    img = imread(sprintf('img%02d.jpg', i));   % frames named img00.jpg, img01.jpg, ...
(Note: for the color-histogram cue, simply using the histogram value looked up for each pixel's hue is good enough - you don't need to use full Bayes' rule to evaluate a probability.)
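As a concrete sketch of this cue: build a normalized hue histogram from the object region in the first frame, then back-project it onto each later frame so that every pixel gets the histogram value of its hue bin. Here objmask (a hypothetical name) is assumed to be a logical mask of the object in frame1:

    nbins = 32;
    hsv1  = rgb2hsv(double(frame1) / 255);
    hue1  = hsv1(:,:,1);
    hhist = hist(hue1(objmask), (0.5:nbins) / nbins);  % bin centers on 0..1
    hhist = hhist / sum(hhist);                        % normalize to sum to 1

    hsv  = rgb2hsv(double(frame) / 255);               % a later frame
    bins = min(floor(hsv(:,:,1) * nbins) + 1, nbins);  % hue bin index, 1..nbins
    cue  = hhist(bins);                                % per-pixel histogram lookup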
The next thing to implement is background subtraction: for each frame, find the absolute color difference between the frame and the background. For some of the datasets the background frame will be given, while for others you should determine the background from the video frames themselves by taking the per-pixel median of all the frames.
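A minimal sketch, assuming the frames have been stacked into an h-by-w-by-3-by-nframes double array called frames (hypothetical layout):

    bg   = median(frames, 4);                  % per-pixel median over all frames
    diff = sum(abs(frames(:,:,:,i) - bg), 3);  % absolute color difference for frame i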
Allow the user to specify any number of objects to be tracked in the first frame, and, as shown in class, propagate their locations forward through the video using expectation maximization and an anisotropic Gaussian mixture model (details in the lecture notes). You should be able to use either the color-histogram cue or the background-subtraction cue above, or a combination of both cues, obtained by multiplying their outputs together.
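The following is a minimal sketch of one EM-style update for a single blob, assuming cue is the per-pixel weight image from the steps above and (mu, C) are the blob's current 1-by-2 mean and 2-by-2 covariance; with several blobs you would additionally weight each pixel by its responsibility under each blob before these updates:

    [h, w] = size(cue);
    [X, Y] = meshgrid(1:w, 1:h);
    P = [X(:), Y(:)];                        % pixel coordinates, n-by-2
    d = P - repmat(mu, size(P, 1), 1);
    g = exp(-0.5 * sum((d / C) .* d, 2));    % unnormalized Gaussian at each pixel
    r = cue(:) .* g;                         % E-step: combine cue and current Gaussian
    r = r / sum(r);
    mu = r' * P;                             % M-step: weighted mean ...
    d  = P - repmat(mu, size(P, 1), 1);
    C  = d' * (d .* repmat(r, 1, 2));        % ... and weighted (anisotropic) covariance

Iterating these updates a few times per frame, starting from the blob's location in the previous frame, is usually enough for convergence.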
Implement simple prediction by keeping track of the velocity of each blob on each frame, and updating the velocity using a rolling average. Show that this allows you to track objects more robustly than without prediction (if necessary, skip frames in the datasets, or construct synthetic datasets).
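One way to do this, as a sketch: keep a velocity estimate per blob, blend in the most recent displacement with a rolling-average weight alpha (the value below is an assumption), and start EM on the next frame at the predicted position:

    alpha = 0.7;                                       % rolling-average weight (assumed)
    vel = alpha * vel + (1 - alpha) * (mu - mu_prev);  % mu, mu_prev: blob means on consecutive frames
    mu_pred = mu + vel;                                % initial guess for the next frame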
Show sample outputs of each stage of the tracking, as well as at least one complete tracked sequence (where you have marked each object at each frame by overlaying a box on the video frame - make sure to use a different color for each object being tracked). If you want, you can draw nice ellipses using the provided .m file.
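For the overlays, a minimal sketch, assuming x, y, w, h hold each object's box and colors is a cell array such as {'r', 'g', 'b'} with one entry per object (all hypothetical names):

    imshow(frame); hold on;
    for k = 1:nobjects
        rectangle('Position', [x(k) y(k) w(k) h(k)], ...
                  'EdgeColor', colors{k}, 'LineWidth', 2);
    end
    hold off;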
Here are a few datasets. Stay tuned for more...