COS 429 - Computer Vision

Fall 2017

Assignment 1: Image processing and feature detection

Due Thursday, Oct. 5

1. Pinhole camera model (5 pts)

Consider a pinhole camera with focal length \( f \). The camera faces a whiteboard whose plane is parallel to the image plane, at a distance \( L \) from the pinhole. A square of area \( S \) is drawn on the whiteboard. What is the area of the square in the image? Justify your answer.

2. Linear filters (20 pts)

You are expected to do this question by hand. Show all steps for full credit.

In class we introduced 2D discrete space convolution. Consider an input image \( I[i,j] \) with an \(m \times n \) filter \(F[i,j]\). The 2D convolution \(I * F\) is defined as: \[ (I*F)[i,j] = \sum_{k,l}I[i-k,j-l]F[k,l] \] Note that the above operation is run for each pixel \((i,j)\) of the result.
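
To check hand calculations, the definition above can be written out as explicit loops. The following is a minimal sketch, assuming zero-padding (so the result has the "full" size) and Matlab's 1-based indexing. The matrices are arbitrary example values, not the ones used in this question.

```matlab
% Arbitrary example inputs (not the matrices from the question).
I = [1 2; 3 4];
F = [0 1; -1 2];

[ih, iw] = size(I);
[fh, fw] = size(F);
out = zeros(ih + fh - 1, iw + fw - 1);   % "full" output size under zero-padding

% Each product I(a,b)*F(k,l) contributes to output pixel (a+k-1, b+l-1),
% which is the sum in the definition rewritten for 1-based indices.
for a = 1:ih
    for b = 1:iw
        for k = 1:fh
            for l = 1:fw
                out(a+k-1, b+l-1) = out(a+k-1, b+l-1) + I(a,b) * F(k,l);
            end
        end
    end
end

% conv2 defaults to the 'full' shape, so the two results should agree.
matchesBuiltin = isequal(out, conv2(I, F));
```

The loops are deliberately unoptimized; they exist only to mirror the formula term by term.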

- Convolve the \(2 \times 3\) matrix \(I = [-1, 0, 2; 1, -2, 1] \) with the \(2 \times 2\) matrix \(F = [-1, -1; 1, 1] \). Use zero-padding when necessary.
- Note that \(F\) is separable, i.e., it can be written as a product of two 1D filters: \(F_1 = [-1; 1]\) and \(F_2 = [1, 1]\). Compute \((I*F_1)\) and \((I * F_1) * F_2\), i.e., first perform 1D convolution on each column, followed by another 1D convolution on each row.
- Prove that for any separable filter \(F = F_1F_2\): \[I*F = (I*F_1)*F_2\] Hint: expand the 2D convolution equation directly.
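
A numerical check is not a proof, but it can catch arithmetic slips. The sketch below compares one 2D convolution against two 1D passes for an arbitrary separable filter; the data and the variable names are illustrative assumptions, not part of the framework code.

```matlab
% Arbitrary example data (not the matrices from the question).
I  = magic(4);
F1 = [1; 2; 1];      % column filter (3x1)
F2 = [-1, 0, 1];     % row filter (1x3)
F  = F1 * F2;        % outer product: the full 3x3 separable filter

full2d    = conv2(I, F);                % one 2D convolution
twoPasses = conv2(conv2(I, F1), F2);    % columns first, then rows
oneCall   = conv2(F1, F2, I);           % conv2's built-in separable form

% Largest absolute disagreement; expected to be zero up to rounding.
maxDiff = max(abs(full2d(:) - twoPasses(:)));
```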

3. Difference-of-Gaussian (DoG) detector (25 pts)

- Recall that a 1D Gaussian is: \[g_{\sigma}(x) = \frac{1}{\sqrt{2\pi}\sigma}\exp \left (-\frac{x^2}{2\sigma^2} \right ) \] Calculate the 2nd derivative of the 1D Gaussian with respect to \(x\) and use Matlab to plot it (use \(\sigma=1\)). Submit all steps of your derivation and the generated plot.
- Use Matlab to plot the difference of Gaussians in 1D given by \[D(x,\sigma,k) = \frac{g_{k\sigma}(x)-g_{\sigma}(x)}{k\sigma-\sigma}\] using \(k\) = 1.2, 1.4, 1.6, 1.8, 2.0. State which value of \(k\) gives the best approximation to the 2nd derivative with respect to \(x\). Assume \(\sigma=1\). Submit both the answer and your code. You may paste your code as a "code block" into the pdf, or refer to included .m functions.
  - Anonymous functions in Matlab are cool, particularly combined with the .* and .^ operators, which let them operate on vectors. 2 pts extra credit for using one or more vectorized anonymous functions in your code.
- The 2D equivalents of the plot above are rotationally symmetric. To what type of image structure will the difference of Gaussians respond maximally?
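
As an illustration of the vectorized anonymous functions mentioned above, here is a minimal sketch that evaluates and plots the Gaussian itself. The sampling grid and the names `g`, `x`, and `areaUnderCurve` are assumptions chosen for the example; the derivative and the difference of Gaussians are left to you.

```matlab
sigma = 1;

% Vectorized anonymous function: .^ and ./ apply element by element,
% so x may be a scalar or a whole vector of sample positions.
g = @(x, s) exp(-(x.^2) ./ (2 * s.^2)) ./ (sqrt(2*pi) * s);

x = linspace(-5, 5, 1001);   % sample grid (an arbitrary choice)
y = g(x, sigma);

% Sanity check: a normalized Gaussian should integrate to about 1.
areaUnderCurve = trapz(x, y);

plot(x, y);
xlabel('x');
ylabel('g_\sigma(x)');
```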

4. Canny edge detector (50 pts)

Background: See the lecture slides, Sections 4.1-4.3 of Trucco & Verri, and Chapter 4 of your textbook. Make sure you've completed at least the first part of Assignment 0 before beginning this question.
Hint #1: Take advantage of the fact that you're working with visual data, and visualize every step of your work. Try "help imagesc" for visualizing gradient images.
Hint #2: Start by working with small images -- for example, by cropping out a 50x50-pixel part of a larger image.

Implement the Canny edge detection algorithm, as described in class. The framework code you should start from is here. This consists of several phases:
- Filtered gradient:
  - Load an image
  - Compute the x and y gradients, Fx and Fy, of the image smoothed with a Gaussian with a user-supplied width sigma.
  - Compute the edge strength F (the magnitude of the gradient) and edge orientation D = arctan(Fy/Fx) at each pixel.
  Hint #1: Make sure your image is grayscale and floating point (see Assignment 0).
  Hint #2: Recall that a 2D Gaussian is separable, and that finding the derivative of a function convolved with a Gaussian is the same as convolving with the derivative of a Gaussian. You should use these facts in your code.
  Hint #3: "help conv2", especially the syntax for independently convolving rows and columns.
- Nonmaximum suppression:
  Create a "thinned edge image" I(x,y) as follows:
  1. For each pixel, find the direction D* in (0, 45, 90, 135) that is closest to the orientation D at that pixel. (Or the equivalent in radians...)
  2. If the edge strength F(x,y) is smaller than at least one of its neighbors along D*, set I(x,y) = 0, else set I(x,y) = F(x,y).
- Hysteresis thresholding:
  Repeatedly do the following:
  1. Locate the next unvisited pixel (x,y) such that I(x,y) > \(T_h\).
  2. Starting from (x,y), follow the chain of connected local maxima, in both directions, as long as I(x,y) > \(T_l\).
  3. Mark each pixel as it is visited.
- Edge image:
  Create an image with all edge pixels marked in white, and all non-edges in black.
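
The filtered-gradient phase can be sketched on a synthetic image as follows. This is only an illustration under stated assumptions, not the framework code: the kernel radius of 3 sigma, the test image, and all variable names are choices made for this example, and `atan2` is substituted for arctan(Fy/Fx) to avoid dividing by zero where Fx = 0.

```matlab
sigma = 1.5;
r  = ceil(3 * sigma);              % kernel radius (assumed: 3 sigma)
x  = -r:r;
g  = exp(-x.^2 / (2 * sigma^2));
g  = g / sum(g);                   % normalized 1D Gaussian
dg = -x .* g / sigma^2;            % derivative of that Gaussian

% Synthetic 50x50 test image: a vertical step edge between columns 25 and 26.
img = zeros(50, 50);
img(:, 26:end) = 1;

% conv2(u, v, A) convolves each column of A with u, then each row with v.
Fx = conv2(g, dg, img, 'same');    % smooth along y, differentiate along x
Fy = conv2(dg, g, img, 'same');    % differentiate along y, smooth along x

F = sqrt(Fx.^2 + Fy.^2);           % edge strength
D = atan2(Fy, Fx);                 % edge orientation in radians

% Fold orientation into [0, pi) and snap to the nearest of 0, 45, 90, 135 degrees.
Dstar = mod(round(mod(D, pi) / (pi/4)), 4) * 45;
```

Because `conv2` zero-pads, the image border itself shows up as an edge; visualizing F with imagesc makes this easy to see.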
Test your algorithm on images of your choosing, experimenting with different values of the parameters sigma (the width of the Gaussian used for smoothing), \(T_h\) (the "high" threshold), and \(T_l\) (the "low" threshold). Also run your algorithm on the following images:
- mandrill.jpg: Try different parameter values.
- csbldg.jpg: Try to find values that will find just the outline of the building, and others that will find edges between individual bricks.

Submitting

This assignment is due Thursday, October 5, 2017 at 11:59 PM. Please see the general notes on submitting your assignments, as well as the late policy and the collaboration policy.

Please submit a single .zip file containing:

- A README.pdf containing answers to all written questions, a description of your experiments with different parameters, and any relevant implementation notes. Feel free to either (1) use LaTeX to type up the math or (2) write it up by hand and paste clearly readable picture(s) into the pdf.
- Your edge detection code. This should be the six provided .m files, where all TODOs have been filled in with your implementations.

The Dropbox link to submit the assignment is here.

Note that programming in Matlab is not an excuse to write unreadable code. You are expected to use good programming style, including meaningful variable names, a comment or three describing what the code is doing, etc. Also, all images created by your code must be saved with the "imwrite" function - do not submit screen captures of the image window.
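
For example, a binary edge map can be written to disk roughly as follows. This is a minimal sketch: the toy edge map and the file name `edges_example.png` are placeholders, not names required by the assignment.

```matlab
% Toy edge map: a single horizontal edge across a 50x50 image.
edges = false(50, 50);
edges(25, :) = true;

% Convert to 8-bit so edges are white (255) and non-edges are black (0).
edgeImage = uint8(edges) * 255;

% File name is a placeholder chosen for this example.
imwrite(edgeImage, 'edges_example.png');
```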

Credit to Fei-Fei Li and Juan Carlos Niebles for several problems.

Last update 23-Jan-2018 10:17:14