COS 429 - Computer Vision
Fall 2017
Assignment 0: Setup
Nothing to turn in -- but getting this done early will make your life happier later in the course.
September 21st is a good target due date.
1. Getting familiar with Matlab
Princeton has a site license for Matlab, and you should
install it on your own machine. Instructions are
here.
Read through the following for a basic introduction to Matlab:
Work through the following tasks using an image of your choice:
- Read an image into a variable.
Hint #1: "help imread"
Hint #2: Use single quotes around the filename.
Hint #3: Ending a command with a semicolon suppresses printing the result.
- Display the image.
Hint: "help imshow"
- Convert the image to grayscale.
Hint: "help rgb2gray"
Note: if your version of Matlab doesn't have the
rgb2gray function, download rgb2gray.m.
Place this in your working directory, and it should be auto-loaded by Matlab.
- Convert the grayscale image to floating point.
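Hint #1: "help im2double", and be aware of the difference between im2double(img) and double(img): im2double rescales integer images to the range [0.0, 1.0], while double only changes the data type and leaves the values as they are.
Hint #2: imshow can also display floating-point images.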
- Plot the intensities along one row of the grayscale image.
Hint #1: Extracting a part of a matrix is done by
matrix2 = matrix1(row_min:row_max,col_min:col_max);
The indices are inclusive: array(low:high) returns every element from low through high, including both endpoints.
row_max or col_max may also be "end" to
indicate the last element.
Just a ":" is equivalent to "1:end".
Hint #2: BIG WARNING: indices in Matlab are 1-based
(not 0-based as in C or Java).
Hint #3: "help plot"
- Store the width and height of the image in variables "width" and "height".
Hint #1: "help size"
Hint #2: Functions in Matlab may return multiple values. You can
get at the values using the notation
[var1, var2] = func(x)
Hint #3: In Matlab, the number of rows is the first dimension and
the number of columns is the second. In terms of an x,y coordinate scheme,
(row, col) indexing means images have shape [height, width] and can be indexed as image(y, x).
- Write a pair of nested "for" loops to set a grid of every 10th pixel
horizontally and every 20th pixel vertically to 0.
Hint #1: "help for"
Hint #2: "start:increment:stop"
- Create a function "maxrow" that takes a matrix and a row index and returns
the brightest pixel in the given row. Store the function in a file called
"maxrow.m" so that Matlab loads it automatically when you call the function.
Hint #1: "help function"
Hint #2: "help max". Matlab has many built-in functions
that operate on entire vectors or matrices, and using those is usually
much, much more efficient than writing a "for" loop.
- Flip an image vertically. Then show the original and the flipped image side-by-side.
Hint: "help subplot"
- Write the modified image back to a new file.
Hint #1: "help imwrite"
Hint #2: For RGB images, imwrite supports both uint8 and floating-point pixels. For floating-point images, the valid range of values is [0.0, 1.0].
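Several of the hints above (data types, 1-based inclusive indexing, size order, strided loops, flipping, and writing) can be seen together in one short script. This is only a rough sketch, not a reference solution: it builds a synthetic gradient instead of reading your own image, it assumes the grid starts at pixel 1 (starting at 10 and 20 is an equally reasonable reading), and the output file name is made up for illustration. The display calls are left as comments so the script also runs without a screen.

```matlab
% Synthetic 60x80 grayscale gradient, so the sketch needs no image file.
height = 60;
width = 80;
gray8 = uint8(repmat(linspace(0, 255, width), height, 1));  % class uint8

% double() keeps the 0..255 values; im2double() rescales uint8 to [0, 1].
as_double = double(gray8);
gray = im2double(gray8);

% size returns the number of rows (height) first, then columns (width).
[height, width] = size(gray);

% One full row: ":" means "1:end". Indices are 1-based and inclusive.
row = gray(10, :);
% plot(row);   % uncomment to see the intensity profile of row 10

% Grid: every 20th pixel vertically (rows), every 10th horizontally (columns).
% Starting at index 1 is an assumption of this sketch.
gridded = gray;
for y = 1:20:height
    for x = 1:10:width
        gridded(y, x) = 0;
    end
end

% Vertical flip by reversing the row order (flipud gives the same result
% for a 2-D matrix).
flipped = gridded(end:-1:1, :);
% Side by side:
% subplot(1, 2, 1); imshow(gridded); subplot(1, 2, 2); imshow(flipped);

% Floating-point data passed to imwrite must lie in [0, 1].
out_file = fullfile(tempdir, 'flipped_example.png');  % hypothetical file name
imwrite(flipped, out_file);
```

With a real photo you would replace the synthetic gradient with imread followed by rgb2gray; the remaining steps stay the same.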
If you get stuck on any of these, ask for help on Piazza.
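For the "maxrow" task, one possible shape of the function file is sketched below. It assumes a 2-D (grayscale) matrix and uses the built-in max on a whole row rather than a "for" loop; the argument and output names are illustrative choices, not something the assignment prescribes.

```matlab
% Contents of maxrow.m (the file name must match the function name).
function brightest = maxrow(matrix, row)
% MAXROW  Return the largest value found in one row of a 2-D matrix.
    % matrix(row, :) extracts the entire row; max reduces it to one value.
    brightest = max(matrix(row, :));
end
```

Once the file is in your working directory, a call such as maxrow([1 5 3; 9 2 4], 1) returns 5, and the same call with row index 2 returns 9.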
2. Getting MatConvNet up and running
In the later parts of the course, we will be studying deep learning applied to computer vision, specifically looking at Convolutional Neural Networks. There are several open-source Convolutional Neural Network packages
available, including TensorFlow, Torch, Caffe, Theano, and MatConvNet. Of all these, we chose to use MatConvNet for the course because it is (1) the easiest to install, (2) the easiest to understand and (3) the easiest to make simple modifications to. That being said, the models and algorithms are still fairly complex, so the codebase can be time-consuming to set up. It is therefore a good idea to get started early and make sure you're ready to go once the deep learning assignments roll around.
- Follow the MatConvNet installation instructions. Feel free to skip the DAG models section. In this class we will be training only smaller networks, for which the CPU-only implementation should be sufficient. If you happen to have an NVIDIA GPU in
your machine and have the CUDA development libraries installed, feel free to set up GPU training as well. When done, you should see:
- Run the classification network on other images. Where does it work surprisingly well? Where does it make mistakes?
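When looking for mistakes, it can help to print the few highest-scoring classes instead of only the single best one, since a wrong top answer is often followed by the right one. The sketch below shows that ranking step in plain Matlab. The variables scores and labels are placeholders invented for illustration; how you obtain the real score vector and class names depends on the MatConvNet example you ran.

```matlab
% Placeholder data standing in for real network output (an assumption of
% this sketch, not actual MatConvNet results).
scores = [0.05 0.60 0.10 0.25];
labels = {'cat', 'bell pepper', 'dog', 'cucumber'};

% Rank the classes from highest to lowest score and print the top k.
k = 3;
[sorted_scores, order] = sort(scores, 'descend');
for i = 1:k
    fprintf('%d. %s (%.2f)\n', i, labels{order(i)}, sorted_scores(i));
end
```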
Last update: 23-Jan-2018 10:16:48