This is the last lab. Have fun.
Privacy is much in the news of late, with concerns ranging from identity theft through government surveillance to commercial exploitation of information about our purchases, our interests, our activities, our friends, and everything else. This lab will explore some issues of privacy and access to information.
This is a comparatively open-ended lab, so you may well find ambiguities and fuzzy bits. Don't worry about them, since this is meant to be for exploration rather than precise answers. But feel free to make suggestions so we can improve the lab for next time.
This lab is intended to be more than a Google and Wikipedia exercise; you must cast your net more widely by using other search engines and other information sources. You will be graded partly on how well you do this, so tell us, for each item, which tools you used and how well they worked. Among the alternative search engines you might try are Yahoo, Microsoft Bing, Ask, Yippy, and Mooter. There are also sites that do telephone number lookup or that maintain public records; financial sites like Yahoo Finance and Google Finance provide access to holdings and insider trading; and of course social networks like Facebook and Twitter can reveal a huge amount about their users. Explore; that's part of the exercise. You might find it more productive to spread the lab over a couple of days so you have time to think about possibilities.
As you go along, we want you to collect your observations and comments in a Word document. You must use this template, lab8.doc, so we have some uniformity among the submissions. Please download this file now and begin to edit it. In the following, when we ask you to "report", we're looking for a reasonably organized but not too long description. The questions in the text are meant to start you thinking, but need not be answered literally. We're not going to grade your writing, but you'll leave a better impression if there aren't too many spelling mistakes, flagrant grammar errors, random formatting, and so on. It's ok to summarize with lists rather than complete sentences, but do try to distill the essence of what you've seen rather than just cutting and pasting.
You can do this lab anywhere. Some threats primarily affect PCs running Windows, but all users have to be suspicious about most things all the time no matter what system they are running.
For this section, you should use at least three search engines, not just Google alone.
Sometimes people state strong opinions forcefully, and the record lives on forever.
How much can you learn about someone just by searching online information? For yourself or members of your family or someone else close to you, see how much you can discover about that person online. Examples of the kind of information you might look for include home address, telephone number, age, birthday, education, employment, political contributions, sports and hobbies, organizations and memberships, price of their home, names of other family members (the classic mother's maiden name, for example), activities and interests. Can you find a picture? Was it one that you knew about?
Do you get any information by searching for a phone number or street address or social security number? (Read this Wikipedia article first about why it is a bad idea to search for your own SSN.) Does a phone number or address reveal a family name? Did you find inconsistent information?
Can you find a good picture of your home (or someone else's) with Google Maps or Earth, or Microsoft Maps? Which one of these gives the best image? Can you make out your car or some other possession? How much might the house be worth? See, for example, Zillow. If you visit Zillow, what kinds of addresses does it show you without being asked?
What does your Facebook page reveal about you that you find surprising or worth thinking about?
There's no need to go overboard on this; the goal is definitely not to invade anyone's privacy but to get a sense of the accessibility of ostensibly private information.
As we saw in class, the mere act of visiting a web site reveals some information about you. There are a variety of sites that report back to you about what information your visit reveals, or about what vulnerabilities your system appears to have. Visit some of these and see what they tell you. Here are some useful ones; can you find others like them?
To determine your IP address(es) and other identification info on Windows, open a command-line window with Start / Run / cmd and type ipconfig /all. On a Mac, use System Preferences / Network / Airport / Advanced... / TCP/IP. Are the values consistent with what is seen by the outside?
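One reason the address your machine reports can differ from the one a web site sees is that many networks hand out private addresses and translate them at a router. The following is a minimal Python sketch, not part of the lab itself, that classifies an address using the standard ipaddress module. The helper names local_ip and describe are made up for illustration, and local_ip is only a best-effort guess at the address your system would use for outbound traffic.

```python
import ipaddress
import socket


def local_ip():
    """Best-effort guess at the address used for outbound traffic.

    Connecting a UDP socket sends no packets; it only asks the OS to pick
    a source address. 192.0.2.1 is a reserved documentation address.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        s.connect(("192.0.2.1", 9))
        return s.getsockname()[0]
    except OSError:
        return "127.0.0.1"  # no usable network route was found
    finally:
        s.close()


def describe(addr):
    """Classify an address as loopback, private, or public."""
    ip = ipaddress.ip_address(addr)
    if ip.is_loopback:
        return "loopback"
    if ip.is_private:
        return "private"  # not routable on the public Internet
    return "public"


if __name__ == "__main__":
    addr = local_ip()
    print(addr, "is", describe(addr))
```

If the result is "private", the address that outside sites report for you will almost certainly be a different one, belonging to a router or gateway.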
Make a search for some service or store with several search engines and see how accurately they geo-locate you. Are there major differences in apparent accuracy among Google, Yahoo and Microsoft Bing?
Send mail to yourself at Gmail or similar site. Include in the mail some words or phrases that might trigger a particular kind of advertisement. What advertisements do you see when you read the mail? Do any of the advertisements appear to know your geographical location?
We've talked about how cookies can be used to track what web sites you visit, especially "third-party" cookies (that is, cookies that come from someone other than the web site you accessed directly) that aggregate and correlate information about your visits to apparently unrelated sites.
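As a rough illustration of what "third-party" means here, this Python sketch compares the site of the page you visited with the domain attached to a cookie. The function names are hypothetical, and taking the last two labels of a host name as its "site" is a deliberate simplification that is wrong for domains such as example.co.uk; browsers use a much more careful rule.

```python
def site_of(host):
    """Naive 'site' of a host name: its last two dot-separated labels.

    This is an assumption made for illustration only; it mishandles
    suffixes like .co.uk.
    """
    labels = host.lower().strip(".").split(".")
    return ".".join(labels[-2:])


def is_third_party(page_host, cookie_domain):
    """True if the cookie's domain belongs to a different site than the page."""
    return site_of(page_host) != site_of(cookie_domain)


if __name__ == "__main__":
    print(is_third_party("www.cnn.com", ".cnn.com"))  # first-party cookie
    print(is_third_party("www.cnn.com", ".2o7.net"))  # third-party cookie
```

When you look through your browser's cookie list, the same comparison done by eye is usually enough to spot the aggregators.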
First, how many cookies do you currently have? Record at least the rough count, and whether you have removed cookies since the lecture about them. The easiest way to find cookies is to use the browser. In Firefox on a Mac, use Preferences and the Privacy tab. In Chrome, Wrench icon / Tools / Under the hood / Content / Privacy Settings / All cookies and site data. In Safari, Preferences / Privacy / Cookies and other website data.
Now remove all cookies. Set your browser preferences to allow all cookies, then visit at least half a dozen major sites (media, sports and e-commerce sites are good for this, as are search engines and even universities). How many cookies does a typical visit deposit? Track the cookies that are tracking you: can you see any evidence of linkage, e.g., updated third-party cookies or URLs after visiting independent sites? You might find it easiest to use a browser setting that asks about each cookie individually.
What sites that you visit regularly deposit third-party cookies? Do any contain interesting information instead of just long strings of apparently random letters and numbers? What's the furthest into the future cookie expiration date you can find?
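If you want to compare expiration dates more systematically than by reading them in the browser, a sketch like the following may help. It assumes you have copied the text of a Set-Cookie header by hand; the header shown is invented for illustration, and cookie_expirations is a hypothetical helper built on Python's standard http.cookies and email.utils modules.

```python
from email.utils import parsedate_to_datetime
from http.cookies import SimpleCookie


def cookie_expirations(set_cookie_header):
    """Map each cookie name in the header to its expiry time (or None)."""
    jar = SimpleCookie()
    jar.load(set_cookie_header)
    result = {}
    for name, morsel in jar.items():
        expires = morsel["expires"]  # empty string if no expires attribute
        result[name] = parsedate_to_datetime(expires) if expires else None
    return result


if __name__ == "__main__":
    # Invented example header, not taken from any real site.
    header = "id=abc123; expires=Wed, 09 Jun 2032 10:18:14 GMT; path=/"
    for name, when in cookie_expirations(header).items():
        print(name, "expires", when)
```

A cookie with no expiry reported here is a session cookie, which the browser normally discards when it closes.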
Repeat some of the experiment with a different browser if you can do so without too much work. If you remove the cookies from the first browser, what happens with the second browser?
You might find this paper and this paper interesting reading; they are technical in places but not much beyond what we've done in the class.
"Web bugs" are another way to track when someone visits a web site or accesses information using a program that interprets HTML; a web bug is typically an almost invisible 1x1 pixel image that includes a URL, like this one encountered at cnn.com:
<img src="http://cnnglobal.122.2o7.net/b/ss/cnnglobal/1/H.1--NS/0" height="1" width="1" border="0">
When the image is retrieved, the server at 2o7.net, a large aggregator owned by Adobe, knows that you have visited the page that contained the img tag. (The Adblock extension removes many advertising images both large and small and thus greatly reduces tracking.)
Find a web page that includes a web bug from a third party. You can often find candidate links in a web page by searching for things like "height=1" in various forms, and in the Security panels in Firefox. Can you find a web bug in an email message?
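Searching a page's source by hand for "height=1" can be automated. The following Python sketch uses the standard html.parser module to list img tags that are at most 1x1 pixels and are served from a host outside the page's own site. The class and function names are hypothetical, the page_site argument (for example "cnn.com") is something you supply yourself, and the check will miss bugs that are sized with CSS or inserted by Javascript.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class WebBugFinder(HTMLParser):
    """Collects tiny images that come from a host outside page_site."""

    def __init__(self, page_site):
        super().__init__()
        self.page_site = page_site.lower()
        self.bugs = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        a = dict(attrs)
        tiny = a.get("width") in ("0", "1") and a.get("height") in ("0", "1")
        src = a.get("src") or ""
        host = (urlparse(src).hostname or "").lower()
        same_site = host == self.page_site or host.endswith("." + self.page_site)
        if tiny and host and not same_site:
            self.bugs.append(src)


def find_web_bugs(html, page_site):
    """Return the src URLs of likely third-party web bugs in html."""
    finder = WebBugFinder(page_site)
    finder.feed(html)
    finder.close()
    return finder.bugs


if __name__ == "__main__":
    page = ('<img src="http://cnnglobal.122.2o7.net/b/ss/cnnglobal/1/H.1--NS/0" '
            'height="1" width="1" border="0">')
    print(find_web_bugs(page, "cnn.com"))
```

To try it on a real page, save the page source from your browser to a file and pass its contents as the html argument.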
Flash deposits cookie-like data on your machine as well. Flash cookies are easy though tedious to get rid of, by visiting this site. Before removing them, check what Flash cookies you have. How many sites are represented and how much total space is used?
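If you would rather count Flash cookies from the file system, the sketch below totals them with Python. Flash stores this data in files ending in .sol; where those files live varies by operating system and browser, so the directory is passed in as a parameter rather than assumed. Treating the name of each file's parent folder as the site name is also an assumption about the usual layout, so take the site count as approximate. The function name is hypothetical.

```python
from pathlib import Path


def summarize_flash_cookies(root):
    """Return (file_count, site_count, total_bytes) for .sol files under root.

    Assumes each .sol file sits in a folder named after the site that set it.
    """
    count = 0
    total = 0
    sites = set()
    for path in Path(root).rglob("*.sol"):
        if path.is_file():
            count += 1
            total += path.stat().st_size
            sites.add(path.parent.name)
    return count, len(sites), total


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        files, sites, size = summarize_flash_cookies(sys.argv[1])
        print(files, "files from about", sites, "sites using", size, "bytes")
```

Run it with the path to your Flash shared-objects directory as the only argument.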
As we discussed in class, there are ways to limit your risks and the amount of information that you reveal. Virus checkers are the most important, but there are plenty of others as well.
Many web sites insist that you provide a working email address before they will let you register or access some service. 10MinuteMail provides a useful alternative: it provides an email address that's valid for 10 minutes and shows you whatever mail arrives during that time; that lets you retrieve the registration key or whatever, without giving away a real address. Try this service. How long does it take for mail to arrive? Empirically, how long does it take for the temporary address to time out?
Check your own environment. What browser do you routinely use? What are your default settings for cookies, filename extensions, Javascript, Java, popups, automatic updates, downloading, software installation, programs that start automatically, etc.? Does your mail reader provide a previewer that interprets HTML and thus is subject to web bugs? Do you read mail in HTML format by default? Try sending yourself mail with a reference to an image in your public_html directory, i.e., http://www.princeton.edu/~your_netid/your_file, to see whether the image is retrieved and displayed.
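Most mail programs will let you compose the test message by hand, but if yours makes it awkward to insert a remote image reference, a message can be built with Python's standard email package. This is only a sketch: build_test_message is a hypothetical helper, the addresses are placeholders, and the message is constructed but not sent (sending would need something like smtplib and your mail server's settings, which are not shown here).

```python
from email.message import EmailMessage


def build_test_message(sender, recipient, image_url):
    """Build a message whose HTML part references a remote image."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "HTML image test"
    # Plain-text part, shown by readers that do not render HTML.
    msg.set_content("Plain-text part: no image is referenced here.")
    # HTML part; a reader that renders it may fetch image_url.
    msg.add_alternative(
        f'<html><body><p>Image test</p><img src="{image_url}"></body></html>',
        subtype="html",
    )
    return msg


if __name__ == "__main__":
    url = "http://www.princeton.edu/~your_netid/your_file"
    print(build_test_message("me@example.com", "me@example.com", url))
```

If the image appears when you open the message, your reader is fetching remote content, which is exactly the behavior a web bug relies on.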
What plug-ins and add-ons are already installed in your browser? Among those you might consider adding are Adblock, NoScript, Ghostery, FlashBlock, and CSLite; each reduces your exposure to various kinds of tracking and potentially harmful content.
Install Ghostery, which works in Firefox, Chrome and Safari. This extension detects and disables Javascript trackers, which would otherwise report your page visits and activities to advertising aggregators. The Firefox version also disables a lot of third-party cookies. How many trackers does Ghostery report it is blocking? Visit some sites to see how many trackers are in use. For instance, Princeton has one (Google Analytics); ITWorld.com has 10. What's the highest number you can find?
As we saw in class, Word, Excel and other programs include a Visual Basic interpreter that can be used to silently run programs that are included in documents. (Office 2008 on Macs does not support VB.) What level of macro protection are you running in Word and Excel? (Look under Tools / Macros, or Preferences / Security on Macs with Office 2011.) If you run Internet Explorer, what security level is being applied to ActiveX controls?
Perhaps surprisingly, PDF files can contain Javascript code, which will be interpreted by Adobe Acrobat Reader. Does your version of Acrobat enable Javascript? If so, you can turn it off without much consequence.
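For a quick look at whether a particular PDF might carry Javascript, you can search its raw bytes for the names the PDF format uses for script actions, /JavaScript and /JS. The Python sketch below does only that. It is a crude heuristic with a hypothetical function name: content inside compressed parts of the file will not be found, so an empty result does not prove the file is free of scripts.

```python
def pdf_javascript_markers(data):
    """Return which Javascript-related names appear in raw PDF bytes."""
    markers = (b"/JavaScript", b"/JS")
    return [m.decode("ascii") for m in markers if m in data]


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as f:
            print(pdf_javascript_markers(f.read()))
```

Run it with the name of a PDF file as the only argument; any names printed are a hint that the file deserves a closer look.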
What do you reveal on Facebook? Bear in mind that most of this information is readily available outside your own list of friends and networks.
Finally, if you saw anything interesting or suspicious that we didn't ask about specifically, or if you have any thoughts on how to improve this lab, we'd like to hear them. There are a couple of wrapup questions in lab8.doc that address this. Thanks.
When you're all done, upload lab8.doc or lab8.docx to the CS dropbox for Lab 8.