CS 261 Homework 2

Instructions

This problem set is due Monday, November 9, at 11:59pm.

You can work together on this homework with others if you want, but your writeup must be your own. You may use any source you like (including other papers or textbooks), but if you use any source not discussed in class, you must cite it.

Clarification (added 11/3): You can submit the homework by emailing it to cs261hw2@taverner.cs.berkeley.edu.

Question 1

Suppose I come up with a super-sekrit ultra-c00l new attack on browsers. In particular, I have a special URL, and if I can get your browser to visit that URL, then you are totally owned: I can take control of your browser and your account. List three or four ways I could cost-effectively get a large number of users to follow a link to my special URL.

Question 2

Next week Google comes up with a clever way to enable users to log into third-party web sites using their Google account, without needing a separate account or a password for the third-party site. Here's an example illustrating how their scheme works. In this example, third-party site Flyertalk decides to take advantage of this scheme to enable Google users to log into Flyertalk using their Google account and Google password.

1. Initially, Google publishes their public key. Assume Flyertalk has a way to get a copy of Google's public key through some trusted out-of-band channel.
2. When I create my account on Google, Google picks a secret 128-bit authenticator (a random 128-bit value unique to my account and unguessable by anyone), and stores it in a database associated with my account. Google also sets a secure, persistent cookie in my browser holding this 128-bit authenticator.
   Background: A secure cookie is one with a special flag set so that it will only be returned to Google over https (it is never sent over any http connection). A persistent cookie is one with a special flag set so that it never expires, and is retained by my browser even if I close the browser and start it up again.
3. Flyertalk adds a login button for Google users to their home page. The button links to https://google.com/redir?to=http://flyertalk.com/arrival.
4. If I click on the link, my browser loads that URL from Google, sending my persistent cookie along with it. Google's web server looks up the 128-bit value found in that cookie in its user database to find my username. If it finds a match, Google has authenticated me and the protocol continues, using steps 5-8 to dynamically generate an HTML page to send me. If Google finds no match, then Google serves me an error page and the protocol stops. In this example, Google will find a match and retrieve my Google username, daw.
5. Google forms an identifier that includes my Google username (daw) and the third-party site (flyertalk.com), concatenating these two values with a comma as separator. In this example, the identifier is daw,flyertalk.com.
6. Google signs the identifier using a public-key signature algorithm and its private key. Google forms a token that contains the identifier and its signature on the identifier (in hex). In this example, the token might be daw,flyertalk.com,4F92AD55E (except that in practice the signature probably would be a longer string of hex digits).
7. Google extracts the Flyertalk URL from the ?to= parameter of the URL in step 3. In this example, that URL is http://flyertalk.com/arrival. Google appends ?token= followed by the token to this URL, obtaining http://flyertalk.com/arrival?token=daw,flyertalk.com,4F92AD55E.
8. Google's web server responds with an HTML page that causes my web browser to redirect to the URL (http://flyertalk.com/arrival?token=daw,flyertalk.com,4F92AD55E) that was computed in the previous step. My browser obligingly loads that URL from Flyertalk.
9. Flyertalk's web server extracts the identifier and signature in the ?token= parameter of this request and uses Google's public key to check whether the signature is valid. If the signature is valid, Flyertalk extracts my Google username (daw) and the domain name from the identifier and checks that the domain name is as expected (namely, it exactly matches flyertalk.com). If all these checks pass, Flyertalk considers me logged in under the Google username found in the identifier (daw). For instance, Flyertalk might serve me a session cookie and remember that this session is associated with Google username daw.
10. I am now authenticated to Flyertalk. From my point of view, all I had to do was click a little button; I didn't have to enter any password, register with Flyertalk, or create an account on the Flyertalk web site. Very convenient!
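
To make the token handling in steps 5-9 concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Google's or Flyertalk's actual code. The handout does not name a signature algorithm, so the sketch substitutes unpadded "textbook" RSA over a SHA-256 hash with tiny fixed primes, which is not secure. The function names (google_make_token, google_redirect_url, flyertalk_check_token) and the way the token is split on commas are hypothetical choices made for the example.

```python
import hashlib

# Toy RSA parameters, for illustration only: far too small to be secure.
# Both primes are Mersenne primes; real code would use a vetted crypto library.
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def _digest(identifier: str) -> int:
    """Hash the identifier and reduce it modulo N (no padding: toy only)."""
    raw = hashlib.sha256(identifier.encode()).digest()
    return int.from_bytes(raw, "big") % N


def google_make_token(username: str, site: str) -> str:
    """Steps 5-6: form the identifier, sign it, append the hex signature."""
    identifier = f"{username},{site}"
    signature = pow(_digest(identifier), D, N)
    return f"{identifier},{signature:X}"


def google_redirect_url(to_url: str, token: str) -> str:
    """Step 7: append ?token= followed by the token to the ?to= URL."""
    return f"{to_url}?token={token}"


def flyertalk_check_token(token: str, expected_site: str = "flyertalk.com"):
    """Step 9: return the username if the token verifies, else None."""
    # Assumption: the signature is the last comma-separated field and the
    # site is the field before it. The handout does not specify the parsing.
    identifier, _, sig_hex = token.rpartition(",")
    username, _, site = identifier.rpartition(",")
    try:
        signature = int(sig_hex, 16)
    except ValueError:
        return None
    if pow(signature, E, N) != _digest(identifier):
        return None  # signature does not match the identifier
    if site != expected_site:
        return None  # token was issued for a different site
    return username


token = google_make_token("daw", "flyertalk.com")
print(google_redirect_url("http://flyertalk.com/arrival", token))
print(flyertalk_check_token(token))
```

The sketch covers only the token itself. The cookie lookup in step 4 and the browser redirect in step 8 are left out.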

Your job is to assess the security of Google's scheme.

(a) How well does Google's scheme resist phishing attacks?

(b) How well does Google's scheme resist DNS spoofing attacks, if the attacker controls part or all of DNS?

(c) How well does Google's scheme resist eavesdropping attacks, if the attacker can eavesdrop on my communications? (e.g., I log into Flyertalk over an unencrypted wireless connection to the Internet.)

(d) Why is it important that Google's scheme include the third-party site (flyertalk.com) in the identifier?

(e) Are there any other attacks or security concerns with Google's scheme?

Clarification (added 11/3): In Step 5, Google obtains the name of the third-party site by parsing the content of the to=... field in the URL to extract the domain name of the third-party site.
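
As a rough illustration of this parsing step, the sketch below pulls the to= value out of the login URL and takes its host name, using Python's standard urllib.parse module. The helper name site_from_login_url is hypothetical, and treating the URL's host name as "the domain name of the third-party site" is an assumption; the handout does not say exactly how Google's parser behaves.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def site_from_login_url(login_url: str) -> Optional[str]:
    """Return the host name of the URL carried in the to= parameter."""
    query = parse_qs(urlsplit(login_url).query)
    to_values = query.get("to")
    if not to_values:
        return None  # no to= parameter present
    # Assumption: the first to= value is the one that matters.
    return urlsplit(to_values[0]).hostname


print(site_from_login_url(
    "https://google.com/redir?to=http://flyertalk.com/arrival"))
```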

Question 3

You've been assigned two htmlfilter implementations that were submitted to me in HW1. Your goal: assess whether these implementations meet the security goals set out in HW1. (You do not need to review how well they meet the functionality requirements.)

To begin, I will email you your assigned implementations. Implementations are identified by a two-digit code (e.g., 17.tar); I will assign you two of those implementations. Download those two implementations from this directory. Critique the design and implementation of both.

(a) What is the two-digit ID number of your first assigned implementation? What are its main security weaknesses? Or, if you found none, what are the best features of its design/implementation?

(b) What is the two-digit ID number of your second assigned implementation? What are its main security weaknesses? Or, if you found none, what are the best features of its design/implementation?

(c) If you were forced to choose between these two implementations, which one would you judge to be more likely to meet its security goals? Why?

Promise: Your answers on this homework will not affect the grades of anyone else. You can feel free to critique an implementation honestly and frankly without fearing that your comments will have any negative effect on that person's grade. Grades for HW1 will have already been assigned by the time I see your solution. I will not show your evaluation to the authors of your assigned implementations.