CS 261 Homework 2
Instructions
This problem set is due Friday, October 7, at 5:59pm.
Work on your own for this homework.
You may use any source you like (including other papers or textbooks),
but if you use any source not discussed in class,
you must cite it.
You have three options on this homework.
Select one option, and solve that one.
Submit your homework solution by emailing it to
cs261hw2@taverner.cs.berkeley.edu.
Option 1: Analyze three HW1 submissions
You've been assigned three htmlfilter implementations
that were submitted to me in HW1.
Your goal: assess whether these implementations meet the
security goals set out in HW1.
(You do not need to review how well they meet the functionality requirements.)
To begin, I will email you
your assigned implementations.
Implementations are identified by a two-digit
code (e.g., 17.tar); I will assign you three of those implementations.
Download those three implementations from this directory.
Critique the design and implementation of all three.
- 1.1:
What is the two-digit ID number of your first assigned implementation?
What are its main security weaknesses? Or, if you found none,
what are the best features of its design/implementation?
- 1.2:
What is the two-digit ID number of your second assigned implementation?
What are its main security weaknesses? Or, if you found none,
what are the best features of its design/implementation?
- 1.3:
What is the two-digit ID number of your third assigned implementation?
What are its main security weaknesses? Or, if you found none,
what are the best features of its design/implementation?
- 1.4:
If you were forced to choose between these three implementations,
which one would you judge to be most likely to meet its security goals?
Why?
Promise: Your answers on this homework will not affect the grades
of anyone else. You can feel free to critique an implementation honestly
and frankly without fearing that your comments will have any negative
effect on that person's grade. Grades for HW1 will have already been assigned
by the time I see your solution.
I will not show your evaluation
to the authors of your assigned implementations.
Update (10/7): Please do not use the solutions to HW1
or its testbed as a substitute for your own analysis.
Option 2: Solve some thought problems
Question 2.1
Suppose I come up with a super-sekrit ultra-c00l new attack on
browsers. In particular, I have a special URL, and if I can get your
browser to visit that URL, then you are totally owned: I can take
control of your browser. List three (or more) ways I
could cost-effectively get a large number of users to follow a link to
my special URL.
Question 2.2
One of your officemates, Bob, keeps playing pranks on you.
You decide to return the favor and play a prank on him.
You've hatched a plan to spread a rumor that Bob's advisor
has decided to quit academia and go to work in the Peace Corps.
Knowing Bob, you're pretty sure that when Bob hears this rumor,
at first he is going to dismiss it as not believable -- but if, the next time
Bob visits his advisor's web page, the page includes a message
announcing his advisor's impending retirement, Bob is gonna freak out.
Describe a method you could use which would probably be
successful in causing Bob to freak out.
(Do make sure to apologize to Bob afterwards and buy
him a good dinner in recompense!)
Question 2.3
This question asks you to explore some of the consequences
of active networks, where packets
can contain mobile code that is executed by the routers along the
path.
For concreteness, we can think of "adaptive routing"
as a sample application. If your TCP connection to France is too slow
because of poor bandwidth on the transatlantic link and for some reason
you happen to know that there is a much faster route to France via
China, you might wish to adaptively update the route your TCP packets
take. In this case, you would "push" some mobile code into each
router along the way. The mobile code would run at each router before
the packet is forwarded and select which interface to send it out over.
We describe below a series of extensions to the IP protocol suite
that allow for progressively more sophisticated active-network applications.
For each of the four parts below, list the security threats
that might arise for that extension.
The purpose of this question is to study issues that are inherent
in the functionality; you may ignore the risk of implementation bugs such as
buffer overruns.
- In the simplest variant, we'd extend the IP packet format to allow
an optional extra header which contains some mobile code to run at each
router. The mobile code is specified using Native Client (NaCl) object code:
i.e., in compiled code that has been SFI'ed according to the NaCl scheme.
Each router which receives such a packet first verifies
that the mobile code has been correctly sandboxed using the NaCl SFI rules,
but with one additional restriction:
the code must contain no backwards jumps or indirect jumps and no
function calls or returns (the only branches allowed are
forward branches to fixed addresses).
The router sets up the NaCl code in memory and copies into its data
region (1) a copy of the entire packet, and (2) a global list of
interfaces available at the router.
(Each interface in the list is annotated with a little bit
of relevant information that can be read by the NaCl code, such as the IP
address of the next hop along that interface.)
The router then executes the NaCl code.
Just before exiting, the NaCl code should store the name of the desired
outbound interface at a fixed location in its data region.
The router will forward the packet out via that
interface on towards its destination.
- One obvious performance issue with the previous scheme is that
it requires an overhead of potentially hundreds of bytes of code
in every packet.
So we introduce the notion of "flows" to amortize the cost of specifying
the mobile code.
Each packet is associated with a flow.
In TCP, the flow ID might be the (src host, dst host, src port, dst port)
tuple. For other protocols, we might simply extend the packet format to allow
for a 32-bit flow ID.
We add a "set handler" IP option which allows endpoints to specify
a single chunk of mobile code which will be run at the router
every time a packet is received on the same flow.
Thus one endpoint can send a packet with the "set handler" IP option
and containing a lengthy chunk of mobile code; that mobile code will
then be applied to all subsequent packets on that flow, and does not need
to be sent again.
This allows us to specify a chunk of mobile code once; then all
subsequent packets in the flow
will inherit the same code without incurring any bandwidth overhead.
- It occurs to us that we might like to allow the mobile code to make routing
policy decisions based on the payload of the packets, or even to compress
packets for us on the fly when bandwidth is scarce.
Since this might require scanning the entire packet and possibly
interpreting higher-level protocols, we will need to be able to write
loops in the mobile code.
Therefore, we eliminate the restriction on backwards jumps, and allow
arbitrary control flow in the mobile code (subject to the NaCl SFI rules).
To implement compression, the mobile code can modify its copy of the
packet.
When the mobile code exits, the router will read the
(possibly modified) packet from the mobile code's data region, and
then use that in place of the packet it originally received:
i.e., if the mobile code modifies its copy of the packet, then the
router will forward the modified packet, not the original packet.
Also, we allow mobile code to maintain state across packet reception events.
Thus, when mobile code exits after processing one packet, the router makes
a copy of the entire data region, and restores this copy before executing
the mobile code for the next packet in the flow.
- An astute reader points out that decompression may increase the
size of a packet. If this exceeds the network's MTU, mobile code that
receives a compressed packet may need to send multiple packets containing
decompressed data.
Therefore, we extend the scheme so that
mobile code can construct whole IP packets in its own memory
space and invoke a special operation to send those packets over the wire.
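As a rough illustration of the mechanics described in the first two parts of Question 2.3 (the forward-only branch restriction and per-flow handlers), here is a minimal Python sketch. It uses a tiny made-up instruction set rather than real NaCl object code, and the names Instr, forward_only, run, set_handler, and route are hypothetical, not part of any real router or NaCl API. It models only how the scheme operates and is not meant to suggest an answer to the question.

```python
from typing import Dict, List, NamedTuple, Tuple

class Instr(NamedTuple):
    op: str          # toy opcodes (hypothetical): "set", "jlt", "jmp", "halt"
    a: int = 0       # index into the data region
    b: int = 0       # immediate value
    target: int = 0  # absolute index of the branch target

def forward_only(program: List[Instr]) -> bool:
    """Accept only programs whose branches all go strictly forward."""
    for pc, ins in enumerate(program):
        if ins.op in ("jmp", "jlt"):
            # target == len(program) means "fall off the end", i.e. exit
            if not (pc < ins.target <= len(program)):
                return False
        elif ins.op not in ("set", "halt"):
            return False  # unknown opcode: reject
    return True

def run(program: List[Instr], data: List[int]) -> int:
    """Execute a verified program on its data region; return the step count."""
    pc = steps = 0
    while pc < len(program):
        ins = program[pc]
        steps += 1
        if ins.op == "halt":
            break
        if ins.op == "set":      # data[a] = b
            data[ins.a] = ins.b
            pc += 1
        elif ins.op == "jmp":    # unconditional forward branch
            pc = ins.target
        elif ins.op == "jlt":    # branch forward if data[a] < b
            pc = ins.target if data[ins.a] < ins.b else pc + 1
    return steps

# Flow ID as in the TCP example: (src host, dst host, src port, dst port)
FlowId = Tuple[str, str, int, int]
handlers: Dict[FlowId, List[Instr]] = {}

def set_handler(flow: FlowId, program: List[Instr]) -> bool:
    """Model of the "set handler" option: verify once, then store per flow."""
    if not forward_only(program):
        return False
    handlers[flow] = program
    return True

def route(flow: FlowId, packet_len: int) -> int:
    """Run the flow's handler; data[0] is the chosen interface (assumed layout)."""
    data = [0, packet_len]  # data[1] stands in for the copy of the packet
    program = handlers.get(flow)
    if program is not None:
        run(program, data)
    return data[0]
```

Because every accepted branch moves strictly forward, a verified program executes each instruction at most once, so run takes at most len(program) steps per packet. For example, a handler that picks interface 1 for packets shorter than 100 bytes and interface 2 otherwise could be written as [Instr("jlt", a=1, b=100, target=3), Instr("set", a=0, b=2), Instr("halt"), Instr("set", a=0, b=1)].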
Option 3: Do a security review of some software
For this question, you will practice doing a security evaluation of
some interesting program.
Pick any open-source application where security is relevant.
The only requirement is that it must consist of at least 2000 lines of code.
If you're at a loss for a program to audit, good choices might include network
daemons, standard Unix utilities, setuid programs, clients that
process data that came from the network
(e.g., MP3 players, image viewers, etc.),
or any web application.
Or, you could look on
Google Code project hosting
for a software program to audit.
Try not to pick an application that is too large or complex;
a smaller, well-defined piece of software will make your life easier
for this problem.
Then, do all of the following.
- 3.1: What application did you pick? Which version? Give the URL where
I can download the source. How many lines of code is it?
(One simple way to count the number of lines is using the
sloccount program.
For C programs, another way is to run
"find . -name '*.[ch]' -print | xargs wc -l"
from within the source directory.)
Include all of this in your write-up.
- 3.2: Spend 1-2 hours familiarizing yourself with it: e.g.,
how to run it, what it does, reading documentation.
Then, based upon your understanding of the program,
describe the threat model and security goals for your program
in your write-up.
- 3.3: Spend 1-2 hours to understand the architecture and
organization of your program. (You may have to browse source code
for this, because it is often not documented.) Draw a diagram depicting
the program architecture at a high level. Your diagram might show one
or more important components of the program and how they interact.
It should also show any untrusted external entities or data that the
program interacts with, any network communication channels it uses,
and any potentially sensitive or untrusted files (or other data
containers) that it stores or reads on persistent storage.
Your diagram does not need to be complete; do your best to identify
the most interesting or security-relevant parts, stopping after 1-2 hours.
Include your diagram with the write-up you turn in for this homework.
- 3.4: Based on the diagram you drew and your examination so far,
what portions of the program seem
most likely to have the highest risk of security holes?
(If you want some tips on how to build a diagram and identify
security risks from it, you could check out Microsoft's
Introduction to threat modelling
or the STRIDE
methodology.)
- 3.5: Select one high-risk portion of the code (from your
analysis in 3.4). Ideally, you'd choose a subset of at most a few hundred
lines of code of the program.
Spend 1-2 hours reading the code from this portion of the program,
looking to see whether the program meets its security goals.
This should include an intensive read of the source code, looking
for common implementation errors. You don't have to audit all the
high-risk code; instead, stop after 1-2 hours, and describe what you
did review.
- 3.6: Write a summary of the results of your audit.
Did you find any security holes or fishy-looking code? Could the code
have been structured better for security? Describe your findings.
Attach a copy of your summary to your write-up for the homework.
(To get the idea of what a summary for 3.6 might look like, you can refer to
a report
I wrote a while back, when reviewing a piece of code for fun.)
- 3.7: If you found any security holes, notify the code maintainer as well.
Attach to your write-up a copy of the bug report you sent to the code
maintainer and (if available) a URL in their bug tracker where I can
view the report.
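For the line count requested in 3.1, the find/wc pipeline can also be approximated in a few lines of Python if those tools are not available. This is only a sketch: the function name count_lines and its default suffixes are assumptions chosen to mirror the '*.[ch]' pattern, and like wc -l it counts newline characters, so blank lines and comments are included (unlike sloccount, which reports source lines of code).

```python
import os

def count_lines(root: str, suffixes=(".c", ".h")) -> int:
    """Count newline characters in all files under root with the given suffixes."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                # Read as bytes so odd encodings in source files do not matter
                with open(os.path.join(dirpath, name), "rb") as f:
                    total += f.read().count(b"\n")
    return total

if __name__ == "__main__":
    # Run from within the source directory, as with the find/wc command
    print(count_lines("."))
```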
Turn in the entire write-up.