CS261 Projects
General information
Your term project should address a research issue in
computer security and consist of the design
of some computer security system or technique,
or the analysis and possible improvement of some
existing system or technique. The main goal of the
project is to do original research on a problem of
interest in computer security.
You should work in a small group;
I expect that teams of approximately 2--3 people
will be appropriate for most projects.
Of course, expectations will
be adjusted according to the number of people in
your group.
I will not categorically rule out solo teams,
but I expect that working in groups will allow you
to tackle more substantial research issues.
If you have trouble finding a project
partner, I can help you get matched up with someone
else by maintaining a list of people seeking teammates.
Projects will be evaluated on the quality of their research
in computer security. At the end of the semester,
you will write a conference-style paper on your work.
See below for more details.
I expect that most projects will fall (more or less)
into one of two categories:
- Design. Design projects will usually attempt to solve
some interesting problem by proposing a design; implementing a prototype;
and using the implementation as a basis for evaluating the proposed
system architecture.
- Analysis. Analysis projects might, for example, study
some previously-proposed implementation technique,
existing system, or class of systems;
evaluate its security properties; find flaws, or strengths, in it;
and provide new insight into how to
build secure systems.
The research should be relevant to computer security,
but this will be interpreted broadly.
You are encouraged to find topics of interest to you;
feel free to be creative in selecting a project topic.
You're welcome to pick a topic that is connected to your
current research: for instance, if your primary research interest
is in digital libraries, you would be welcome to do a class project
on some aspect of security, cryptography, or privacy in digital libraries.
If you're at a loss for a project topic,
I've prepared a list of possible project topics that you
can peruse as examples of how to pick a suitable project.
See below.
But don't feel limited to these suggestions!
They are intended only as examples.
You're welcome to come discuss possible project ideas with me,
if you like. I'm happy to make myself available to discuss projects.
A final suggestion: Aim high!
The top projects could lead to publication
(and in past years, a number of projects
have led to publications).
The process
You will write a concise (approximately 1 page)
project proposal that should clearly state the problem
you will be solving, the key challenges for new research, and
your plan of attack (including milestones and dates).
If there are any special resources you might need
from me, mention this as well.
The project proposal was due Tue 15 October.
Here's how to submit your proposal.
You should put together a web page for your project;
currently all it needs to contain is the project members,
title of your project, and proposal.
Then just email the URL for your project web page to
daw@cs.berkeley.edu
by Tuesday, October 15th.
In mid-November I might ask you to write a concise status report
so I can make sure the projects are on-track.
I am always available to meet with any groups who would
like to discuss their project, request additional resources, or
ask for advice.
You will also be required to present your project
at a poster session, to be held on Monday, December 9th from 2-4pm
in the Woz lounge.
Finally, a project report will be due by 9am on Monday, December 16th.
No exceptions or extensions will be granted, so get it in on time!
See below for instructions.
The final report
You are expected to write a technical paper, in the style
of a conference submission, on the research you have done.
State the problem you're solving, motivate why it is an
important or interesting problem, present your research
thoroughly and clearly, compare to any related work that
may exist, summarize your research contributions,
and draw whatever conclusions may be appropriate.
There is no page limit (either minimum or maximum),
and reports will be evaluated on
technical content (not on length), but I expect
most project reports will probably be between
7--15 pages long.
If you are not familiar with writing conference-style papers
in computer science (or even if you are), the following resources may help:
You may submit your project report electronically or on paper.
I prefer electronic submission, although you may choose either.
In either case, the deadline is the same:
Monday, December 16, before 9:00am.
If you submit electronically:
- It must be in a format which is easily readable on Unix platforms:
that means HTML, Postscript, or PDF is fine (but not Microsoft Word).
- Place a link to the file on your project web page (see
here for the list of project web pages),
and send me email with the URL. I will send you confirmation of receipt.
If you submit on paper, place it in my mailbox in Soda Hall
(in the mailroom, or outside my office -- 765 Soda).
Example project topics
Note: Some examples are very specific.
Others are quite generic; for the generic suggestions,
be sure to narrow down the topic substantially and propose
something concrete and focused.
If you are interested in any of the project topics below,
please talk to me about it; I can make some more concrete suggestions.
- New attacks
- Find new security weaknesses in any widely-deployed system.
- Security auditing
- Audit a widely-used and under-scrutinized open-source package
that is security-critical. Report on your experiences and lessons.
How would you re-structure/re-implement the system to make it more
robust? What tools would have made your auditing task easier?
How effective are existing tools?
- Tools for vulnerability detection
- Study ways to build tools to help automate the process of
reviewing security-critical source code. Can you use runtime
testing, static analysis, model checking, formal verification,
or other techniques to detect any interesting classes of common
security holes? I can give you some concrete ideas to get you
started, if this interests you.
- Resilient aggregation
- Build "resilient aggregation" components for TinyOS.
Sensor networks (such as the TinyOS-based systems studied
here at Berkeley) are often used to aggregate sensor information
and use the result to control systems. For instance, we might
put a sensor in each room of Soda Hall and use the average temperature
to control the air conditioning. However, at
present, our aggregation operations are not secure against
errors or maliciously chosen inputs (as might happen
if a sensor is compromised). For instance, consider computing the
average of N data values: an adversary who can control one of those
inputs can bias the output by any desired amount, hence the "average"
is not a resilient aggregation operation. However, the median is
resilient, because changing any one input by any amount will cause
only a small change in the output. Build a prototype of a few
resilient aggregation operators and study their usefulness in some
sensor network application.
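To make the mean/median contrast above concrete, here is a minimal sketch in plain C (not nesC, and not tied to any TinyOS interface). The function names aggregate_mean and aggregate_median are illustrative assumptions, not part of any existing library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Comparison callback for qsort: orders doubles ascending. */
static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Plain average: one corrupted reading can move it arbitrarily far. */
double aggregate_mean(const double *v, size_t n) {
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double)n;
}

/* Median: sorts a private copy and returns the middle element
 * (the mean of the two middle elements when n is even). */
double aggregate_median(const double *v, size_t n) {
    assert(n > 0);
    double *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL)
        abort();
    memcpy(tmp, v, n * sizeof *tmp);
    qsort(tmp, n, sizeof *tmp, cmp_double);
    double m = (n % 2) ? tmp[n / 2]
                       : (tmp[n / 2 - 1] + tmp[n / 2]) / 2.0;
    free(tmp);
    return m;
}
```

With five honest readings of 20, 21, 22, 23, and 24, both operators return 22. If a compromised sensor reports 1000 instead of 24, the mean jumps above 200 while the median stays at 22. A real sensor node would probably need an in-network or streaming approximation rather than a full sort, which is part of what the project would have to study.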
- Adversarial simulation
- Build an "adversarial simulator" for TinyOS. Currently, the
TOSSIM simulator runs the system under friendly conditions (no
dropped packets, no bit errors, no unfriendly timing of the scheduler).
Hence, it can't detect ways that an adversary might try to crash
or subvert the system.
We might be more likely to find security bugs by running the system
on parameters that are chosen to be the worst case for the system, or
that are likely to cover corner cases. For instance, an adversarial
simulator might determine ahead of time all possible dependencies
between tasks and then test exhaustively all possible orderings of
interdependent tasks, looking to see if any of these orderings deadlocks
the system. Build a prototype of an adversarial simulator
and study its effectiveness.
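The exhaustive-ordering idea can be sketched in a few lines of plain C. This is only an illustration under simplifying assumptions: it is not based on TOSSIM, it treats every task as interdependent (so it tries all n! orderings), and it checks a final-state invariant rather than detecting deadlock. The task and checker names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TASKS 8

typedef void (*task_fn)(int *state);
typedef int (*check_fn)(const int *state);   /* nonzero means "state is OK" */

/* Toy tasks (hypothetical): the order they run in changes the outcome. */
void task_add_one(int *s)      { *s += 1; }
void task_double(int *s)       { *s *= 2; }
void task_reset_if_odd(int *s) { if (*s % 2 != 0) *s = 0; }

/* Toy invariant: the shared state must never end up at zero. */
int state_nonzero(const int *s) { return *s != 0; }

/* Runs the tasks in the given order from a fresh initial state. */
static int run_order(const task_fn *tasks, const size_t *order, size_t n,
                     int initial, check_fn ok) {
    int state = initial;
    for (size_t i = 0; i < n; i++)
        tasks[order[i]](&state);
    return ok(&state);
}

/* Permutes order[k..n) recursively and counts orderings that fail. */
static size_t explore(const task_fn *tasks, size_t *order, size_t k,
                      size_t n, int initial, check_fn ok) {
    if (k == n)
        return run_order(tasks, order, n, initial, ok) ? 0 : 1;
    size_t failures = 0;
    for (size_t i = k; i < n; i++) {
        size_t t = order[k]; order[k] = order[i]; order[i] = t;
        failures += explore(tasks, order, k + 1, n, initial, ok);
        t = order[k]; order[k] = order[i]; order[i] = t;   /* undo swap */
    }
    return failures;
}

/* Returns how many of the n! task orderings violate the invariant. */
size_t count_failing_orders(const task_fn *tasks, size_t n,
                            int initial, check_fn ok) {
    assert(n <= MAX_TASKS);
    size_t order[MAX_TASKS];
    for (size_t i = 0; i < n; i++)
        order[i] = i;
    return explore(tasks, order, 0, n, initial, ok);
}
```

Starting from state 1, exactly one of the six orderings of the three toy tasks (double, add one, reset if odd) drives the state to zero; a friendly scheduler that always runs them in declaration order would never expose it. A real adversarial simulator would need dependency analysis to prune the search, since n! grows too quickly to enumerate blindly.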
- Preventing casting bugs
- Implicit casts are a non-trivial source of security holes
in C programs. For instance, the following code is vulnerable:
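typedef unsigned short uid_t;   /* 16-bit uid type */
int read_from_network(void);    /* returns an attacker-controlled value */
void dowork(uid_t u);
int main(void) {
    int x = read_from_network();
    // Squish root (it's not safe to execute dowork() with uid 0)
    if (x == 0)
        return 1;
    dowork(x);   /* x is implicitly converted (truncated) to uid_t here */
    return 0;
}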
Notice that the check for root is buggy, due to the implicit cast
to a 16-bit type in dowork(x): by sending the value 65536
on the network, we can execute dowork(0). There have been
vulnerabilities of this form in NFS servers.
There have also been vulnerabilities due to
overloading of uid -1 (on some systems, uid 65535 was the nobody
account, but the set*uid() calls treated -1 as meaning
"don't change this uid").
There have also been many security holes due to implicit
casts between signed and unsigned types.
Is there anything intelligent and cost-effective
we can do about any of these risks?
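One low-tech mitigation, shown here only as a sketch, is to route every conversion from an untrusted integer through an explicit range check, and to apply policy checks (such as "not root") to the converted value rather than to the original. The type name uid16 and both function names are hypothetical; this does not answer the broader question of how to find or prevent such bugs automatically.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned short uid16;   /* hypothetical 16-bit uid type */

/* Converts an untrusted integer to a 16-bit uid without silent truncation.
 * Returns true and stores the uid in *out only if x is in [0, 65535]. */
bool uid16_from_long(long x, uid16 *out) {
    assert(out != NULL);
    if (x < 0 || x > 65535L)
        return false;
    *out = (uid16)x;
    return true;
}

/* Accepts only values that are in range AND are not root (uid 0).
 * The root check is made on the converted value, so 65536 cannot
 * slip through as 0. *out is left untouched on failure. */
bool nonroot_uid16_from_long(long x, uid16 *out) {
    uid16 u;
    assert(out != NULL);
    if (!uid16_from_long(x, &u) || u == 0)
        return false;
    *out = u;
    return true;
}
```

With this helper, the value 65536 from the example above is rejected instead of being truncated to 0, and -1 is rejected instead of wrapping to 65535.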
- Validation bugs in Linux kernel
- Dawson Engler's group used heuristic compile-time rules to find
many security holes in the Linux kernel caused by improper input
validation (see their paper). Formalize their rules into
a concrete type system, perhaps using flow-sensitive type qualifiers.
Then, analyze the Linux source code using your type system, perhaps using
Cqual for the type-checking.
- Inlined reference monitors
- Proof-carrying code takes a reference monitor (expressing some
security policy) and injects it into the program during compilation,
in an integrated way; then it is possible for the
recipient to verify that the desired policy is enforced by the
presented code.
This allows us to build extremely efficient reference monitors:
rather than implementing the reference monitor in a separate process
(thereby incurring performance penalties), the reference monitor can
be inlined directly into the code that it is supposed to monitor.
Sometimes we don't care about the ability of a recipient to verify
that code injection was done appropriately, for instance because
the injector and receiver are one and the same.
Can we build a more lightweight implementation of this functionality?
For instance, maybe we can emulate some notion of "user-level call gates"
using the mprotect() system call?
(We could imagine storing the reference monitor's state in a
write-protected segment of memory; then any call to an interface
that is protected by the reference monitor would be replaced with
a call to the reference monitor entry point; and the entry point
would need to write-enable the special segment and atomically transfer
control to the reference monitor code.)
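A rough sketch of the write-protected-state idea, assuming a POSIX system with mmap() and mprotect(), might look like the following. The monitor policy (a simple call quota) and all function names are hypothetical. Note what the sketch does not provide: the write-enable and the transfer of control are not atomic, and nothing stops the monitored code from calling mprotect() itself. Closing those gaps is the research question.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical monitor state: a quota of guarded calls. */
struct monitor_state {
    int calls_allowed;
    int calls_made;
};

static struct monitor_state *state;   /* lives alone in its own page */
static size_t state_len;

/* Allocates the state page, sets the policy, then write-protects it. */
int monitor_init(int calls_allowed) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    state_len = (size_t)page;
    void *p = mmap(NULL, state_len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    state = p;
    state->calls_allowed = calls_allowed;
    state->calls_made = 0;
    return mprotect(state, state_len, PROT_READ);
}

/* Entry point ("gate"): write-enable the page, consult and update the
 * monitor state, then re-protect it.
 * Returns 1 if the guarded call is permitted, 0 if denied, -1 on error. */
int monitor_gate(void) {
    assert(state != NULL);
    if (mprotect(state, state_len, PROT_READ | PROT_WRITE) != 0)
        return -1;
    int ok = state->calls_made < state->calls_allowed;
    if (ok)
        state->calls_made++;
    if (mprotect(state, state_len, PROT_READ) != 0)
        return -1;
    return ok;
}

/* Read-only accessor; reading works even while the page is protected. */
int monitor_calls_made(void) {
    assert(state != NULL);
    return state->calls_made;
}
```

Each guarded interface would call monitor_gate() first and proceed only when it returns 1. Outside the gate, a stray write to the state page faults instead of silently corrupting the policy.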
- Virtual machines for security
- Recently, software has become available to implement a virtual
machine for modern operating systems (e.g., Windows). This seems to
provide a powerful mechanism for executing dangerous actions in an
isolated environment. Does this idea work, and if so, how can we
best take advantage of virtual machine techniques? Can we evaluate
the security of, say, the VMWare virtual machine against malicious
attempts to harm the host OS? Is there any way
to structure the virtual machine implementation to isolate the
security-critical functionality and thereby make the TCB
simpler and easier-to-verify?
- Enforcing resource bounds on malicious code
- Can we use proof-carrying code techniques to ensure that
malicious code never exceeds a fixed resource bound?
For instance, we might insist that it terminate within a given
number of clock cycles;
we could insert checks to a global timer wherever we cannot prove
a satisfactory upper bound on the running time of the program, and
omit the checks in regions (e.g., acyclic control-flow graphs) where
we can verify statically that the time bound will not be exceeded.
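The inserted-check idea can be illustrated with a small hand-instrumented example. This is only a sketch: the budget structure, the budget_tick() check, and the two sample functions are hypothetical, and a real system would insert the checks automatically and would carry a proof for the regions where they are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical global budget, counted in abstract "ticks". */
struct budget {
    unsigned long remaining;
};

/* The check an instrumenting compiler would insert on a loop back edge.
 * Returns false once the budget is exhausted. */
static bool budget_tick(struct budget *b) {
    assert(b != NULL);
    if (b->remaining == 0)
        return false;
    b->remaining--;
    return true;
}

/* Loop whose trip count depends on untrusted input: no static bound is
 * available, so every iteration is charged against the budget.
 * Returns true and stores the sum in *out only if it finished in time. */
bool sum_to_n(struct budget *b, unsigned long n, unsigned long *out) {
    unsigned long sum = 0;
    for (unsigned long i = 1; i <= n; i++) {
        if (!budget_tick(b))   /* inserted check */
            return false;
        sum += i;
    }
    *out = sum;
    return true;
}

/* Acyclic code: its running time is bounded by a constant that can be
 * verified statically, so no dynamic check is inserted. */
unsigned long square_plus_one(unsigned long x) {
    return x * x + 1;
}
```

With a budget of 10 ticks, sum_to_n(&b, 4, &out) succeeds and leaves 6 ticks; a following call with n = 100 is cut off after the remaining 6 iterations and reports failure instead of running to completion.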
- Side channel attacks meet mobile code
- Typically the easiest way to break a cryptosystem is not by
directly attacking the mathematics but by "cheating" (e.g., bypassing
the crypto entirely), and one way to do this is to use side channel
attacks. Java and other forms of mobile code give attackers a
convenient way to run code on targeted machines, which might give
the attacker a way to measure timings or memory operations by
observing scheduling or swapping decisions. The goal of the project
would be to investigate whether it is possible to mount side channel
attacks within the constraints imposed by Java or some other
widely-deployed mobile code system.
- Verifiable distributed computation
- The Internet is a vast resource of idle machines; we
might like to harness these spare CPU cycles by offloading our
lengthy computations to other computers.
But in any such distributed setting, how do we know that the
result that comes back is the correct one we wanted?
Careful engineering combined with some recently-proposed
cryptographic techniques might go a long way here in solving
some cases of interest.
- Security of peer-to-peer systems
- Peer-to-peer systems (e.g., Gnutella, Kazaa) have been a hot
topic recently. You might study the security challenges inherent in
peer-to-peer systems, either by proposing techniques for building
secure peer-to-peer systems, or by analyzing an existing peer-to-peer
system.
- Information retrieval for audit logs
- Suppose you have audit logs of, say, network events.
How would you design a search engine so you could retrieve
security-relevant events after the fact? What network events
would you want to have logged, and in what format?
For instance, imagine tomorrow CERT announces a new attack,
and CS admins discover that one of their machines has been
broken into. It would be nice if we could just search all
audit logs over the past year to see if any other CS machines
had been broken into using the same attack. What can you do
in this area?
- Formal modelling of security systems
- Build a formal model of some aspect of a security system,
and rigorously evaluate its properties.
For instance, you might look at the state machine associated
with a TCP/IP stack, and model how the various network events can affect
the state. You could build a formal model of actual behavior
by working from the OS source code or by exhaustively testing the
possibilities.
Then you might build a formal model of intended behavior -- e.g.,
by working from the RFC, or by formalizing that there should be no
LAND attacks (for instance) -- and you would check whether the
model of actual behavior matches the model of intended behavior.
Or, you might build a second model from a second operating system,
compare where their behavior differs, and study whether this has
any consequences for how to write portable security code.
Such models might also be useful for intrusion detection.
- Secure coding
- Improve (somehow) the state of the art in implementation of
security-critical projects.
You might explore the relevance of various techniques from
software engineering or programming languages, for instance.
- Theory or cryptographic work
- Projects in theory or cryptography are not necessarily
out of scope, if you have some specific ideas you consider relevant.
For instance, formal analysis of cryptographic protocols or
cryptographic primitives could probably serve as a reasonable
topic.
- Privilege separation
- We saw in class a number of mechanisms for building a sandbox
to prevent untrusted code from affecting the rest of the system in any way.
This is a useful primitive, but in practice we often want to allow
some amount of controlled sharing or limited interaction.
You might study how to securely allow this limited interaction in
some application context of interest, such as dividing programs into
privileged and unprivileged pieces.
The LSM project for Linux might be of interest here.
- Distributed firewalls
- Imagine taking the enforcement mechanism found on a firewall
and replicating it on all inside machines. (This would require
installation of new software on all internal machines, but suppose
we can handle that administrative burden.) How do we maintain
centralized control over security policy? How do we specify policies?
How do we maximize assurance in such an environment? How do we
handle multi-organization scenarios, where a machine is a member
of multiple organizations and thus multiple parties would like
to add security restrictions?
- Encrypted databases
- Suppose we want to store our data on a remote server
(e.g., so that we can take advantage of the computational power of
the server) without requiring full trust in the remote server.
What types of database semantics can we support?
In previous work,
Dawn Song, Adrian Perrig, and I showed a special way to store the data in
encrypted form on the untrusted server so that the untrusted server can do
keyword searches over the encrypted data (on our behalf, and only when
authorized by us) with minimal communication complexity.
What other types of queries can be supported efficiently?
How about if the goal is integrity rather than confidentiality?
This is a cryptographic design problem.