Project contact: Daniel Wilkerson, dsw@cs

We have a C++ front end, Elsa, which was written by and is maintained by Scott McPeak, with a little help from me. I also wrote Build Interceptor, which, when installed on your machine, intercepts your gcc toolchain so that when you build a project, the .i files that went into the compiler end up in unused sections of the ELF binary. We have used Build Interceptor to get most of the .i files for Debian. Elsa is quite good: 200 million (yes, million) of the 260 million lines of code (before preprocessing) in Debian go through Build Interceptor and then Elsa end-to-end. Where code does not, it is often because it did not go through Build Interceptor in the first place, or because we simply disqualified it for containing inline assembly; neither of these is a failure of Elsa.

I have written a whole-program dataflow analysis client of Elsa called Cqual++. It uses the new polymorphic version of the Cqual qualifier analysis to provide a C++ version of Cqual. It lives in a meta-analysis project called Oink, which is intended to be, and is structured as, a whole framework for writing analyses on top of Elsa; Cqual++ is just the first tool in that framework. The tools of Oink are expected to be designed so that different analyses can use each other's results, giving a multiplying effect of possible analyses as tools are added. I maintain Oink and can facilitate that.

The proposal is Mops++: a Mops that can analyze C++, built as another tool in the Oink framework. This could be done either by taking Hao Chen's Java implementation of Mops and porting it to C++ (which should be very easy) or by reimplementing it, whichever the student finds easier. Just the fact that there is an implementation to look at and tests to reuse should help a lot.
Elsa already provides an intra-procedural control flow graph (CFG), and I already provide a sample Oink tool, cfgprint, that uses it (this is non-trivial due to Scott's optimized design) and prints the CFG as a graph in DOT format (which dot will convert to PostScript); thus figuring out how to get at the CFG is also done and implemented. The only real work is to write the dumbest points-to analysis in the world for virtual functions and function pointers. This is very much like a combination of a) my linker (it groups functions by function name and signature) and b) computing what functions overload one another, something we already do in Elsa and you can just ask for.

Thus, it is a straightforward implementation project that would teach the implementer about the problem without requiring too much of them in one semester. If it were good enough, I would ship it with the Oink distribution. This would be super cool and a serious contribution to Oink and to our group's efforts to ship industrial-strength programming language tools that eliminate whole classes of security bugs once and for all.

Note that the release tarball of Oink you will find at the Oink site is very old, and we have something much better that we will ship soon. Any student working on this project would be given direct access to the five CVS repositories required.

References:
Elsa: http://www.cs.berkeley.edu/~smcpeak/elkhound/sources/elsa/
Oink: http://www.cs.berkeley.edu/~dsw/oink.html
Cqual: http://www.cs.umd.edu/~jfoster/cqual/
Mops: http://www.cs.berkeley.edu/~daw/mops/
Build Interceptor: http://build-interceptor.tigris.org/
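
To give a feel for how small the points-to analysis described in this proposal could be, here is a minimal sketch in standalone C++. It does not use Elsa or Oink at all: FunctionInfo, TargetMap, OverrideMap, and both function names are hypothetical stand-ins invented for illustration, and signatures are assumed to be already normalized into strings. It also assumes that the front end can supply, for each virtual method, the methods that directly override it. Function-pointer calls are resolved to every function with a matching signature; virtual calls are resolved to the method itself plus everything that overrides it, directly or transitively.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a function declaration.
// Elsa's real AST and type classes are far richer and are not used here.
struct FunctionInfo {
    std::string qualifiedName;  // e.g. "Circle::area"
    std::string signature;      // assumed pre-normalized, e.g. "double()"
};

// Signature -> names of all functions an indirect call of that type may reach.
using TargetMap = std::map<std::string, std::set<std::string>>;

// Function pointers: group every function by its signature, so that a call
// through a pointer of a given type may reach any function of that type.
TargetMap buildFunctionPointerTargets(const std::vector<FunctionInfo>& functions) {
    TargetMap targets;
    for (const FunctionInfo& f : functions) {
        assert(!f.signature.empty());  // this sketch requires a signature string
        targets[f.signature].insert(f.qualifiedName);
    }
    return targets;
}

// Virtual method -> methods that directly override it (assumed to be
// supplied by the front end; built by hand in this sketch).
using OverrideMap = std::map<std::string, std::vector<std::string>>;

// Virtual calls: the possible targets are the named method plus everything
// reachable from it through the direct-override relation.
std::set<std::string> virtualCallTargets(const std::string& method,
                                         const OverrideMap& directOverriders) {
    std::set<std::string> targets;
    std::vector<std::string> worklist{method};
    while (!worklist.empty()) {
        std::string current = worklist.back();
        worklist.pop_back();
        if (!targets.insert(current).second) {
            continue;  // already visited
        }
        auto it = directOverriders.find(current);
        if (it == directOverriders.end()) {
            continue;  // nothing overrides this method
        }
        for (const std::string& overrider : it->second) {
            worklist.push_back(overrider);
        }
    }
    return targets;
}
```

Both results are deliberately over-approximate: no flow or context information is used, so a call site may be given targets it can never actually reach. That is the price of the "dumbest" design, and a real tool would refine it only if the imprecision turned out to matter.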