We have packaged CIL as an application cilly that contains certain example modules, such as logwrites.ml (a module that instruments code to print the addresses of memory locations being written). Normally, you write another module like that, add command-line options and an invocation of your module in src/main.ml. Once you compile CIL you will obtain the file obj/cilly.asm.exe.
We wrote a driver for this executable that makes it easy to invoke your analysis on existing C code with very little manual intervention. This driver is bin/cilly and is quite powerful. Note that the cilly script is configured during installation with the path where CIL resides. This means that you can move it to any place you want.
A simple use of the driver is:
bin/cilly --save-temps -D HAPPY_MOOD -I myincludes hello.c -o hello
--save-temps tells CIL to save the resulting output files in the current directory. Otherwise, they’ll be put in /tmp and deleted automatically. Not that this is the only CIL-specific flag in the list – the other flags use gcc’s syntax.
This performs the following actions:
Note that cilly behaves like the gcc compiler. This makes it easy to use it with existing Makefiles:
make CC="bin/cilly" LD="bin/cilly"
cilly can also behave as the Microsoft Visual C compiler, if the first argument is --mode=MSVC:
bin/cilly --mode=MSVC /D HAPPY_MOOD /I myincludes hello.c /Fe hello.exe
(This in turn will pass a --MSVC flag to the underlying cilly.asm process which will make it understand the Microsoft Visual C extensions)
cilly can also behave as the archiver ar, if it is passed an argument --mode=AR. Note that only the cr mode is supported (create a new archive and replace all files in there). Therefore the previous version of the archive is lost. You will also need to remove any other commands that operate on the generated library (e.g. ranlib, lorder), as the .a file is no longer an actual binary library.
Furthermore, cilly allows you to pass some arguments on to the underlying cilly.asm process. As a general rule all arguments that start with -- and that cilly itself does not process, are passed on. For example,
bin/cilly --dologwrites -D HAPPY_MOOD -I myincludes hello.c -o hello.exe
will produce a file hello.cil.c that prints all the memory addresses written by the application.
The most powerful feature of cilly is that it can collect all the sources in your project, merge them into one file and then apply CIL. This makes it a breeze to do whole-program analysis and transformation. All you have to do is to pass the --merge flag to cilly:
make CC="bin/cilly --save-temps --dologwrites --merge"
You can even leave some files untouched:
make CC="bin/cilly --save-temps --dologwrites --merge --leavealone=foo --leavealone=bar"
This will merge all the files except those with the basename foo and bar. Those files will be compiled as usual and then linked in at the very end.
The sequence of actions performed by cilly depends on whether merging is turned on or not:
Note that files that you specify with --leavealone are not merged and never presented to CIL. They are compiled as usual and then are linked in at the end.
And a final feature of cilly is that it can substitute copies of the system’s include files:
make CC="bin/cilly --includedir=myinclude"
This will force the preprocessor to use the file myinclude/xxx/stdio.h (if it exists) whenever it encounters #include <stdio.h>. The xxx is a string that identifies the compiler version you are using. This modified include files should be produced with the patcher script (see Section 14).
Among the options for the cilly you can put anything that can normally go in the command line of the compiler that cilly is impersonating. cilly will do its best to pass those options along to the appropriate subprocess. In addition, the following options are supported (a complete and up-to-date list can always be obtained by running cilly --help):
All of the options that start with -- and are not understood by cilly are passed on to cilly.asm. cilly also passes along to cilly.asm flags such as --MSVC that both need to know about. The following options are supported. Many of these flags also have corresponding “--no*” versions if you need to go back to the default, as in “--nowarnall”.
General Options:
If available, CIL uses the x86 performance counters for these stats. This is very precise, but results in “wall-clock time.” To report only user-mode time, find the call to Stats.reset in main.ml, and change it to Stats.reset Stats.SoftwareTimer.
Lowering Options
Output Options:
Selected features. See Section 8 for more information.
For an up-to-date list of available options, run cilly.asm --help.
All of the cilly.asm options described above can be set programmatically – see src/ciloptions.ml or the individual extensions to see how. Some options should be set before parsing to be effective.
Additionally, a few CIL options have no command-line flag and can only be set programmatically. These options may be useful for certain analyses:
When false, all casts in the program are made explicit using the CastE expression. Accordingly, the destination of a Call instruction will always have the same type as the function’s return type.
If true, the destination type of a Call may differ from the return type, so there is an implicit cast. This is useful for analyses involving malloc. Without this option, CIL converts “T* x = malloc(n);” into “void* tmp = malloc(n); T* x = (T*)tmp;”. If you don’t need all casts to be made explicit, you can set Cabs2cil.doCollapseCallCast to true so that CIL won’t insert a temporary and you can more easily determine the allocation type from calls to malloc.
The --envmachine option tells CIL to get its machine model from the CIL_MACHINE environment variable, rather than use the model compiled in to CIL itself. This is necessary when using CIL as part of a cross-compilation setup, and to handle gcc’s -m32 and -m64 which select between a 32-bit and 64-bit machine model.
CIL_MACHINE is a space-separated list of key=value pairs. Unknown keys are ignored. The following keys must be defined:
Key | Value |
bool | sizeof(_Bool),alignof(_Bool) |
short | sizeof(short),alignof(short) |
int | sizeof(int),alignof(int) |
long | sizeof(long),alignof(long) |
long_long | sizeof(long long),alignof(long long) |
float | sizeof(float),alignof(float) |
double | sizeof(double),alignof(double) |
long_double | sizeof(long double),alignof(long double) |
pointer | sizeof(all pointers),alignof(all pointers) |
enum | sizeof(all enums),alignof(all enums) |
fun | sizeof(all fn ptrs),alignof(all fn ptrs) |
alignof_string | alignof(all string constants) |
max_alignment | maximum alignment for any type |
size_t | the definition of size_t |
wchar_t | the definition of wchar_t |
char_signed | true if char is signed |
big_endian | true for big-endian machine models |
const_string_literals | true if string constants are const |
__thread_is_keyword | true if __thread is a keyword |
__builtin_va_list | true if __builtin_va_list is a builtin type |
underscore_name | true if generated symbols preceded by _ |
Some notes:
As an example, here’s the CIL_MACHINE value for 64-bit targets on an x86-based Mac OS X machine:
short=2,2 int=4,4 long=8,8 long_long=8,8 pointer=8,8 enum=4,4 float=4,4 double=8,8 long_double=16,16 void=1 bool=1,1 fun=1,1 alignof_string=1 max_alignment=16 size_t=unsigned_long wchar_t=int char_signed=true const_string_literals=true big_endian=false __thread_is_keyword=true __builtin_va_list=true underscore_name=true