BitBlaze

Binary Analysis for COTS Protection and Malicious Code Defense


Research Statement and Overview

Binary analysis is imperative for protecting COTS (commercial off-the-shelf) programs and for analyzing and defending against the myriad of malicious code, where source code is unavailable and the binary may even be obfuscated. Binary analysis also provides the ground truth about program behavior, since computers execute binaries (executables), not source code. However, binary analysis is challenging due to the lack of higher-level semantics. Many higher-level techniques are inadequate for analyzing even benign binaries, let alone potentially malicious ones. Thus, we need to develop tools and techniques that work at the binary level and can be used to analyze both COTS software and malicious binaries.

The BitBlaze project aims to design and develop a powerful binary analysis platform and to employ that platform to (1) analyze and develop novel COTS protection and diagnostic mechanisms and (2) analyze, understand, and develop defenses against malicious code. The BitBlaze project also strives to open new application areas of binary analysis, providing sound and effective solutions to applications beyond software security and malicious code defense, such as protocol reverse engineering and fingerprint generation.

The BitBlaze project consists of two central research directions: (1) the design and development of the underlying BitBlaze Binary Analysis Platform, and (2) applying the BitBlaze Binary Analysis Platform to real security problems. The two research foci drive each other: as new security problems arise, we develop new analysis techniques. Similarly, we develop new analysis techniques in order to solve known problems better or more efficiently. Below, we give an overview of the two research directions.

The BitBlaze Binary Analysis Platform Project

The underlying BitBlaze Binary Analysis Platform features a novel fusion of static and dynamic analysis techniques, mixed concrete and symbolic execution, and whole-system emulation and binary instrumentation. The BitBlaze platform has different components for each task: Vine, TEMU, and Rudder. The three components in tandem provide the power for effective analysis of real-world binary programs for various applications.

Vine, the static analysis component. Vine provides an intermediate language for assembly (ILA) and an infrastructure for analyzing programs written in this language. ILA is a full language in which programs can be written, type-checked, and then compiled down to assembly. We also provide analyses on the ILA, such as abstract interpretation, dependency analysis, and logical analysis via interfaces with theorem provers.
TEMU, the dynamic analysis component. TEMU provides a dynamic analysis environment through whole-system emulation and dynamic binary instrumentation. TEMU is OS-aware (i.e., it understands OS-level semantics) and serves as a foundation for various fine-grained dynamic analyses, such as dynamic taint analysis and fine-grained behavioral analysis.
Rudder, the component for mixed concrete and symbolic execution. Rudder is an engine for mixed symbolic and concrete execution on binaries. At a high level, given a specified set of input sources of interest, Rudder can automatically explore the different execution paths in a program that are determined by those input sources. For each path it follows, it automatically builds a logical formula representing the constraints that the chosen input must satisfy to take that path.
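The idea behind mixed concrete and symbolic execution can be illustrated with a small sketch. This is not Rudder's interface or implementation: the toy program, the branch-logging helper, and the brute-force solve function are all hypothetical stand-ins. A real engine works on binary instructions and hands the path constraints to a theorem prover or constraint solver, whereas this sketch simply searches a one-byte input space.

```python
from typing import Callable, List, Tuple

# A branch record: (readable condition, predicate over the input, outcome taken).
Branch = Tuple[str, Callable[[int], bool], bool]


def toy_program(x: int, trace: List[Branch]) -> str:
    """Hypothetical program under test; every branch logs its condition."""
    def branch(desc: str, pred: Callable[[int], bool]) -> bool:
        taken = pred(x)
        trace.append((desc, pred, taken))
        return taken

    if branch("x > 100", lambda v: v > 100):
        if branch("x % 7 == 0", lambda v: v % 7 == 0):
            return "rare"
        return "high"
    return "low"


def solve(constraints: List[Branch]):
    """Brute-force stand-in for a solver: find a byte meeting all constraints."""
    for candidate in range(256):
        if all(pred(candidate) == want for _, pred, want in constraints):
            return candidate
    return None


def explore(program, seed: int):
    """Run concretely, then negate each branch on the path to reach new paths."""
    worklist = [seed]
    seen_paths = {}
    while worklist:
        x = worklist.pop()
        trace: List[Branch] = []
        result = program(x, trace)
        path = tuple((desc, taken) for desc, _, taken in trace)
        if path in seen_paths:
            continue
        seen_paths[path] = (x, result)
        for i, (desc, pred, taken) in enumerate(trace):
            # Keep the path prefix, flip the i-th decision, and ask for an input.
            flipped = trace[:i] + [(desc, pred, not taken)]
            new_input = solve(flipped)
            if new_input is not None:
                worklist.append(new_input)
    return seen_paths


paths = explore(toy_program, seed=0)
for path, (example_input, result) in paths.items():
    print(path, "input:", example_input, "result:", result)
```

Starting from a single concrete input, the explorer reaches all three paths of the toy program, including the one that requires an input greater than 100 and divisible by 7. The recorded path tuples play the role of the logical formulas described above.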

Projects for COTS Protection and Malicious Code Defense

Below are some of our current research endeavors using the BitBlaze Binary Analysis platform for both COTS protection and malicious code defense.

Worm Defense
Worms such as CodeRed and SQL Slammer can compromise millions of hosts within hours or even minutes and have caused billions of dollars in estimated damage. How can we design and develop effective defense mechanisms against such fast, large-scale worm attacks?
Sting is an automatic worm defense system which proposes a suite of novel techniques to automatically detect new exploits, perform in-depth diagnosis, and generate effective anti-bodies (vulnerability signatures and hardened binaries) to protect vulnerable hosts and networks from further attacks.
Detection and Analysis of Privacy-Breaching Malware
A myriad of malware, such as keyloggers, Browser Helper Object (BHO) based spyware, rootkits, and backdoors, accesses and leaks users' sensitive information and breaches their privacy. Can we have a unified approach to identify such privacy-breaching malware despite its widely varied appearance?
Panorama proposes a unified approach to detect privacy-breaching malware using whole-system dynamic taint analysis.
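As a rough illustration of dynamic taint analysis, the sketch below marks data read from a sensitive source, propagates the mark through data-moving and arithmetic operations, and reports when marked data reaches an output sink. It is a toy under stated assumptions, not Panorama's implementation: the instruction set, the operand names, and the run_with_taint function are hypothetical, and a whole-system analysis tracks taint across processes and the kernel rather than over a short instruction list.

```python
from typing import Dict, List, Set, Tuple

Instr = Tuple[str, ...]


def run_with_taint(program: List[Instr],
                   sources: Dict[str, str]) -> List[Tuple[str, Set[str]]]:
    """Propagate taint labels through a toy instruction list.

    `sources` maps a sensitive input source to a taint label.
    Returns (sink, labels) for every tainted value that reaches a sink.
    """
    taint: Dict[str, Set[str]] = {}
    leaks: List[Tuple[str, Set[str]]] = []
    for instr in program:
        op = instr[0]
        if op == "read":        # read <dst> <source>: introduce taint
            _, dst, src = instr
            taint[dst] = {sources[src]} if src in sources else set()
        elif op == "const":     # const <dst>: constants carry no taint
            taint[instr[1]] = set()
        elif op == "mov":       # mov <dst> <src>: copy taint
            _, dst, src = instr
            taint[dst] = set(taint.get(src, set()))
        elif op == "add":       # add <dst> <a> <b>: union of operand taint
            _, dst, a, b = instr
            taint[dst] = taint.get(a, set()) | taint.get(b, set())
        elif op == "send":      # send <sink> <src>: check taint at the sink
            _, sink, src = instr
            labels = taint.get(src, set())
            if labels:
                leaks.append((sink, set(labels)))
        else:
            raise ValueError(f"unknown op: {op}")
    return leaks


# Hypothetical keylogger-like behavior: keystrokes flow to the network.
keylogger_like = [
    ("read", "r1", "keyboard"),
    ("const", "r2"),
    ("add", "r3", "r1", "r2"),
    ("mov", "buf", "r3"),
    ("send", "network", "buf"),
    ("send", "logfile", "r2"),
]
leaks = run_with_taint(keylogger_like, {"keyboard": "keystroke"})
print(leaks)
```

Only the network send is reported, because the value written to the log file was derived from a constant and never touched keyboard data.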
Automatic Malware Dissection and Trigger-based Behavior Analysis
Malware often has embedded behavior which is only exhibited when certain conditions are met. Such trigger-based behavior includes time bombs, logic bombs, and botnet programs which react to commands. Static analysis of malware often provides little utility due to code packing and obfuscation. Vanilla dynamic analysis can only provide a limited view, since the trigger conditions are usually not met. How can we design automatic analysis methods to uncover the trigger conditions and trigger-based behavior hidden in malware?
BitScope enables automatic exploration of program execution paths in malware to uncover trigger conditions (such as the time used in time bombs and commands in botnet programs) and trigger-based behavior, using mixed concrete and symbolic execution. BitScope also provides in-depth analysis of the input/output behavior of the malware.
Deviation Detection in Binaries
Many network protocols and services have several different implementations. Automatically identifying deviations between different implementations of the same protocol or service can enable the detection of potential implementation errors without a protocol specification, and can enable automatic generation of fingerprints to identify an implementation remotely. How can we automatically identify such deviations in binaries implementing the same specification?
Deviation Detection automatically identifies deviations in different binaries to detect implementation errors and generate fingerprints. It is achieved by building symbolic formulas that characterize how each binary processes an input.
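The core question, whether some input is handled differently by two implementations, can be sketched as follows. This is an illustrative toy rather than the project's method: the two acceptance functions are hypothetical and written by hand, whereas the approach described above derives symbolic formulas from the binaries themselves and would query a solver instead of enumerating a small input space.

```python
from itertools import product
from typing import Callable, List


def impl_a_accepts(req: bytes) -> bool:
    """Hypothetical implementation A: method must be exactly 'GET' plus a space."""
    return req[:3] == b"GET" and req[3:4] == b" "


def impl_b_accepts(req: bytes) -> bool:
    """Hypothetical implementation B: method is matched case-insensitively."""
    return req[:3].upper() == b"GET" and req[3:4] == b" "


def find_deviations(f: Callable[[bytes], bool],
                    g: Callable[[bytes], bool],
                    alphabet: bytes,
                    length: int) -> List[bytes]:
    """Enumerate inputs on which f and g disagree.

    Brute force stands in for solving the formula f(x) != g(x).
    """
    out = []
    for combo in product(alphabet, repeat=length):
        candidate = bytes(combo)
        if f(candidate) != g(candidate):
            out.append(candidate)
    return out


deviations = find_deviations(impl_a_accepts, impl_b_accepts, b"GETget ", 4)
print(len(deviations), deviations[:3])
```

Each input found this way is either evidence of an implementation error in one of the two programs or a candidate fingerprint probe, since the response to it distinguishes implementation A from implementation B.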
Protocol Reverse Engineering and Application Dialogue Replay
Many network protocols are proprietary or have no well-documented specification. However, many security applications require protocol reverse engineering and application dialogue (network trace) replay.
Reverser and Replayer automatically extract information about network protocols and enable application dialogue replay using binary analysis.

Members

Faculty: Dawn Song

Postdocs:

Zhenkai Liang

Students:

Staff: Ivan Jager

Alumni:

Xeno Kovah
Eric Li