BitBlaze
Binary Analysis for COTS Protection and Malicious Code Defense
Binary analysis is imperative for protecting COTS (commercial
off-the-shelf) programs and for analyzing and defending against the
myriad forms of malicious code, where source code is unavailable and
the binary may even be obfuscated. Binary analysis also provides the
ground truth about program behavior, since computers execute binaries
(executables), not source code. However, binary analysis is
challenging because binaries lack higher-level semantics.
Many higher-level
techniques are inadequate for analyzing even benign binaries,
let alone potentially malicious ones.
We therefore need tools and techniques that work at the
binary level and can be used to analyze both COTS software and malicious binaries.
The BitBlaze project aims to design and develop a powerful
binary analysis platform and to employ that platform to (1) analyze and develop novel
COTS protection and diagnostic mechanisms and (2) analyze,
understand, and develop defenses against malicious code. The
BitBlaze project also strives to open new application areas for
binary analysis, providing sound and effective solutions to
applications beyond software security and malicious code defense,
such as protocol reverse engineering and fingerprint generation.
The BitBlaze project consists of two central research directions: (1)
the design and development of the underlying BitBlaze Binary Analysis
Platform, and (2) applying the BitBlaze Binary Analysis Platform to
real security problems. The two research foci drive each other: as
new security problems arise, we develop new analysis
techniques. Similarly, we develop new analysis techniques in order to
better or more efficiently solve known problems. Below, we give an
overview of the two research directions.
The underlying BitBlaze Binary Analysis
Platform features a novel fusion of static and dynamic analysis
techniques, mixed concrete and symbolic execution, and whole-system
emulation and binary instrumentation. The BitBlaze platform has
different components for each task: Vine, TEMU, and
Rudder. The three components in tandem provide the power for
effective analysis of real-world binary programs for various
applications.
- Vine, the static analysis
component. Vine provides an intermediate language for
assembly (ILA) and an infrastructure for analyzing programs
written in this language. ILA is a full language in
which programs can be written, type-checked, and then compiled
down to assembly. We also provide analyses on the
ILA, such as abstract interpretation, dependency analysis, and
logical analysis via interfaces with theorem provers.
- TEMU, the dynamic analysis
component. TEMU provides a dynamic analysis environment
through whole-system emulation and dynamic binary
instrumentation. TEMU is OS-aware (i.e., it understands
OS-level semantics) and provides a foundation on which various
fine-grained dynamic analyses can be built, such as dynamic taint
analysis and fine-grained behavioral analysis.
- Rudder, the component for mixed
concrete and symbolic execution. Rudder is an engine that
performs mixed concrete and symbolic execution on binaries. At a
high level, given a specified set of input sources of
interest, Rudder automatically explores the different
execution paths in a program that depend on those input
sources. It automatically builds logical formulas
representing the constraints the chosen inputs must satisfy
for execution to follow each explored path.
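The path exploration that Rudder performs can be illustrated with a toy
sketch. This is not Rudder's implementation or API: it works on a small
Python function rather than a binary, the names Tracer, target, solve, and
explore are hypothetical, and a brute-force search over one byte stands in
for a theorem prover. It shows only the general idea of running a program
on a concrete input, recording the branch conditions along the path, and
negating one condition at a time to obtain inputs that follow other paths.

```python
from typing import Callable, Dict, List, Optional, Tuple

# One recorded branch: (description, predicate over the input, outcome taken).
Branch = Tuple[str, Callable[[int], bool], bool]


class Tracer:
    """Runs a program on a concrete input and records every branch it takes."""

    def __init__(self, value: int) -> None:
        self.value = value
        self.path: List[Branch] = []

    def branch(self, desc: str, pred: Callable[[int], bool]) -> bool:
        taken = pred(self.value)
        self.path.append((desc, pred, taken))
        return taken


def target(t: Tracer) -> str:
    """Hypothetical program under test with one nested branch."""
    if t.branch("x > 100", lambda x: x > 100):
        if t.branch("x % 7 == 3", lambda x: x % 7 == 3):
            return "deep"
        return "high"
    return "low"


def solve(constraints: List[Branch]) -> Optional[int]:
    """Stand-in for a theorem prover: brute-force search over one byte."""
    for x in range(256):
        if all(pred(x) == want for _, pred, want in constraints):
            return x
    return None


def explore(program: Callable[[Tracer], str], seed: int) -> Dict[tuple, Tuple[int, str]]:
    """Explore paths by negating one recorded branch at a time."""
    worklist = [seed]
    results: Dict[tuple, Tuple[int, str]] = {}
    while worklist:
        x = worklist.pop()
        t = Tracer(x)
        out = program(t)
        key = tuple((d, taken) for d, _, taken in t.path)
        if key in results:
            continue  # this path was already explored
        results[key] = (x, out)
        for i, (d, pred, taken) in enumerate(t.path):
            # Keep the path prefix, flip branch i, and ask for a new input.
            flipped = t.path[:i] + [(d, pred, not taken)]
            new_x = solve(flipped)
            if new_x is not None:
                worklist.append(new_x)
    return results


paths = explore(target, seed=0)
for key, (x, out) in paths.items():
    print(f"input {x:3d} -> {out:4s} via {key}")
```

Each entry in the result pairs a path (the sequence of branch outcomes) with
one concrete input that drives execution down it; the recorded predicates
play the role of the logical formulas described above.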
Projects for COTS Protection and Malicious Code Defense
Below are some of our current research endeavors using the BitBlaze
Binary Analysis platform for both COTS protection and malicious code
defense.
- Worm Defense
Worms such as CodeRed and SQL Slammer can compromise
millions of hosts within hours or even minutes and have
caused billions of dollars in estimated damage. How can
we design and develop effective defense mechanisms
against such fast, large-scale worm attacks?
Sting is an automatic
worm defense system which proposes a suite of novel
techniques to automatically detect new exploits, perform
in-depth diagnosis, and generate effective antibodies
(vulnerability signatures and hardened binaries) to
protect vulnerable hosts and networks from further
attacks.
- Detection and Analysis of Privacy-Breaching Malware
A myriad of malware, such as keyloggers, Browser Helper
Object (BHO) based spyware, rootkits, and backdoors, accesses
and leaks users' sensitive information and breaches
users' privacy. Can we have a unified approach to
identify such privacy-breaching malware despite its
widely varied appearance?
Panorama proposes a
unified approach to detect privacy-breaching malware
using whole-system dynamic taint analysis.
- Automatic Malware Dissection and Trigger-based Behavior Analysis
Malware often has embedded behavior that is exhibited only
when certain conditions are met. Such
trigger-based behavior includes time bombs, logic bombs,
and botnet programs that react to commands. Static
analysis of malware often provides little utility because of
code packing and obfuscation. Vanilla dynamic analysis
can provide only a limited view, since the trigger
conditions are usually not met. How can we design
automatic analysis methods to uncover the trigger
conditions and trigger-based behavior hidden in malware?
BitScope enables
automatic exploration of program execution paths in
malware to uncover trigger conditions (such as the time
used in time bombs and commands in botnet programs) and
trigger-based behavior, using mixed concrete and symbolic
execution. BitScope also provides in-depth analysis of
the input/output behavior of the malware.
- Deviation Detection in Binaries
Many network protocols and services have several
different implementations. Automatically identifying
deviations in different implementations of the same
protocol/service can enable the detection of potential
implementation errors without a protocol specification, and
can enable automatic generation of fingerprints to
identify an implementation remotely. How can we
automatically identify such deviations in binaries
implementing the same specification?
Deviation Detection
automatically identifies deviations between different
binaries to detect implementation errors and generate
fingerprints. It does so by building symbolic formulas
that characterize how each binary processes an input.
- Protocol Reverse Engineering and Application Dialogue Replay
Many network protocols are proprietary or have no
well-documented specification. However, many security
applications require protocol reverse engineering and
application dialogue (network trace) replay.
Reverser and
Replayer automatically extract information about
network protocols and enable application dialogue replay
using binary analysis.
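The idea behind the Deviation Detection project listed above can be
illustrated with a toy sketch. It is not the project's implementation: the
two "implementations" are hypothetical Python predicates (impl_a and impl_b)
rather than formulas derived from binaries, and enumerating a small
candidate set stands in for asking a solver for an input that the two
formulas treat differently. Any input on which the two disagree is a
deviation, which may point to an implementation error or serve as a
fingerprint.

```python
from itertools import product
from typing import Callable, Iterable, List


def impl_a(request: bytes) -> bool:
    """Hypothetical strict server: the method must be exactly 'GET '."""
    return request[:4] == b"GET " and request[4:5] == b"/"


def impl_b(request: bytes) -> bool:
    """Hypothetical lenient server: the method is matched case-insensitively."""
    return request[:4].upper() == b"GET " and request[4:5] == b"/"


def find_deviations(
    f: Callable[[bytes], bool],
    g: Callable[[bytes], bool],
    candidates: Iterable[bytes],
) -> List[bytes]:
    """Return every candidate input that the two implementations treat differently."""
    return [c for c in candidates if f(c) != g(c)]


# Small hand-picked input space (an assumption made for illustration only).
methods = [b"GET ", b"get ", b"PUT "]
targets = [b"/", b"x"]
candidates = [m + p for m, p in product(methods, targets)]

deviations = find_deviations(impl_a, impl_b, candidates)
print(deviations)
```

In this sketch the only deviating input is the lowercase request, which the
lenient implementation accepts and the strict one rejects; sending it to a
remote server would reveal which of the two implementations is running.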
Faculty:
Dawn Song
Postdocs:
Students:
Staff: Ivan Jager
Alumni: