Protocol reverse engineering, the process of extracting the application-level
protocol used by an implementation without access to the protocol specification,
has become increasingly important for network security. Knowledge of
application-level protocol format is essential for many network security
applications, such as vulnerability discovery, intrusion detection systems,
protocol analyzers for network monitoring and signature-based filtering,
fingerprint generation, application dialog replay, detecting services running
on non-standard ports, and mapping traffic to applications. Many protocols in
use, especially on enterprise networks, are closed protocols (i.e., they have no
publicly available specification). Even for protocols with a publicly
available specification, certain implementations may not exactly follow the
specification.
Because protocol reverse engineering recovers the application-level protocol
directly from an implementation, it is an invaluable tool for the above network
security applications. Currently, protocol reverse
engineering is mostly a painstaking manual task. Attempts to reverse engineer
closed protocols such as the MSN Messenger and SMB protocols from Microsoft (the
latter by the Samba project), the Yahoo Messenger protocol, or the OSCAR and ICQ
protocols from AOL have all been long-term efforts lasting many years. In
addition, protocol reverse
engineering is not a once-and-done effort, since existing protocols are often
extended to support new functionality. Thus, to reverse engineer a protocol in
a timely manner and sustain the effort over time, we need automatic methods.
With Reverser, we aim to develop an automatic tool for protocol reverse
engineering. Reverser's methods are based on binary analysis, using a mix of
static and dynamic analysis. In particular, Reverser relies on a key intuition:
the way that an implementation of the protocol processes the received
application data reveals a wealth of information about the protocol format.
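The intuition can be illustrated with a toy sketch. It is not Reverser's
actual analysis: the message layout (one type byte, one length byte, then a
payload), the TracedBuffer class, and the toy_parser function are all
hypothetical, and an instrumented Python buffer stands in for dynamic binary
analysis. The point is only that recording which input bytes each processing
step consumes is enough to recover field boundaries.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TracedBuffer:
    """Received bytes plus a log of every (start, end) range that was read."""
    data: bytes
    accesses: List[Tuple[int, int]] = field(default_factory=list)

    def read(self, start: int, length: int) -> bytes:
        # Each call stands in for one step of the implementation consuming input.
        self.accesses.append((start, start + length))
        return self.data[start:start + length]


def toy_parser(buf: TracedBuffer) -> Dict[str, object]:
    """Hypothetical implementation under analysis (format is an assumption)."""
    msg_type = buf.read(0, 1)[0]    # 1-byte message type
    length = buf.read(1, 1)[0]      # 1-byte payload length
    payload = buf.read(2, length)   # variable-length payload
    return {"type": msg_type, "payload": payload}


def infer_fields(buf: TracedBuffer) -> List[Tuple[int, int]]:
    """Field boundaries recovered from the access log, in offset order."""
    return sorted(set(buf.accesses))


message = bytes([0x01, 0x05]) + b"hello"
buf = TracedBuffer(message)
parsed = toy_parser(buf)
fields = infer_fields(buf)
print(fields)
```

The recovered ranges depend only on how the parser touched the input, not on
any prior knowledge of the format, which is the property the intuition relies
on.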
The ability to accurately replay application protocol
dialogs is useful in many security-oriented applications,
such as replaying an exploit for forensic analysis or
demonstrating an exploit to a third party.
A central challenge in application dialog replay is that the dialog
intended for the original host will likely not be accepted
by another host without modification. For example, the dialog may
include or rely on state specific to the original host, such as
its hostname or a known cookie. In such cases, a
straightforward byte-by-byte replay to a host whose state differs
(e.g., a different hostname) from that of the original dialog participant will
likely fail. These state-dependent protocol fields must be updated to reflect
the state of the new host for replay to succeed.
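A minimal sketch of the problem, under stated assumptions: the greeting format
"HELLO <hostname>" and the functions server_accepts and rewrite_hostname are
hypothetical and exist only for illustration. The recorded message is rejected
by a second host when replayed byte for byte, and accepted once the
hostname field is updated.

```python
def server_accepts(server_hostname: str, message: bytes) -> bool:
    """Hypothetical server: accepts a greeting only if it names this host."""
    expected = b"HELLO " + server_hostname.encode("ascii") + b"\r\n"
    return message == expected


def rewrite_hostname(message: bytes, old: str, new: str) -> bytes:
    """Update the state-dependent field (the hostname) for the new host."""
    return message.replace(old.encode("ascii"), new.encode("ascii"), 1)


# Dialog captured against the original host (hostnames are made up).
recorded = b"HELLO alpha.example\r\n"

# Byte-by-byte replay to a different host fails.
naive_ok = server_accepts("beta.example", recorded)

# Replay succeeds once the state-dependent field reflects the new host.
rewritten = rewrite_hostname(recorded, "alpha.example", "beta.example")
replay_ok = server_accepts("beta.example", rewritten)

print(naive_ok, replay_ok)
```

Here the state-dependent field is known in advance. In practice, identifying
which fields depend on host state, and how, is the hard part of the problem.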
With Replayer, we aim to develop an automatic tool for application
protocol dialog replay. As a first step, we formally define the replay problem
and create the first sound solution to it: replay succeeds whenever our
approach yields an answer. To achieve this goal, Replayer is based on binary
analysis, making a novel use of program verification techniques such as
theorem proving and weakest precondition.
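To make the weakest precondition idea concrete, here is a small sketch. It is
not Replayer's implementation: the two-statement check on a single input byte
is hypothetical, predicates are modeled as Python functions over a state
dictionary, and an exhaustive search over the 256 byte values stands in for a
theorem prover. It uses the standard rules wp(x := e, Q) = Q[e/x] and
wp(S1; S2, Q) = wp(S1, wp(S2, Q)).

```python
from typing import Callable, Dict, List, Optional

State = Dict[str, int]
Predicate = Callable[[State], bool]
Statement = Callable[[State], State]


def assign(var: str, expr: Callable[[State], int]) -> Statement:
    """Build the statement `var := expr(state)`."""
    def run(state: State) -> State:
        new_state = dict(state)
        new_state[var] = expr(state)
        return new_state
    return run


def wp(program: List[Statement], post: Predicate) -> Predicate:
    """Weakest precondition of a straight-line program, built back to front."""
    pre = post
    for stmt in reversed(program):
        # wp(stmt, q) holds in s exactly when q holds after running stmt on s.
        pre = (lambda s, stmt=stmt, q=pre: q(stmt(s)))
    return pre


# Hypothetical check a receiving host performs on one input byte.
program = [
    assign("t", lambda s: s["inp"] ^ 0x5A),
    assign("ok", lambda s: (s["t"] + 3) & 0xFF),
]


def accepted(state: State) -> bool:
    """Post-condition: the host accepts the input."""
    return state["ok"] == 0x10


precondition = wp(program, accepted)


def solve() -> Optional[int]:
    """Exhaustive search (a stand-in for a theorem prover) for a valid byte."""
    for candidate in range(256):
        if precondition({"inp": candidate}):
            return candidate
    return None  # no answer is reported rather than a wrong one


answer = solve()
print(answer)
```

Any byte returned by solve satisfies the precondition, so running the program
on it reaches the accepting state. That mirrors the soundness property
described above: an answer, when one is produced, leads to a successful
replay.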