Macaw: A Machine Code Toolbox for the Busy Binary Analyst

On arXiv, a pre-print about the Galois Macaw binary analysis framework and a collection of projects that have been built with it.

Over a decade of development, we have used Macaw to support an industrial research team in building tools for machine code-related tasks. As such, the name ‘Macaw’ refers not just to the framework, but also to a suite of tools built on top of it. We describe Macaw in depth, along with the different static and dynamic analyses that it performs, many powered by an SMT-based symbolic execution engine. We put a particular focus on interoperability between machine code and higher-level languages, including binary lifting from x86 to LLVM, as well as verifying the correctness of mixed C and assembly code.


Measuring the Privacy of Computations

My colleague José Calderón wrote a blog post, Measuring the Privacy of Computations, about some work we did as part of the DARPA Brandeis program.

Secure computation enables users to compute some result without revealing the inputs. Privacy schemes that are shown to only reveal outputs are said to have input privacy. However, learning these outputs still tells you something about the private inputs. The important question is: “how much?”

The implementation for the luigi-qif tool mentioned in this post is available on GitHub.
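To make the "how much?" question concrete, here is a minimal sketch that counts the bits a deterministic output reveals about a secret input. It is not the luigi-qif implementation and says nothing about how that tool works; the function names (`shannon_leakage`, `min_entropy_leakage`), the 4-bit secret, and the uniform prior are all assumptions made for illustration.

```python
import math
from collections import Counter


def shannon_leakage(f, secrets):
    """Bits learned about a uniformly random secret from f(secret).

    Assumes f is deterministic, so the leakage equals the Shannon
    entropy of the output distribution.
    """
    counts = Counter(f(s) for s in secrets)
    n = len(secrets)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def min_entropy_leakage(f, secrets):
    """Min-entropy leakage for a deterministic f and a uniform prior.

    Under these assumptions it is log2 of the number of distinct outputs.
    """
    return math.log2(len({f(s) for s in secrets}))


if __name__ == "__main__":
    secrets = list(range(16))  # hypothetical 4-bit secret input

    examples = {
        "secret >= 8": lambda s: s >= 8,  # one threshold comparison
        "secret % 4": lambda s: s % 4,    # the two low-order bits
        "secret": lambda s: s,            # output reveals everything
    }
    for name, f in examples.items():
        print(name, shannon_leakage(f, secrets), min_entropy_leakage(f, secrets))
```

Even though each function "only reveals its output", the comparison gives away one bit of the secret, the remainder gives away two, and the identity gives away all four.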


MATE: Merged Analysis to prevenT Exploits

Part of DARPA’s CHESS program, MATE takes a human-machine hybrid approach to enable the discovery of highly application-specific vulnerabilities using code property graphs. MATE aims to enable novices and experts to quickly understand security-relevant code and apply automated analyses to find vulnerabilities.

We’ve open sourced MATE! Check out the documentation and source on GitHub.


Composition Challenges in Multi-Variant Execution

Composition Challenges for Automated Software Diversity

Over the past 20 years, a variety of automated software diversity techniques have been proposed. Some techniques randomize aspects of the implementation that are left undefined by the source language specification, such as code layout, stack layout, or locations of heap-allocated objects. Others insert instrumentation or obfuscation that is transparent from an application perspective, e.g. using XOR masks to obscure data values in memory or hiding code pointers using jump tables. A common assumption is that layering these techniques improves security due to increased entropy in the resulting binary. In this paper we examine this assumption and show that it fails to hold in general. In particular, it fails in one of the strongest deployment models for software diversity—that of multiple diverse variants running together in a multi-variant execution environment (MVEE) where attacks manifest as detectable behavioral divergence. We present several examples of diversity combinations that are vulnerable to attack in an MVEE even when none of the component techniques are vulnerable in isolation. Based on these results, we present guidance on which techniques do combine well and suggestions for effective deployment of diversity in MVEEs.
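As a rough illustration of the MVEE detection model mentioned in the abstract, the toy below runs two variants that XOR-mask their data with different keys and compares their outputs. It does not reproduce any of the paper's composition attacks. The `Variant` class, the `run_mvee` monitor, the mask values, and the `is_admin` variable are all hypothetical.

```python
class Variant:
    """One diversified copy of a program; data is stored XOR-masked."""

    def __init__(self, mask):
        self.mask = mask      # per-variant XOR key (assumed diversification)
        self.memory = {}

    def store(self, name, value):
        # Legitimate writes go through the masking instrumentation.
        self.memory[name] = value ^ self.mask

    def load(self, name):
        # Legitimate reads unmask the stored value.
        return self.memory[name] ^ self.mask

    def corrupt(self, name, raw):
        # Models a memory-corruption write that bypasses the masking.
        self.memory[name] = raw


def run_mvee(variants, action):
    """Run the same action in every variant and flag any divergence."""
    outputs = [action(v) for v in variants]
    diverged = len(set(outputs)) > 1
    return outputs, diverged


def benign(v):
    v.store("is_admin", 0)
    return v.load("is_admin")


def attack(v):
    v.store("is_admin", 0)
    v.corrupt("is_admin", 1)  # attacker writes the same raw value everywhere
    return v.load("is_admin")


if __name__ == "__main__":
    for action in (benign, attack):
        variants = [Variant(0x5A), Variant(0xC3)]
        print(action.__name__, run_mvee(variants, action))
```

The benign run produces identical outputs in both variants. The attacker's raw write unmasks to a different value under each key, so the monitor sees a divergence. The paper's point is that this kind of guarantee does not automatically survive when several diversity techniques are layered.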
