Code & Data

We support the development of open-source software in research. We make several of our research tools and datasets publicly available, so that the scientific community can reproduce our results and build on our work.

Adversarial Learning

Scaler: Image-Scaling Attacks in Machine Learning

This project studies image-scaling attacks, a new form of attack that allows an adversary to manipulate images so that their content changes during downscaling. Image-scaling attacks are a considerable threat, as scaling is omnipresent in computer vision. Moreover, these attacks are agnostic to the learning model and training data, affecting any learning-based system operating on images.

Code Data Paper
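
The core idea behind image-scaling attacks can be illustrated with a toy sketch. It assumes a simplified nearest-neighbor downscaler that keeps only every `factor`-th pixel; the function names `downscale_nearest` and `embed_target` are hypothetical and not taken from the Scaler code. Attacks on real scaling libraries have to account for the concrete scaling algorithm, so this is only meant to show why changing a few pixels can be enough.

```python
def downscale_nearest(image, factor):
    """Toy downscaler: keep every `factor`-th pixel in both dimensions."""
    return [row[::factor] for row in image[::factor]]


def embed_target(source, target, factor):
    """Overwrite only the pixels that the toy downscaler will sample."""
    attacked = [row[:] for row in source]  # copy, leave the source untouched
    for i, target_row in enumerate(target):
        for j, value in enumerate(target_row):
            attacked[i * factor][j * factor] = value
    return attacked


# Assumed example data: an 8x8 black image and a 2x2 white target.
source = [[0] * 8 for _ in range(8)]
target = [[255, 255], [255, 255]]

attacked = embed_target(source, target, factor=4)

# Count how many pixels of the large image were modified.
changed = sum(
    a != s
    for row_a, row_s in zip(attacked, source)
    for a, s in zip(row_a, row_s)
)

print("modified pixels:", changed, "of", 8 * 8)
print("downscaled result:", downscale_nearest(attacked, factor=4))
```

In this sketch only the sampled pixels are modified, yet the downscaled output equals the target image, while most of the large image still shows the original content.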

Elsa: Explainable Learning in Security Applications

In this project, we study techniques of explainable machine learning in security applications. We find that the explanations generated by these techniques can differ significantly depending on the security task and learning model. At the same time, it is unclear how explanations can be compared in order to decide whether one method is “better” than another. As a result, we devise novel criteria for comparing and evaluating explanation methods in computer security.

Code Data Paper

Imitator: Misleading Code Stylometry using Adversarial Learning

In this project, we attack methods for authorship attribution of source code using adversarial learning. We exploit the fact that these methods rest on machine learning and thus can be deceived by adversarial examples of source code. Our attack performs a series of semantics-preserving code transformations that mislead the attribution but appear plausible to a developer. Our attack and the datasets are publicly available.

Code Data Paper

Twins: Adversarial Learning and Digital Watermarking

In this research project, we explore similarities between machine learning and digital watermarking under attack. As part of the project, we have developed a unified view of attacks in both domains and created a framework for modeling evasion and poisoning attacks. The code and datasets of our case studies are publicly available.

Code Data Paper

Vulnerability Discovery

Joern: A Robust Tool for Static Code Analysis

Joern is a tool for robust analysis of C/C++ code. It generates abstract syntax trees, control flow graphs and searchable indexes of code constructs. It has been specifically designed to meet the needs of code auditors, who often find themselves in a situation where constructing a working build environment is not feasible. Joern enables one to write quick-and-dirty but language-aware static analysis tools.

Code Paper

Pulsar: Protocol Learning, Simulation and Stateful Fuzzing

Pulsar is a network fuzzer with automatic protocol learning and simulation capabilities. The tool models a protocol using machine learning techniques. The learned models can be used to simulate communication between Pulsar and a real client or server. In combination with a series of fuzzing primitives, this makes it possible to test the implementation of an unknown protocol for errors in deeper states of its state machine.

Code Paper

Malware Detection and Analysis

Drebin: Dataset of Malicious Android Applications

The Drebin dataset consists of roughly 5,000 malicious Android applications that have been collected as part of the Mobile Sandbox project between 2010 and 2012. The dataset can be used to experiment with Android malware and compare different detection approaches.

Data Paper

Adagio: Structural Analysis and Detection of Android Malware

Adagio is a collection of Python modules for analyzing and detecting Android malware. These modules make it possible to extract labeled call graphs from Android APKs or DEX files and apply an explicit feature map that captures their structural relationships. Additional modules provide classes for designing binary or multiclass classification experiments and applying machine learning for detection of malicious structure.

Code Paper

Malheur: Automatic Analysis of Malware Behaviour

Malheur is a tool for the automatic analysis of program behavior recorded from malicious software (malware). It has been designed to support the regular analysis of malicious software and the development of detection and defense measures. Malheur allows for identifying novel classes of malware with similar behavior and assigning unknown malware to discovered classes using machine learning.

Code Data Paper

Data Analysis

Harry: A Tool for Measuring String Similarity

Harry is a small tool for comparing strings and measuring their similarity. The tool supports several common distance and kernel functions for strings, such as the Levenshtein (edit) distance, the Jaro-Winkler distance and the compression distance. Harry is implemented using OpenMP, such that its runtime scales linearly with the number of available CPU cores.

Code Paper
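
As an illustration of one of the measures named above, the following is a minimal sketch of the Levenshtein (edit) distance using the textbook dynamic-programming formulation. It is not Harry's implementation; the function name `levenshtein` is chosen here for illustration only.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character insertions,
    deletions and substitutions needed to turn `a` into `b`."""
    # previous[j] holds the distance between the current prefix of `a`
    # and the first j characters of `b`.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # prints 3
```

Keeping only two rows of the table limits memory use to the length of the second string, which matters when many long strings are compared.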

Sally: A Tool for Embedding Strings in Vector Spaces

Sally is a small tool for mapping a set of strings to a set of vectors. This mapping is referred to as embedding and allows for applying techniques of machine learning and data mining to the analysis of string data. Sally can be applied to several types of string data, such as text documents, DNA sequences or log files, and it can handle common input formats such as directories, archives and text files.

Code Paper
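
A simple way to picture such an embedding is to count substrings. The sketch below assumes character n-grams as features and raw counts as vector entries; both are illustrative assumptions, and the functions `ngrams` and `embed` are hypothetical names, not part of Sally's interface.

```python
from collections import Counter


def ngrams(string, n):
    """Return all overlapping substrings of length n."""
    return [string[i:i + n] for i in range(len(string) - n + 1)]


def embed(strings, n=2):
    """Map each string to a vector of n-gram counts.

    Returns the shared vocabulary (one dimension per n-gram) and
    one count vector per input string.
    """
    counts = [Counter(ngrams(s, n)) for s in strings]
    vocabulary = sorted(set().union(*counts))
    vectors = [[c[gram] for gram in vocabulary] for c in counts]
    return vocabulary, vectors


vocabulary, vectors = embed(["abab", "abc"], n=2)
print(vocabulary)  # prints ['ab', 'ba', 'bc']
print(vectors)     # prints [[2, 1, 0], [1, 0, 1]]
```

Once strings are represented as vectors in a common space, standard learning methods that expect numeric input can be applied to them directly.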

Salad: A Content Anomaly Detector based on n-Grams

Salad is an efficient and flexible implementation of the well-known anomaly detection method Anagram. The method uses n-grams (substrings of length n) maintained in a Bloom filter for efficiently detecting anomalies in large sets of string data. Salad extends the original method by supporting n-grams of bytes and words as well as training with two classes.

Code Paper
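
The combination of n-grams and a Bloom filter can be sketched as follows. The filter size, the number of hash functions, the use of SHA-256 and the scoring by fraction of unseen n-grams are assumptions made for this illustration; the names `BloomFilter`, `train` and `anomaly_score` are hypothetical and do not describe Salad's actual interface.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter over byte strings (illustrative parameters)."""

    def __init__(self, size_bits=1 << 16, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive several bit positions by salting one hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def byte_ngrams(data, n):
    """Return all overlapping byte substrings of length n."""
    return [data[i:i + n] for i in range(len(data) - n + 1)]


def train(bloom, samples, n):
    """Store every n-gram of the training samples in the filter."""
    for sample in samples:
        for gram in byte_ngrams(sample, n):
            bloom.add(gram)


def anomaly_score(bloom, data, n):
    """Fraction of n-grams in `data` that were not seen during training."""
    grams = byte_ngrams(data, n)
    if not grams:
        return 0.0
    unseen = sum(gram not in bloom for gram in grams)
    return unseen / len(grams)


bloom = BloomFilter()
train(bloom, [b"GET /index.html HTTP/1.1"], n=3)

print(anomaly_score(bloom, b"GET /index.html HTTP/1.1", n=3))
print(anomaly_score(bloom, b"\x90\x90\x90\x90\x90\x90", n=3))
```

A Bloom filter never reports a stored n-gram as missing, so data seen during training always scores zero. It can report false positives, which is the price for the compact, fixed-size representation.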

Security Analysis

A Study on the Effectivity of Jailbreak Detection in Banking Apps

Jailbreaks remove vital security mechanisms that are necessary to ensure a trusted environment in which sensitive data, such as login credentials and transaction numbers (TANs), can be protected. We find that all but one of the banking apps available in the iOS App Store can be fully compromised by trivial means, without reverse engineering, manipulating the app, or other sophisticated attacks.

Code Paper

Security Analysis of Devolo HomePlug Devices

We have conducted a thorough security analysis of so-called HomePlug devices by Devolo, which are used to establish network communication over power lines. We have identified multiple security issues and find that hundreds of vulnerable devices are openly connected to the Internet across Europe. Of these, 87% run outdated firmware, which shows the deficiency of manual updates in comparison to automatic ones.

Code Paper
