Dissecting Privacy Risks in Biomedical Data
Pascal Berrang (CISPA, Saarland University), Mathias Humbert (Swiss Data Science Center, ETH/EPFL), Yang Zhang (CISPA, Saarland University), Irina Lehmann (Helmholtz Centre for Environmental Research Leipzig, UFZ, Leipzig), Roland Eils (German Cancer Research Center (DKFZ) & University of Heidelberg), Michael Backes (CISPA, Saarland University)
The decreasing costs of molecular profiling have fueled the biomedical research community with a plethora of new types of biomedical data, enabling a breakthrough towards more precise and personalized medicine. However, the release of these intrinsically highly sensitive data poses severe new privacy threats. While biomedical data is largely associated with our health, there also exist various correlations between different types of biomedical data, along the temporal dimension, and between family members. So far, however, the security community has focused on privacy risks stemming from genomic data, largely overlooking the manifold interdependencies between other kinds of biomedical data. In this paper, we present a generic framework for quantifying the privacy risks in biomedical data that takes into account the various interdependencies between data (i) of different types, (ii) from different individuals, and (iii) at different points in time. To this end, we rely on a Bayesian network model that allows us to take all the aforementioned dependencies into account and run exact probabilistic inference attacks very efficiently. Furthermore, we introduce a generic algorithm for building the Bayesian network, which incorporates expert knowledge for known dependencies, such as genetic inheritance laws, and learns previously unknown dependencies from the data. We then conduct a thorough inference risk evaluation on a very rich dataset containing genomic and epigenomic data of mothers and children over multiple years. Beyond effective probabilistic inference, we further demonstrate that our Bayesian network model can serve as a building block for other attacks: with our framework, an adversary can efficiently identify parent-child relationships based on methylation data with a success rate of 95%.
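The inference machinery at the core of such a framework can be pictured with a minimal sketch: exact inference by enumeration in a toy Bayesian network encoding Mendelian inheritance for a single biallelic SNP. The allele frequency and network structure below are illustrative assumptions, not the paper's far richer model.

    # Toy Bayesian network: mother genotype -> child genotype, with the
    # unobserved father summed out under a Hardy-Weinberg prior.
    P_ALLELE_A = 0.7  # assumed population frequency of allele 'A'

    def genotype_prior(g):
        p, q = P_ALLELE_A, 1 - P_ALLELE_A
        return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}[g]

    def transmitted(g):
        # probability a parent with genotype g transmits allele 'A'
        return {"AA": 1.0, "Aa": 0.5, "aa": 0.0}[g]

    def child_given_parents(c, m, f):
        # Mendelian CPT: P(child genotype | mother, father)
        pm, pf = transmitted(m), transmitted(f)
        return {"AA": pm * pf,
                "Aa": pm * (1 - pf) + (1 - pm) * pf,
                "aa": (1 - pm) * (1 - pf)}[c]

    def infer_child(mother):
        # exact inference by enumeration: sum out the unobserved father
        return {c: sum(genotype_prior(f) * child_given_parents(c, mother, f)
                       for f in ("AA", "Aa", "aa"))
                for c in ("AA", "Aa", "aa")}

    print(infer_child("Aa"))  # adversary observes only the mother's genotype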
CRYSTALS - Kyber: a CCA-secure module-lattice-based KEM
Joppe Bos (NXP), Léo Ducas (CWI), Eike Kiltz (Ruhr-University Bochum), Tancrède Lepoint (SRI), Vadim Lyubashevsky (IBM Research), John M. Schanck (University of Waterloo), Peter Schwabe (Radboud University), Damien Stehlé (Université de Lyon)
Rapid advances in quantum computing, together with the announcement by the National Institute of Standards and Technology (NIST) that it will define new standards for digital-signature, encryption, and key-establishment protocols, have created significant interest in post-quantum cryptographic schemes. This paper introduces Kyber (part of CRYSTALS, the Cryptographic Suite for Algebraic Lattices, a package submitted to the NIST post-quantum standardization effort in November 2017), a portfolio of post-quantum cryptographic primitives built around a key-encapsulation mechanism (KEM) based on hardness assumptions over module lattices. Our KEM is most naturally seen as a successor to the NEWHOPE KEM (Usenix 2016). In particular, the key and ciphertext sizes of our new construction are about half as large, the KEM offers CCA security instead of only passive security, the security is based on a more general (and flexible) lattice problem, and our optimized implementation results in essentially the same running time as the aforementioned scheme. We first introduce a CPA-secure public-key encryption scheme, apply a variant of the Fujisaki–Okamoto transform to create a CCA-secure KEM, and eventually construct, in a black-box manner, CCA-secure encryption, key-exchange, and authenticated-key-exchange schemes. The security of our primitives is based on the hardness of Module-LWE in the classical and quantum random oracle models, and our concrete parameters conservatively target more than 128 bits of post-quantum security.
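The conversion the abstract outlines (a CPA-secure PKE turned into a CCA-secure KEM via a Fujisaki–Okamoto-style transform) can be sketched schematically. The placeholder "PKE" below is a toy with no security whatsoever and bears no relation to Module-LWE; only the derive-coins-from-message, re-encrypt-and-compare structure, here with implicit rejection, reflects the kind of construction described.

    import hashlib, os

    def H(*parts):
        h = hashlib.sha3_256()
        for p in parts:
            h.update(p)
        return h.digest()

    def xor(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    # Toy stand-in PKE: NOT secure, NOT lattice-based; it only provides
    # the deterministic Enc(pk, m; coins) / Dec(sk, c) interface.
    def keygen():
        sk = os.urandom(32)
        return H(b"pk", sk), sk

    def enc(pk, m, coins):
        return coins + xor(m, H(b"pad", pk, coins))

    def dec(sk, c):
        pk = H(b"pk", sk)
        return xor(c[32:], H(b"pad", pk, c[:32]))

    # FO-style KEM: coins are derived from the message, so decapsulation
    # can re-encrypt and verify the ciphertext before releasing a key.
    def encaps(pk):
        m = os.urandom(32)
        kbar, r = H(b"G1", m, H(b"hpk", pk)), H(b"G2", m, H(b"hpk", pk))
        c = enc(pk, m, r)
        return c, H(b"kdf", kbar, H(b"hc", c))

    def decaps(sk, pk, c):
        m = dec(sk, c)
        kbar, r = H(b"G1", m, H(b"hpk", pk)), H(b"G2", m, H(b"hpk", pk))
        if enc(pk, m, r) == c:
            return H(b"kdf", kbar, H(b"hc", c))
        return H(b"kdf", H(b"rej", sk, c), H(b"hc", c))  # implicit rejection

    pk, sk = keygen()
    c, k1 = encaps(pk)
    assert k1 == decaps(sk, pk, c)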
Masters of Time: An Overview of the NTP Ecosystem
Teemu Rytilahti (Horst Görtz Institute for IT-Security, Ruhr University Bochum), Dennis Tatang (Horst Görtz Institute for IT-Security, Ruhr University Bochum), Janosch Köpper (Horst Görtz Institute for IT-Security, Ruhr University Bochum), Thorsten Holz (Horst Görtz Institute for IT-Security, Ruhr University Bochum)
The Network Time Protocol (NTP) is currently the most commonly used approach to keeping the clocks of computing devices accurate. It operates in the background of many systems, yet it is critical: if NTP fails to provide the correct time, applications that depend on it, such as security protocols like TLS, can fail as well. Despite this crucial practical role, only a limited number of measurement studies have focused on the NTP ecosystem. In this paper, we report the results of an in-depth longitudinal study of the services provided by the NTP Pool Project, which enables volunteers to offer their NTP services to other Internet users in a straightforward manner. We supplement these observations with an analysis of other readily available NTP servers, such as those offered by OS vendors or those that can be freely found on the Internet. The analysis indicates a reliance on a small set of servers that are (at least indirectly) responsible for providing time for the Internet. Furthermore, we consider the impact of several incidents we observed between December 2016 and April 2017. To complement this study, we also perform an analysis of multiple geographical regions from the operator's perspective, spanning a period of five months. A coarse-grained categorization of client requests allows us to classify 95 percent of our incoming traffic as NTP- and SNTP-like (the latter being a simpler, but more error-prone, form of NTP); we observe that up to 75 percent of all requests originated from SNTP-like clients. With this in mind, we consider what kind of harm a rogue server administrator could cause to users.
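For readers unfamiliar with the SNTP-like clients that the traffic categorization refers to, the following minimal sketch issues a single mode-3 query and does a one-shot read of the server's transmit timestamp, with none of the filtering and clock discipline of full NTP. The pool hostname is only an example, and the snippet requires network access.

    import socket, struct, time

    NTP_TO_UNIX = 2208988800  # seconds between 1900-01-01 and 1970-01-01

    def sntp_time(server="pool.ntp.org", timeout=5.0):
        packet = bytearray(48)
        packet[0] = 0x23  # LI=0, VN=4, Mode=3 (client request)
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            s.settimeout(timeout)
            s.sendto(bytes(packet), (server, 123))
            data, _ = s.recvfrom(48)
        # Transmit Timestamp: 32-bit seconds + 32-bit fraction at offset 40
        secs, frac = struct.unpack("!II", data[40:48])
        return secs - NTP_TO_UNIX + frac / 2**32

    print(time.ctime(sntp_time()))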
Probabilistic Obfuscation through Covert Channels
Jon Stephens (The University of Arizona), Babak Yadegari (The University of Arizona), Christian Collberg (The University of Arizona), Saumya Debray (The University of Arizona), Carlos Scheidegger (The University of Arizona)
This paper presents a program obfuscation framework that uses covert channels through the program's execution environment to obfuscate information flow through the program. Unlike prior work on obfuscation, the use of covert channels removes visible information flows from the computation of the program and reroutes them through the program's runtime system and/or the operating system. This renders these information flows, and the corresponding control and data dependencies, invisible to program analysis tools such as symbolic execution engines. Additionally, we introduce probabilistic obfuscation, which uses imperfect covert channels to leak information with probabilistic guarantees. An experimental evaluation of our approach against state-of-the-art detection and analysis techniques shows that these engines are not well equipped to handle such obfuscations, particularly those of the probabilistic variety.
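A bare-bones illustration of the rerouting idea, using the file system as a crude, noiseless covert channel; the paper's channels and its probabilistic variants are considerably more sophisticated, and the path and password below are hypothetical.

    import os, tempfile

    CHANNEL = os.path.join(tempfile.gettempdir(), "covert_bit")  # hypothetical

    def send_bit(bit):
        # Encode one bit as the existence of a file: the state lives in
        # the OS, not in any program variable an analyzer can taint-track.
        if bit:
            open(CHANNEL, "w").close()
        elif os.path.exists(CHANNEL):
            os.remove(CHANNEL)

    def recv_bit():
        return os.path.exists(CHANNEL)

    def check_password(guess):
        # The dependency on `guess` flows through the file system, so the
        # branch below has no visible data dependency on the input.
        send_bit(guess == "s3cret")
        return "access granted" if recv_bit() else "access denied"

    print(check_password("s3cret"))
    print(check_password("wrong"))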
Have your PI and Eat it Too: Practical Security on a Low-cost Ubiquitous Computing Platform
Amit Vasudevan (CyLab / CMU), Sagar Chaki (SEI / CMU)
Robust security on a commodity, low-cost, and popular computing platform is a worthy goal for today's Internet of Things (IoT) and embedded ecosystems. We present the first practical security architecture on the Raspberry PI (PI), a ubiquitous and popular low-cost compute module. Our architecture and framework, called UBERPI, focuses on three goals that are key to achieving practical security: commodity compatibility (e.g., running unmodified Raspbian/Debian Linux) with unfettered access to platform hardware, performance (2%–6% average overhead), and a low trusted computing base and complexity (modular, 5544 SLoC). We present a full implementation followed by a comprehensive evaluation and lessons learned. We believe that our contributions and findings elevate the PI into a next-generation, secure, low-cost IoT and embedded computing platform.
SoK: Security and Privacy in Machine Learning
Nicolas Papernot (Penn State), Patrick McDaniel (Penn State), Arunesh Sinha (University of Michigan), Michael P. Wellman (University of Michigan)
Advances in machine learning (ML) in recent years have enabled a dizzying array of applications such as data analytics, autonomous systems, and security diagnostics. ML is now pervasive: new systems and models are being deployed in every domain imaginable, leading to widespread deployment of software-based inference and decision making. There is growing recognition that ML exposes new vulnerabilities in software systems, yet the technical community's understanding of the nature and extent of these vulnerabilities remains limited. We systematize findings on ML security and privacy, focusing on attacks identified on these systems and the defenses crafted to date. We articulate a comprehensive threat model for ML and categorize attacks and defenses within an adversarial framework. We identify key insights from work in both the ML and security communities, and relate the effectiveness of approaches to structural elements of ML algorithms and the data used to train them. In particular, it is apparent that constructing a theoretical understanding of the sensitivity of modern ML algorithms to the data they analyze, à la PAC theory, will foster a science of security and privacy in ML.
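As a concrete instance of one attack class such a threat model covers, consider a test-time evasion attack on a fixed linear classifier; the model, weights, and numbers below are invented for illustration and are not from the paper.

    # Fixed logistic-regression weights (hypothetical, "pretrained")
    w, b = [2.0, -1.5, 0.5], 0.1

    def score(x):
        # a positive score means the input is classified "malicious"
        return sum(wi * xi for wi, xi in zip(w, x)) + b

    def evade(x, eps=0.3):
        # gradient-sign step for a linear model: nudge each feature
        # against the sign of its weight, staying eps-close to the input
        return [xi - eps * (1 if wi > 0 else -1) for wi, xi in zip(w, x)]

    x = [0.5, 0.6, 0.2]
    print(score(x))         # about 0.3: classified malicious
    print(score(evade(x)))  # about -0.9: the perturbation flips the decision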
Just In Time Hashing
Benjamin Harsha (Purdue University), Jeremiah Blocki (Purdue University)
In the past few years, billions of user passwords have been exposed to the threat of offline cracking attempts. Such brute-force cracking attempts are increasingly dangerous as password-cracking hardware continues to improve and as users continue to select low-entropy passwords. Key-stretching techniques such as hash iteration and memory-hard functions can help to mitigate the risk, but increased key-stretching effort necessarily increases authentication delay, so this defense is fundamentally constrained by usability concerns. We introduce Just-in-Time Hashing (JIT), a client-side key-stretching algorithm that protects user passwords against offline brute-force cracking attempts without increasing delay for the user. The basic idea is to exploit the idle time while the user is typing in their password to perform extra key-stretching. As soon as the user types in the first character(s) of their password, our algorithm immediately begins filling memory with hash values derived from the character(s) typed thus far. We conduct a user study to guide the development of JIT, e.g., by determining how much extra key-stretching can be performed during idle cycles and how many consecutive deletions JIT may need to handle. Our security analysis demonstrates that JIT can substantially increase guessing costs over traditional key-stretching algorithms with equivalent (or lower) authentication delay. Specifically, an empirical evaluation using existing password datasets demonstrates that JIT increases guessing costs by nearly an order of magnitude in comparison to standard key-stretching techniques with comparable delay. We provide a proof-of-concept implementation of Just-in-Time Hashing by modifying Argon2.
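A minimal sketch of the just-in-time idea: begin stretching as soon as the first characters arrive, so idle typing time performs work that a conventional scheme would do only after the final keystroke. Real JIT fills memory with Argon2 state; this toy uses iterated SHA-256 per prefix purely to show the structure, and the work budget is an assumption kept small for the demo.

    import hashlib

    ROUNDS_PER_PREFIX = 50_000  # assumed per-keystroke work budget

    def stretch(seed, rounds):
        # stand-in for memory-hard stretching (real JIT uses Argon2)
        h = seed
        for _ in range(rounds):
            h = hashlib.sha256(h).digest()
        return h

    def jit_hash(password, salt):
        # Fold in each keystroke, then spend the idle time before the
        # next keystroke on extra stretching of the current prefix state.
        state = salt
        for ch in password:
            state = stretch(hashlib.sha256(state + ch.encode()).digest(),
                            ROUNDS_PER_PREFIX)
        return state

    print(jit_hash("hunter2", b"per-user-salt").hex())

Because the final hash chains through every prefix, an offline attacker must replay the per-prefix stretching for each guess, sharing work only across common prefixes; that chained cost is the source of the increased guessing effort the abstract quantifies.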
Forgotten Siblings: Unifying Attacks on Machine Learning and Digital Watermarking
Erwin Quiring (TU Braunschweig), Daniel Arp (TU Braunschweig), Konrad Rieck (TU Braunschweig)
Machine learning is increasingly used in security-critical applications, such as autonomous driving, face recognition, and malware detection. Most learning methods, however, have not been designed with security in mind and are thus vulnerable to different types of attacks. This problem has motivated the research field of adversarial machine learning, which is concerned with attacking and defending learning methods. Concurrently, a separate line of research has tackled a very similar problem: in digital watermarking, a pattern is embedded in a signal in the presence of an adversary. As a consequence, this research field has also extensively studied techniques for attacking and defending watermarking methods. The two research communities have worked in parallel so far, unknowingly developing similar attack and defense strategies. This paper is a first effort to bring these communities together. To this end, we present a unified notation for black-box attacks against machine learning and watermarking. To demonstrate its efficacy, we apply concepts from watermarking to machine learning and vice versa. We show that countermeasures from watermarking can mitigate recent model-extraction attacks and, similarly, that techniques for hardening machine learning can fend off oracle attacks against watermarks. We further demonstrate a novel threat to watermarking schemes based on recent deep-learning attacks from adversarial learning. Our work provides a conceptual link between the two research fields and thereby opens novel directions for improving the security of both machine learning and digital watermarking.
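The black-box "oracle" setting that unifies the two fields can be made concrete with a toy model-extraction attack: an adversary who can only query a deployed linear model's confidence score recovers its parameters with d + 1 probes. The hidden model below is hypothetical, and the attack is deliberately bare-bones rather than one of the paper's constructions.

    def oracle(x):
        # hidden deployed model: score = w.x + b (unknown to the attacker)
        w, b = [1.5, -2.0], 0.25
        return w[0] * x[0] + w[1] * x[1] + b

    # Probe the origin and the unit vectors: b = f(0), w_i = f(e_i) - b
    b_hat = oracle([0, 0])
    w_hat = [oracle([1, 0]) - b_hat, oracle([0, 1]) - b_hat]
    print(w_hat, b_hat)  # recovers [1.5, -2.0] and 0.25 with three queries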
COVERN: A Logic for Compositional Verification of Information Flow Control
Toby Murray (University of Melbourne and Data61), Robert Sison (CSE, UNSW and Data61), Kai Engelhardt (CSE, UNSW and Data61)
Shared-memory concurrency is pervasive in modern programming, including in systems that must protect highly sensitive data. Recently, verification has emerged as a practical tool for proving interesting security properties of real programs, particularly information flow control (IFC) security. Yet there remain no general logics for verifying the IFC security of shared-memory concurrent programs. In this paper, we present the first such logic, COVERN (Compositional Verification of Noninterference), and its proof of soundness via a new generic framework for general rely-guarantee IFC reasoning. We apply COVERN to model and verify the security-critical software functionality of the Cross Domain Desktop Compositor, an embedded device that facilitates simultaneous and intuitive user interaction with multiple classified networks while preventing leakage between them. To our knowledge, this is the first foundational, machine-checked proof of IFC security for a non-trivial shared-memory concurrent program in the literature.
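To see the kind of program such a logic targets, consider a toy shared-memory leak whose observability depends on thread interleaving; verifying either thread in isolation requires assumptions (a rely) about the other's writes. The example is illustrative and not drawn from the paper.

    import threading

    shared = 0   # memory both threads may touch
    secret = 42  # classified input of thread A

    def thread_a():
        global shared
        shared = secret  # writes the secret into shared memory...
        shared = 0       # ...then scrubs it, assuming B does not read in between

    def thread_b(log):
        log.append(shared)  # publicly observable: may capture the secret

    log = []
    a = threading.Thread(target=thread_a)
    b = threading.Thread(target=thread_b, args=(log,))
    a.start(); b.start(); a.join(); b.join()
    print(log)  # [42] or [0] depending on scheduling: an interleaving-dependent leak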
DeepRefiner: Multi-layer Android Malware Detection System Applying Deep Neural Networks
Ke Xu (Singapore Management University), Yingjiu Li (Singapore Management University), Robert H. Deng (Singapore Management University), Kai Chen (Chinese Academy of Sciences)
As malicious behaviors vary significantly across mobile malware, it is challenging to detect malware both efficiently and effectively. Moreover, because malicious behaviors evolve continuously, it is difficult to keep up with malware through laborious manual feature engineering. To address these challenges, we propose DeepRefiner, which identifies malware both efficiently and effectively. The novel technique enabling effectiveness is semantic-based deep learning: we apply Long Short-Term Memory (LSTM) networks to the semantic structure of Android bytecode, avoiding the loss of method-level bytecode semantics. To achieve efficiency, we apply a Multilayer Perceptron to XML files, based on the finding that most malware can be identified efficiently using information from XML files alone. We evaluate the detection performance of DeepRefiner with 62,915 malicious applications and 47,525 benign applications, showing that DeepRefiner effectively detects malware with an accuracy of 97.74% and a false positive rate of 2.54%. We compare DeepRefiner with a state-of-the-art single-classifier detection system, StormDroid, and ten widely used signature-based anti-virus scanners. The experimental results show that DeepRefiner significantly outperforms StormDroid and the anti-virus scanners. In addition, we evaluate the robustness of DeepRefiner against typical obfuscation techniques and adversarial samples. The experimental results demonstrate that DeepRefiner is robust in detecting obfuscated malicious applications.
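The multi-layer triage the abstract describes can be sketched structurally: a cheap first stage over XML features decides the clear cases, and only the uncertain band pays for the expensive bytecode-sequence model. Both scoring functions and all thresholds below are stand-ins, not DeepRefiner's trained MLP and LSTM.

    def stage1_xml_score(app):
        # stand-in for the MLP over XML/manifest features; returns P(malware)
        return app["xml_score"]

    def stage2_bytecode_score(app):
        # stand-in for the LSTM over method-level bytecode sequences
        return app["bytecode_score"]

    def classify(app, low=0.2, high=0.8):
        p = stage1_xml_score(app)
        if p >= high:
            return "malware"  # confidently decided by the cheap stage
        if p <= low:
            return "benign"
        # only the uncertain middle band invokes the deep sequence model
        return "malware" if stage2_bytecode_score(app) >= 0.5 else "benign"

    print(classify({"xml_score": 0.95, "bytecode_score": 0.1}))  # malware, stage 1
    print(classify({"xml_score": 0.5,  "bytecode_score": 0.9}))  # malware, stage 2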
In search of CurveSwap: Measuring elliptic curve implementations in the wild
Luke Valenta (University of Pennsylvania), Nick Sullivan (Cloudflare), Antonio Sanso (Adobe), Nadia Heninger (University of Pennsylvania)
We survey elliptic curve implementations from several vantage points. We perform internet-wide scans for TLS on a large number of ports, as well as for SSH and IPsec, to measure elliptic curve support and implementation behaviors, and we collect passive measurements of client curve support for TLS. We also perform active measurements to estimate server vulnerability to known attacks against elliptic curve implementations, including support for weak curves, invalid curve attacks, and curve twist attacks. We estimate that 1.53% of HTTPS hosts, 0.04% of SSH hosts, and 4.04% of IKEv2 hosts that support elliptic curves do not perform curve validity checks as specified in elliptic curve standards. We describe how such vulnerabilities could be used to construct an elliptic curve parameter downgrade attack against TLS, which we call CurveSwap, and observe that none of the combinations of weak behaviors we examined appear to enable a feasible CurveSwap attack in the wild. We also analyze source code of elliptic curve implementations, finding that a number of libraries fail to perform point validation for JSON Web Encryption, and uncover a flaw in the Java and NSS multiplication algorithms.
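The curve validity check in question is simple to state: before using a peer's public key, confirm the point actually satisfies the curve equation. A minimal sketch for NIST P-256 (standard constants) follows; full validation additionally checks the point's order, which this toy omits.

    # NIST P-256 parameters: y^2 = x^3 + a*x + b over GF(p)
    p = 0xffffffff00000001000000000000000000000000ffffffffffffffffffffffff
    a = p - 3
    b = 0x5ac635d8aa3a93e7b3ebbd55769886bc651d06b0cc53b0f63bce3c3e27d2604b

    def on_curve(x, y):
        # reject out-of-range coordinates and points off the curve,
        # which blocks invalid-curve and twist attacks on ECDH
        if not (0 <= x < p and 0 <= y < p):
            return False
        return (y * y - (x * x * x + a * x + b)) % p == 0

    # The standard base point G must pass; a tampered point must fail.
    Gx = 0x6b17d1f2e12c4247f8bce6e563a440f277037d812deb33a0f4a13945d898c296
    Gy = 0x4fe342e2fe1a7f9b8ee7eb4a7c0f9e162bce33576b315ececbb6406837bf51f5
    print(on_curve(Gx, Gy))      # True
    print(on_curve(Gx, Gy + 1))  # False: reject instead of running ECDH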
Mining ABAC Rules from Sparse Logs
Carlos Cotrini (ETH Zürich), Thilo Weghorn (ETH Zürich), David Basin (ETH Zürich)
Different methods have been proposed to mine attribute-based access control (ABAC) rules from logs. In practice, these logs are sparse in that they contain only a fraction of all possible requests. For sparse logs, however, existing methods mine and validate overly permissive rules, enabling privilege abuse. We define a novel measure, reliability, which quantifies how overly permissive a rule is, and we show why other standard measures, such as confidence and entropy, fail to quantify over-permissiveness. Building upon state-of-the-art subgroup discovery algorithms and our new reliability measure, we design Rhapsody, the first ABAC mining algorithm with correctness guarantees: Rhapsody mines a rule if and only if the rule covers a significant number of requests, its reliability is above a given threshold, and there is no equivalent shorter rule. We evaluate Rhapsody on different real-world scenarios using logs from Amazon and a computer lab at ETH Zurich. Our results show that Rhapsody generalizes better and produces substantially smaller rules than competing approaches.
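Why confidence fails on sparse logs can be seen in a few lines: it is computed only over logged requests, so a rule that additionally covers many never-logged requests can still score a perfect 1.0. The toy log below is invented, and the snippet does not reproduce the paper's reliability formula.

    # log: (attributes, decision) pairs; a sparse sample of all requests
    log = [
        ({"role": "dev", "action": "read"}, True),
        ({"role": "dev", "action": "read"}, True),
        ({"role": "hr",  "action": "read"}, False),
    ]

    def covers(rule, req):
        return all(req.get(k) == v for k, v in rule.items())

    def confidence(rule):
        covered = [d for req, d in log if covers(rule, req)]
        return sum(covered) / len(covered) if covered else 0.0

    narrow = {"role": "dev", "action": "read"}
    broad  = {"role": "dev"}  # also permits write/delete/... never seen in the log
    # both print 1.0: confidence cannot distinguish the tight rule from
    # the overly permissive one, which is the gap reliability addresses
    print(confidence(narrow), confidence(broad))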