Hi-Fi: collecting high-fidelity whole-system provenance

@inproceedings{Pohly2012HiFiCH,
  title={Hi-Fi: collecting high-fidelity whole-system provenance},
  author={Devin J. Pohly and Stephen E. McLaughlin and Patrick McDaniel and Kevin R. B. Butler},
  booktitle={Annual Computer Security Applications Conference},
  year={2012}
}
Data provenance---a record of the origin and evolution of data in a system---is a useful tool for forensic analysis. However, existing provenance collection mechanisms fail to achieve sufficient breadth or fidelity to provide a holistic view of a system's operation over time. We present Hi-Fi, a kernel-level provenance system which leverages the Linux Security Modules framework to collect high-fidelity whole-system provenance. We demonstrate that Hi-Fi is able to record a variety of malicious… 

Practical whole-system provenance capture

This paper presents CamFlow, a whole-system provenance capture mechanism that integrates easily into a PaaS offering, and illustrates the usability of the implementation by describing three applications: demonstrating compliance with data regulations; performing fault/intrusion detection; and implementing data loss prevention.

Trustworthy Whole-System Provenance for the Linux Kernel

This paper presents Linux Provenance Modules (LPM), the first general framework for the development of provenance-aware systems and the first step towards widespread deployment of trustworthy provenance-aware applications.

High-throughput ingest of data provenance records into Accumulo

This paper investigates the use of D4M and Accumulo to support high-throughput data ingest of whole-system provenance data and finds that it is able to ingest 3,970 graph components per second.

A Comprehensive Survey on the State-of-the-art Data Provenance Approaches for Security Enforcement

A comparative study of state-of-the-art approaches to provenance that classifies them by framework, deployed technique, and subject of interest, and discusses the emergence and scope of data provenance in IoT networks.

Expressiveness Benchmarking for System-Level Provenance

This paper proposes an expressiveness benchmark consisting of tests intended to capture the provenance of individual system calls, and presents work in progress on the benchmark examples for Linux and how they are handled by two different provenance tools, SPADE and OPUS.

Linux Provenance Modules: Secure Provenance Collection for the Linux Kernel

This paper presents Linux Provenance Modules (LPM), the first general framework for the development of provenance-aware systems. LPM imposes as little as 0.6% performance overhead on system operation, and it introduces a mechanism for policy-reduced provenance that reduces the costs of provenance collection by up to 74% by identifying a system’s trusted computing base.

Taming the Costs of Trustworthy Provenance through Policy Reduction

This work proposes a policy-based approach to provenance filtering, leveraging the confinement properties provided by Mandatory Access Control systems in order to identify and isolate subdomains of system activity for which to collect provenance.

Preprint: Runtime Analysis of Whole-System Provenance

CamQuery is a Linux Security Module that offers support for both userspace and in-kernel execution of analysis applications, and provides inline, realtime provenance analysis, making it suitable for implementing security applications.

Runtime Analysis of Whole-System Provenance

This work presents CamQuery, which provides inline, realtime provenance analysis, making it suitable for implementing security applications, and demonstrates the applicability of CamQuery to a variety of runtime security applications including data loss prevention, intrusion detection, and regulatory compliance.

References

Layering in Provenance-Aware Storage Systems

This work presents an architecture for provenance collection that facilitates the integration of provenance across multiple layers of abstraction and across network boundaries, and shows how the need to support provenance collection at multiple layers drives the architecture.

Panda: A System for Provenance and Data

Panda (for Provenance and Data) is a new project whose goal is to develop a general-purpose system that unifies concepts from existing provenance systems and overcomes some limitations in them.

Story Book: An Efficient Extensible Provenance Framework

This work designs a simple, flexible and easily optimized provenance file system that treats provenance events as a generic event log, and shows that coupling the design with existing storage optimizations provides higher throughput than existing systems.

Tracking Emigrant Data via Transient Provenance

This work proposes a technique that extends data provenance to help determine potential sources of information leaks, presents a solution for tracking emigrant data, and explains the minor changes that current provenance-aware storage systems require to support it.

Transparently Gathering Provenance with Provenance Aware Condor

Provenance Aware Condor is a system that transparently gathers provenance while jobs run in Condor and is able to answer a wide range of questions about the files used by a job and the machines that execute jobs.

The Open Provenance Model core specification (v1.1)

Forensix: a robust, high-performance reconstruction system

This work argues that computing systems should, in fact, be built with automated analysis and recovery as a primary goal, and describes the design, implementation, and evaluation of Forensix: a robust, high-precision reconstruction and analysis system for supporting the computer equivalent of "TiVo".

Issues in Automatic Provenance Collection

This work discusses the challenges the team encountered and the issues they exposed while developing an automatic provenance collector that runs at the operating system level.

Selective Versioning in a Secure Disk System

This work designs and implements a secure disk system, SVSDS, that performs selective, flexible, and transparent versioning of stored data at the disk level, and demonstrates that the space and performance overheads associated with selective versioning at the disk level are minimal.

Runtime verification of authorization hook placement for the Linux Security Modules framework

This work describes an approach to runtime verification, the design of the tools that implement it, and the anomalous situations found in an LSM-patched Linux 2.4.16 kernel.