The Web Never Forgets: Persistent Tracking Mechanisms in the Wild

@inproceedings{Acar2014TheWN,
  title={The Web Never Forgets: Persistent Tracking Mechanisms in the Wild},
  author={Gunes Acar and Christian Eubank and Steven Englehardt and Marc Ju{\'a}rez and Arvind Narayanan and Claudia D{\'i}az},
  booktitle={Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security},
  year={2014}
}
We present the first large-scale studies of three advanced web tracking mechanisms - canvas fingerprinting, evercookies and use of "cookie syncing" in conjunction with evercookies. Canvas fingerprinting, a recently developed form of browser fingerprinting, has not previously been reported in the wild; our results show that over 5% of the top 100,000 websites employ it. We then present the first automated study of evercookies and respawning and the discovery of a new evercookie vector, IndexedDB… 
Automated discovery of privacy violations on the web
TLDR
A critical look at how the API design process can be changed to prevent such misuse in the future is taken, and novel detection methods and results for persistent tracking techniques, including: device fingerprinting, cookie syncing, and cookie respawning are presented.
Persistent Tracking in Modern Browsers
TLDR
This paper introduces a novel tracking mechanism that misuses a simple yet ubiquitous browser feature: favicons, and finds that combining this favicon-based tracking technique with immutable browser-fingerprinting attributes that do not change over time allows a website to reconstruct a 32-bit tracking identifier in 2 seconds.
Cookies That Give You Away: The Surveillance Implications of Web Tracking
TLDR
It is shown that foreign users are highly vulnerable to the NSA's dragnet surveillance due to the concentration of third-party trackers in the U.S. Using measurement units in various locations, this work introduces a methodology that combines web measurement and network measurement.
PriVaricator: Deceiving Fingerprinters with Little White Lies
TLDR
In PriVaricator the power of randomization is used to "break" linkability by exploring a space of parameterized randomization policies, and renders all the fingerprinters tested ineffective, while causing minimal damage on a set of 1000 Alexa sites on which they were tested.
XHOUND: Quantifying the Fingerprintability of Browser Extensions
TLDR
It is shown that an extension's organic activity in a page's DOM can be used to infer its presence, and XHound, the first fully automated system for fingerprinting browser extensions, is developed.
Beyond Cookie Monster Amnesia: Real World Persistent Online Tracking
TLDR
The 10,000 most popular websites are crawled to give insights into the number of websites that are using the technique, which websites are collecting fingerprinting information, and exactly what information is being retrieved.
Online Tracking: A 1-million-site Measurement and Analysis
TLDR
The largest and most detailed measurement of online tracking conducted to date, based on a crawl of the top 1 million websites, is presented, which demonstrates the OpenWPM platform's strength in enabling researchers to rapidly detect, quantify, and characterize emerging online tracking behaviors.
I Know What You Did Last Summer: New Persistent Tracking Mechanisms in the Wild
TLDR
This paper is the first to suggest Web tracking as the main use case of Web Storage, Web SQL Database, and Indexed Database API, and suggests that Web Storage is the most used among the three technologies.
Hiding in the Crowd: an Analysis of the Effectiveness of Browser Fingerprinting at Large Scale
TLDR
The key insight is that the percentage of unique fingerprints in the dataset is much lower than what was reported in the past: only 33.6% of fingerprints are unique, as opposed to over 80% in previous studies.
EssentialFP: Exposing the Essence of Browser Fingerprinting
TLDR
This paper argues that the pattern of gathering information from a wide browser API surface (multiple browser-specific sources) and communicating the information to the network (network sink) captures the essence of fingerprinting, and demonstrates that information flow tracking is an excellent fit for exposing this pattern.

References

SHOWING 1-10 OF 75 REFERENCES
Tracking the Trackers: Fast and Scalable Dynamic Analysis of Web Content for Privacy Violations
TLDR
A novel technique called principal-based tainting is developed that allows us to perform dynamic analysis of JavaScript execution with lowered performance overhead, and shows that privacy attacks are more prevalent and serious than previously known.
Shining the Floodlights on Mobile Web Tracking — A Privacy Survey
TLDR
The first published large-scale study of mobile web tracking is presented, comparing tracking across five physical and emulated mobile devices with one desktop device as a benchmark.
PriVaricator: Deceiving Fingerprinters with Little White Lies
TLDR
In PriVaricator the power of randomization is used to "break" linkability by exploring a space of parameterized randomization policies, and renders all the fingerprinters tested ineffective, while causing minimal damage on a set of 1000 Alexa sites on which they were tested.
Cookieless Monster: Exploring the Ecosystem of Web-Based Device Fingerprinting
TLDR
By analyzing the code of three popular browser-fingerprinting code providers, this work reveals the techniques that allow websites to track users without the need of client-side identifiers and shows how fragile the browser ecosystem is against fingerprinting through the use of novel browser-identifying techniques.
Detecting and Defending Against Third-Party Tracking on the Web
TLDR
This work develops a client-side method for detecting and classifying five kinds of third-party trackers based on how they manipulate browser state, and finds that no existing browser mechanisms prevent tracking by social media sites via widgets while still allowing those widgets to achieve their utility goals, which leads to a new defense.
FPDetective: dusting the web for fingerprinters
TLDR
The design, implementation and deployment of FPDetective, a framework for the detection and analysis of web-based fingerprinters, are reported on, showing that there needs to be a change in the way users, companies and legislators engage with fingerprinting.
Fast and Reliable Browser Identification with JavaScript Engine Fingerprinting
TLDR
A new method for identifying web browsers based on the underlying JavaScript engine is proposed; it can be executed on the client side within a fraction of a second, is three orders of magnitude faster than previous work on JavaScript engine fingerprinting, and can be implemented with well below a few hundred lines of code.
How Unique Is Your Web Browser?
  • P. Eckersley
  • Computer Science
    Privacy Enhancing Technologies
  • 2010
TLDR
The degree to which modern web browsers are subject to "device fingerprinting" via the version and configuration information that they will transmit to websites upon request is investigated, and what countermeasures may be appropriate to prevent it is discussed.
Fingerprinting Information in JavaScript Implementations
TLDR
This paper identifies two new avenues for browser fingerprinting, one of which subverts the whitelist mechanism of the popular NoScript Firefox extension, which selectively enables web pages’ scripting privileges to increase privacy by allowing a site to determine if particular domains exist in a user's NoScript whitelist.
SHPF: Enhancing HTTP(S) Session Security with Browser Fingerprinting
TLDR
This paper identifies HTML5 and CSS features that can be used for browser fingerprinting and to identify or verify a browser without the need to rely on the User Agent string.