A Survey on Web Tracking: Mechanisms, Implications, and Defenses

@article{Bujlow2017ASO,
  title={A Survey on Web Tracking: Mechanisms, Implications, and Defenses},
  author={Tomasz Bujlow and Valent{\'i}n Carela-Espa{\~n}ol and Josep Sol{\'e}-Pareta and Pere Barlet-Ros},
  journal={Proceedings of the IEEE},
  year={2017},
  volume={105},
  pages={1476--1510}
}
Privacy seems to be the Achilles’ heel of today’s web. Most web services make continuous efforts to track their users and to obtain as much personal information as they can from the things they search, the sites they visit, the people they contact, and the products they buy. This information is mostly used for commercial purposes, which go far beyond targeted advertising. Although many users are already aware of the privacy risks involved in the use of internet services, the particular methods… 
I Know What You Did Last Summer: New Persistent Tracking Mechanisms in the Wild
TLDR
This paper is the first to suggest Web tracking as the main use case of Web Storage, Web SQL Database, and Indexed Database API, and suggests that Web Storage is the most used among the three technologies.
A comparison of web privacy protection techniques
Anonymity Online - Current Solutions and Challenges
TLDR
This paper discusses different tracking techniques as a challenge to online anonymity, presents current solutions at both the application level and the network level to provide anonymity, and points out avenues for future research in the field of online anonymity.
A QUIC Look at Web Tracking
TLDR
This work investigates the feasibility of user tracking via QUIC from the perspective of an online service and reveals that the protocol design contains violations of privacy best practices through which a tracker can passively and uniquely identify clients across several connections.
A Usability Evaluation of Privacy Add-ons for Web Browsers
TLDR
This work conducted usability evaluations using the System Usability Scale and the Think-Aloud Protocol on three popular privacy add-ons (DuckDuckGo Privacy Essentials, Ghostery, and Privacy Badger), and suggests that participants feel safer with, and more trusting of, their respective add-ons.
Automated discovery of privacy violations on the web
TLDR
A critical look at how the API design process can be changed to prevent such misuse in the future is taken, and novel detection methods and results for persistent tracking techniques, including: device fingerprinting, cookie syncing, and cookie respawning are presented.
Clickstream tracking of TOR users: may be easier than you think
TLDR
If used in its default settings, the TOR browser provides little if any protection against the four most common forms of user tracking; hence, to achieve true online anonymity, the TOR user needs to exercise extra effort and vigilance.
Network Measurements for Web Tracking Analysis and Detection: A Tutorial
TLDR
Digital society has developed to a point where it is nearly impossible for a user to know what is happening in the background when using the Internet, so it is necessary to perform network measurements not only at the network layer, but also at the application layer.
Towards accurate detection of obfuscated web tracking
TLDR
This paper proposes a new methodology based on dynamic code analysis that monitors the actual JavaScript calls made by the browser and compares them to the original source code of the website in order to detect obfuscated tracking.

References

SHOWING 1-10 OF 78 REFERENCES
User Privacy and the Evolution of Third-Party Tracking Mechanisms on the World Wide Web
TLDR
A computer security rubric is applied to the behavior and tracking methodologies of third parties in order to show their adversarial qualities in matters of user privacy.
Cookieless Monster: Exploring the Ecosystem of Web-Based Device Fingerprinting
TLDR
By analyzing the code of three popular browser-fingerprinting code providers, this work reveals the techniques that allow websites to track users without the need for client-side identifiers, and shows, through the use of novel browser-identifying techniques, how fragile the browser ecosystem is against fingerprinting.
Cookies That Give You Away: The Surveillance Implications of Web Tracking
TLDR
It is shown that foreign users are highly vulnerable to the NSA's dragnet surveillance due to the concentration of third-party trackers in the U.S. Using measurement units in various locations, this work introduces a methodology that combines web measurement and network measurement.
User Tracking on the Web via Cross-Browser Fingerprinting
TLDR
It is shown that a part of the IP address, the availability of a specific font set, the time zone, and the screen resolution are enough to uniquely identify most users of the five most popular web browsers, and that user agent strings are fairly effective but fragile identifiers of a browser instance.
Timing attacks on Web privacy
TLDR
This paper describes timing attacks that allow a malicious Web site to determine whether or not the user has recently visited some other, unrelated Web page by measuring the time the user’s browser requires to perform certain operations, along with a way of reengineering browsers to prevent most of these attacks.
Web Privacy Census
TLDR
It is found that users who merely visit the homepages of the top 100 most popular sites would collect over 6,000 HTTP cookies in the process, and Google's ability to track users on popular websites is unparalleled, and it approaches the level of surveillance that only an Internet Service Provider can achieve.
Detecting and Defending Against Third-Party Tracking on the Web
TLDR
This work develops a client-side method for detecting and classifying five kinds of third-party trackers based on how they manipulate browser state, and finds that no existing browser mechanisms prevent tracking by social media sites via widgets while still allowing those widgets to achieve their utility goals, which leads to a new defense.
FPDetective: dusting the web for fingerprinters
TLDR
The design, implementation and deployment of FPDetective, a framework for the detection and analysis of web-based fingerprinters, are reported on, showing that there needs to be a change in the way users, companies and legislators engage with fingerprinting.
TrackAdvisor: Taking Back Browsing Privacy from Third-Party Trackers
TLDR
TrackAdvisor is developed, arguably the first method that uses machine learning to identify, with very high accuracy (100% recall and 99.4% precision), the HTTP requests carrying sensitive information to third-party trackers, which would raise public awareness of the potential privacy risks.
The Web Never Forgets: Persistent Tracking Mechanisms in the Wild
TLDR
The evaluation of the defensive techniques used by privacy-aware users finds that there exist subtle pitfalls (such as failing to clear state on multiple browsers at once) in which a single lapse in judgement can shatter privacy defenses.