Rigging Research Results by Manipulating Top Websites Rankings
@article{Pochat2018RiggingRR,
  title={Rigging Research Results by Manipulating Top Websites Rankings},
  author={Victor Le Pochat and Tom van Goethem and Wouter Joosen},
  journal={ArXiv},
  year={2018},
  volume={abs/1806.01156}
}
Researchers often use rankings of popular websites when measuring security practices, evaluating defenses or analyzing ecosystems. However, little is known about the data collection and processing methodologies of these rankings. In this paper, we uncover how both inherent properties and vulnerabilities to adversarial manipulation of these rankings may affect the conclusions of security studies. To that end, we compare four main rankings used in recent studies in terms of their agreement with…
10 Citations
Clustering and the Weekend Effect: Recommendations for the Use of Top Domain Lists in Security Research
- Computer Science, PAM
- 2019
It is found that the weekend effect in Alexa and Umbrella causes these rankings to change their geographical diversity between the workweek and the weekend, and up to 91% of ranked domains appear in alphabetically sorted clusters containing up to 87k domains of presumably equivalent popularity.
A Long Way to the Top: Significance, Structure, and Stability of Internet Top Lists
- Physics, Internet Measurement Conference
- 2018
It is found that top lists generally overestimate results compared to the general population by a significant margin, often even an order of magnitude, and some top lists have surprising change characteristics, causing high day-to-day fluctuation and leading to result instability.
Tracking and Tricking a Profiler: Automated Measuring and Influencing of Bluekai's Interest Profiling
- Computer Science, WPES@CCS
- 2018
A system to analyze online profiling as a black box by simulating web browsing sessions based on links posted to Reddit shows that only a fraction of websites influence the interests assigned to a session's profile, that the profiles themselves are very noisy, and that identical browsing behavior results in different profiles.
We Value Your Privacy ... Now Take Some Cookies: Measuring the GDPR's Impact on Web Privacy
- Computer Science, NDSS
- 2019
It is concluded that the GDPR is making the web more transparent, but there is still a lack of both functional and usable mechanisms for users to consent to or deny processing of their personal data on the Internet.
Privacy Policies Across the Ages: Content and Readability of Privacy Policies 1996-2021
- Computer Science, ArXiv
- 2022
A large-scale longitudinal corpus of privacy policies from 1996 to 2021 is collected and analyzed to speculate why privacy policies are rarely read and propose changes that would make privacy policies serve their readers instead of their writers.
Exploring Malware Behavior of Webpages Using Machine Learning Technique: An Empirical Study
- Computer Science, Electronics
- 2020
To improve feature selection accuracy, a machine learning technique called bagging is employed using the Weka program, and a random tree is applied because it can handle the same types of data as bagging while being faster and more accurate than other classifiers.
Measuring Cookies and Web Privacy in a Post-GDPR World
- Computer Science, PAM
- 2019
In response, the European Union has adopted the General Data Protection Regulation (GDPR), a legislative framework for data protection empowering individuals to control their data. Since its adoption…
Measurement-based Experiments on the Mobile Web: A Systematic Mapping Study
- Computer Science, EASE
- 2021
This study benefits researchers and practitioners by presenting common techniques, empirical practices, and tools to properly conduct measurement-based experiments on the mobile Web.
Innocent Until Proven Guilty (IUPG): Building Deep Learning Models with Embedded Robustness to Out-Of-Distribution Content
- Computer Science, 2021 IEEE Security and Privacy Workshops (SPW)
- 2021
This work proposes a novel learning framework called Innocent Until Proven Guilty which prototypes training data clusters or classes within the input space while uniquely leveraging noise and inherently random classes to discover noise-resistant, uniquely identifiable features of the modeled classes.
Defining the linkage specialist role in the HIV care cascade
- Medicine, Journal of HIV/AIDS & Social Services
- 2019
The most frequently cited duties, knowledge, skills and abilities required of linkage specialists in employment advertisements and described in peer-reviewed literature are identified.