A machine learning solution to assess privacy policy completeness: (short paper)

@inproceedings{Costante2012AML,
  title={A machine learning solution to assess privacy policy completeness: (short paper)},
  author={Elisa Costante and Yuanhao Sun and Milan Petkovic and Jerry den Hartog},
  booktitle={WPES '12},
  year={2012}
}
A privacy policy is a legal document, used by websites to communicate how the personal data that they collect will be managed. By accepting it, the user agrees to release his data under the conditions stated by the policy. Privacy policies should provide enough information to enable users to make informed decisions. Privacy regulations support this by specifying what kind of information has to be provided. As privacy policies can be long and difficult to understand, users tend not to read them… 

A Machine-Learning Based Approach for Measuring the Completeness of Online Privacy Policies
TLDR
An automated approach for assisting users to evaluate online privacy policies based on completeness, which employs a machine-learning based approach to predict a completeness score for the privacy policy that can be used by the user to assess the risk to their privacy.
What websites know about you : privacy policy analysis using information extraction
TLDR
This paper proposes a solution which automatically analyzes privacy policy text and shows what personal information is collected, based on the use of Information Extraction techniques and represents a step towards the more ambitious aim of automated grading of privacy policies.
I Read but Don't Agree: Privacy Policy Benchmarking using Machine Learning and the EU GDPR
TLDR
A machine learning based approach to summarize the rather long privacy policy into short and condensed notes following a risk-based approach and using the European Union (EU) General Data Protection Regulation (GDPR) aspects as assessment criteria is proposed.
Automatic Extraction of Opt-Out Choices from Privacy Policies
TLDR
This paper describes machine learning approaches for extracting instances containing opt-out hyperlinks and evaluates the proposed methods using the OPP-115 Corpus, a dataset of annotated privacy policies.
Establishing a Strong Baseline for Privacy Policy Classification
TLDR
This paper presents three different models that are able to assign pre-defined categories to privacy policy paragraphs, using supervised machine learning, and shows that this approach outperforms state-of-the-art by 5% over comparable and previously-reported F1 values.
The Creation and Analysis of a Website Privacy Policy Corpus
TLDR
A corpus of 115 privacy policies with manual annotations for 23K fine-grained data practices is introduced and the process of using skilled annotators and a purpose-built annotation tool to produce the data is described.
Unifying Privacy Policy Detection
TLDR
A toolchain to process website privacy policies and prepare them for research purposes is developed, using natural language processing and machine learning to automatically determine whether given texts are privacy or cookie policies.
AMARYLLIS: A User-Centric Information System for Automated Privacy Policy Analysis
TLDR
The architecture of AMARYLLIS (AutoMAted pRivacY poLicy anaLysIS), a user-centric information system, as well as its use cases, are presented, applying a Design Science Research methodology.
Challenges in Classifying Privacy Policies by Machine Learning with Word-based Features
TLDR
This paper classifies sentences in privacy policies with category labels, using popular machine learning algorithms, such as a naive Bayes classifier, and adopts words as the features of those algorithms.
