Vassilios S. Verykios

Often, in the real world, entities have two or more representations in databases. Duplicate records do not share a common key and/or they contain errors that make duplicate matching a difficult task. Errors are introduced as the result of transcription errors, incomplete information, lack of standard formats or any combination of these factors. In this(More)
Large repositories of data contain sensitive information that must be protected against unauthorized access. The protection of the confidentiality of this information has been a long-term goal for the database security research community and for the government statistical agencies. Recent advances in data mining and machine learning algorithms have(More)
Data mining technology has given us new capabilities to identify correlations in large data sets. This introduces risks when the data is to be made public, but the correlations are private. We introduce a method for selectively removing individual values from a database to prevent the discovery of a set of rules, while preserving the data for other(More)
Large repositories of data contain sensitive information which must be protected against unauthorized access. The protection of the confidentiality of this information has been a long-term goal for the database security research community and the government statistical agencies. Recent advances in data mining and machine learning algorithms have increased(More)
Data cleaning is a vital process that ensures the quality of data stored in real-world databases. Data cleaning problems are frequently encountered in many research areas, such as knowledge discovery in databases, data warehousing, system integration and e-services. The process of identifying the record pairs that represent the same entity (duplicate(More)
We provide here an overview of the new and rapidly emerging research area of privacy preserving data mining. We also propose a classification hierarchy that sets the basis for analyzing the work which has been performed in this context. A detailed review of the work accomplished in this area is also given, along with the coordinates of each work to the(More)
The current trend in the application space towards systems of loosely coupled and dynamically bound components that enable just-in-time integration jeopardizes the security of information that is shared between the broker, the requester, and the provider at runtime. In particular, new advances in data mining and knowledge discovery that allow for the(More)
The rapid growth of transactional data soon drew attention to the need for its further exploitation. In this paper, we investigate the problem of securing sensitive knowledge from being exposed in patterns extracted during association rule mining. Instead of hiding the produced rules directly, we decide to hide the sensitive frequent itemsets(More)