Extracting information from semistructured documents is a very hard task, and is going to become more and more critical as the amount of digital information available on the Internet grows. Indeed, documents are often so large that the data set returned as answer to a query may be too big to convey interpretable knowledge. In this paper, we describe an…
Supporting aggregates in recursive logic rules represents a very important problem for Datalog. To solve this problem, we propose a simple extension, called Datalog$^{FS}$ (Datalog extended with frequency support goals), that supports queries and reasoning about the number of distinct variable assignments satisfying given goals, or conjunctions of…
The advent of the Big Data challenge has stimulated research on methods to deal with the problem of managing data abundance. Many approaches have been developed, but for the most part, they attack one specific side of the problem: e.g. efficient querying, analysis techniques that summarize data or reduce its dimensionality, data visualization, etc.
The increasing amount of very large XML datasets available to casual users is a most challenging problem for our community, and calls for an appropriate support to efficiently gather knowledge from these data. Data mining, already widely applied to extract frequent correlations of values from both structured and semi-structured datasets, is the appropriate…
Frequent constraint violations on the data stored in a database may suggest that the semantics of the represented reality is changing. In this work we propose a methodology and a tool, based on data mining, to maintain the integrity constraints specified at design time, in order to adjust them to the evolutions of the modeled reality that may occur during…
The advent of the Big Data challenge has stimulated research on methods and techniques to deal with the problem of managing data abundance. Many approaches have been developed, but for the most part, they attack one specific side of the problem: e.g. efficient querying, analysis techniques that summarize data or reduce its dimensionality, data…