István Hegedüs

Machine learning over fully distributed data poses an important problem in peer-to-peer (P2P) applications. In this model, each network node holds one data record, and raw data cannot be moved due to privacy considerations. User profiles, ratings, history, or sensor readings are examples of such data. This problem is…
Fully distributed data mining algorithms build global models over large amounts of data distributed over a large number of peers in a network, without moving the data itself. In the area of peer-to-peer (P2P) networks, such algorithms have various applications in P2P social networking, and also in trackerless BitTorrent communities. The difficulty of the…
Offering personalized recommendation as a service in fully distributed applications such as file-sharing, distributed search, social networking, P2P television, etc., is an increasingly important problem. In such networked environments, recommender algorithms should meet the same performance and reliability requirements as in centralized services. To achieve…
OBJECTIVE: In this study, the authors describe the system submitted by the team of the University of Szeged to the second i2b2 Challenge in Natural Language Processing for Clinical Data. The challenge focused on the development of automatic systems that analyzed clinical discharge summary texts and addressed the following question: "Who's obese and what…
The multi-armed bandit problem has attracted remarkable attention in the machine learning community, and many efficient algorithms have been proposed to handle the so-called exploitation-exploration dilemma in various bandit setups. At the same time, significantly less effort has been devoted to adapting bandit algorithms to particular architectures, such…
In fully distributed networks, data mining is an important tool for monitoring, control, and offering personalized services to users. The underlying data model can change as a function of time according to periodic (daily, weekly) patterns, sudden changes, or long-term transformations of the environment or the system itself. For a large space of the…
Peer-to-peer file-sharing has been increasingly popular in the last decade. In most cases, file-sharing communities provide only minimal functionality, such as search and download. Extra features such as recommendation are difficult to implement because users are typically unwilling to provide sufficient rating information for the items they download. For…
We focus on the problem of data mining over large-scale fully distributed databases, where each node stores only one data record. We assume that a data record is never allowed to leave the node it is stored at. Possible motivations for this assumption include privacy or a lack of a centralized infrastructure. To tackle this problem, earlier we proposed the…
In this paper, we introduce the problem of free-text tagging of online news archives. From an application point of view, it has many benefits for online news portals; at the same time, the task has unique characteristics compared to existing approaches to free-text tagging. We describe our system, which was developed for the archive…
Applying sophisticated machine learning techniques to fully distributed data is increasingly important in many applications, such as distributed recommender systems or spam filters. In this type of networked environment, the data model can change dynamically over time (concept drift). Identifying when concept drift occurred is key for several drift handling…