Research in bioinformatics in the past decade has generated a large volume of textual biological data stored in databases such as MEDLINE. It takes a copious amount of effort and time, even for expert users, to manually extract useful information embedded in such a large volume of retrieved data, and automated intelligent text analysis tools are increasingly...
This paper proposes a method for identifying protein names in biomedical texts with an emphasis on detecting protein name boundaries. We use a probabilistic model which exploits several surface clues characterizing protein names and incorporates word classes for generalization. In contrast to previously proposed methods, our approach does not rely on...
It is crucial to study basic principles that support adaptive and scalable retrieval functions in large networked environments such as the Web, where information is distributed among dynamic systems. We conducted experiments on decentralized IR operations on various scales of information networks and analyzed effectiveness, efficiency, and scalability of...
In information-filtering environments, uncertainties associated with changing interests of the user and the dynamic document stream must be handled efficiently. In this article, a filtering model is proposed that decomposes the overall task into subsystem functionalities and highlights the need for multiple adaptation techniques to cope with uncertainties.
Several machine learning approaches have been proposed in the literature to automatically learn user interests for information filtering. However, many of them are ill-equipped to deal with changes in user interests that may occur due to changes in the user's personal and professional situations. If undetected over a long time, such changes may...
We proposed and implemented a novel clustering algorithm called LAIR2, which has constant average running time for on-the-fly Scatter/Gather browsing [4]. Our experiments showed that when running on a single processor, the LAIR2 on-line clustering algorithm was several hundred times faster than a parallel Buckshot algorithm running on multiple processors...
The goal of this research is to clarify the role of document classification in information filtering. An important function of classification, in managing computational complexity, is described and illustrated in the context of an existing filtering system. A parameter called classification homogeneity is presented for analyzing unsupervised automated...