Structural information (such as layout and look-and-feel) has been extensively used in the literature for extraction of interesting or relevant data, efficient storage, and query optimization. Traditionally, tree models (such as DOM trees) have been used to represent structural information, especially in the case of HTML and XML documents. However, …
This paper proposes an algorithm to hierarchically cluster documents. Each cluster is a cluster of documents together with an associated cluster of words, that is, a document-word co-cluster. Note that the vector model for documents creates the document-word matrix, of which every co-cluster is a submatrix. One would intuitively expect a submatrix made up of …
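To make the "co-cluster as submatrix" idea concrete, here is a minimal sketch. It does not implement the paper's clustering algorithm, which the abstract does not describe. The toy documents and the hand-picked clusters are assumptions made purely for illustration.

```python
# Toy corpus (hypothetical); each document is a whitespace-separated string.
docs = [
    "apple banana apple",
    "banana cherry",
    "goal match team",
    "team goal",
]

# Vector model: rows are documents, columns are vocabulary words (term counts).
vocab = sorted({word for doc in docs for word in doc.split()})
matrix = [[doc.split().count(word) for word in vocab] for doc in docs]


def submatrix(matrix, rows, cols):
    """Return the submatrix selected by the given row and column indices."""
    return [[matrix[r][c] for c in cols] for r in rows]


# A co-cluster pairs a document cluster with a word cluster.
# These clusters are chosen by hand here, not discovered by an algorithm.
doc_cluster = [0, 1]
word_cluster = [vocab.index(w) for w in ("apple", "banana", "cherry")]

co_cluster = submatrix(matrix, doc_cluster, word_cluster)
print(co_cluster)
```

The selected rows and columns form one block of the full document-word matrix; a co-clustering method would search for such blocks rather than take them as given.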
This paper introduces generalised disjunctive association rules such as "People who buy bread also buy butter or jam", and "People who buy either raincoats or umbrellas also buy flashlights". A generalised disjunctive association rule allows the disjunction of conjuncts, "People who buy jackets also buy bow ties or neckties …
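As a rough illustration of how a rule with a disjunctive antecedent might be evaluated, the sketch below applies the standard support and confidence definitions to an inclusive "or". The abstract does not state the paper's own measures, so the function `rule_stats`, the inclusive reading of "or", and the toy transactions are all assumptions.

```python
def rule_stats(transactions, antecedent_any, consequent_any):
    """Support and confidence for 'buys any of A => buys any of C'.

    Assumption: 'or' is inclusive, so a transaction satisfies a side of the
    rule if it contains at least one item from that side's set.
    """
    matches_antecedent = [t for t in transactions if t & antecedent_any]
    matches_both = [t for t in matches_antecedent if t & consequent_any]
    support = len(matches_both) / len(transactions)
    confidence = (
        len(matches_both) / len(matches_antecedent) if matches_antecedent else 0.0
    )
    return support, confidence


# Hypothetical market-basket data; each transaction is a set of items.
transactions = [
    {"raincoat", "flashlight"},
    {"umbrella", "flashlight"},
    {"umbrella", "bread"},
    {"bread", "butter"},
]

# "People who buy either raincoats or umbrellas also buy flashlights"
support, confidence = rule_stats(
    transactions, {"raincoat", "umbrella"}, {"flashlight"}
)
print(support, confidence)
```

Passing a multi-item set as `consequent_any` covers rules whose right-hand side is a disjunction, such as the bow ties or neckties example.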
Advances in studies of microRNA (miRNA) expression and function in smooth muscles illustrate important effects of small noncoding RNAs on cell proliferation, hypertrophy and differentiation. An emerging theme in miRNA research in a variety of cell types including smooth muscles is that miRNAs regulate protein expression networks to fine tune phenotype. Some …
Nearly two decades of research in the area of Inductive Logic Programming (ILP) have seen steady progress in clarifying its theoretical foundations and regular demonstrations of its applicability to complex problems in very diverse domains. These results are necessary, but not sufficient, for ILP to be adopted as a tool for data analysis in an era of very …
High mobility group protein 1 (HMGB1) interacts with DNA and chromatin to influence the regulation of transcription, DNA repair and recombination. We show that HMGB1 alters the structure and stability of the canonical nucleosome (N) in a nonenzymatic, ATP-independent manner. Although estrogen receptor (ER) does not bind to its consensus estrogen response …
Entity annotation involves attaching a label such as 'name' or 'organization' to a sequence of tokens in a document. All the current rule-based and machine learning-based approaches for this task operate at the document level. We present a new and generic approach to entity annotation which uses the inverse index typically created for rapid keyword-based …
During normal lung development and in lung diseases, structural cells in the lungs adapt to permit changes in lung function. Fibroblasts, myofibroblasts, smooth muscle, epithelial cells, and various progenitor cells can all undergo phenotypic modulation. In the pulmonary vasculature, occlusive vascular lesions that occur in severe pulmonary arterial …
Lack of supervision in clustering algorithms often leads to clusters that are not useful or interesting to human reviewers. We investigate whether supervision can be automatically transferred to a clustering task in a target domain by providing a relevant supervised partitioning of a dataset from a different source domain. The target clustering is made more …