Traditional association mining algorithms use a strict definition of support that requires every item in a frequent itemset to occur in each supporting transaction. In real-life datasets, this limits the recovery of frequent itemset patterns as they are fragmented due to random noise and other errors in the data. Hence, a number of methods have been …
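The strict support definition described above can be illustrated with a minimal sketch. The function name `strict_support` and the toy transactions are assumptions made for illustration only and do not come from the paper.

```python
def strict_support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`.

    This is the strict definition: a transaction missing even one item
    does not count as supporting the itemset.
    """
    items = set(itemset)
    return sum(items <= set(t) for t in transactions) / len(transactions)


# Hypothetical toy data: the third and fourth transactions each lack one
# item of {a, b, c}, as might happen when noise drops an item.
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"d"},
]

full = strict_support({"a", "b", "c"}, transactions)
partial = strict_support({"a", "b"}, transactions)
```

Here the full itemset is supported only by the two complete transactions, while its sub-itemset {a, b} also picks up the third, which is one way to see how a pattern fragments under the strict definition when items go missing.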
The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow comparison across …
BACKGROUND: Gene clustering plays an important role in the organization of the bacterial chromosome and several mechanisms have been proposed to explain its extent. However, the controversies raised about the validity of each of these mechanisms remind us that the cause of this gene organization remains an open question. Models proposed to explain clustering …
BACKGROUND: The recent availability of dabigatran, a novel oral anticoagulant, provided a new treatment option for stroke prevention in atrial fibrillation beyond warfarin, the main therapy for years. Little is known about their real-world comparative effectiveness and safety, even less among patient demographic and clinical subgroups. METHODS AND RESULTS: …
Pseudofam (http://pseudofam.pseudogene.org) is a database of pseudogene families based on the protein families from the Pfam database. It provides resources for analyzing the family structure of pseudogenes including query tools, statistical summaries and sequence alignments. The current version of Pseudofam contains more than 125,000 pseudogenes identified …
In this paper, we study methods to identify differential coexpression patterns in case-control gene expression data. A differential coexpression pattern consists of a set of genes that have substantially different levels of coherence of their expression profiles across the two sample-classes, i.e., highly coherent in one class, but not in the other. …
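As a rough illustration of the idea of a gene set that is coherent in one class but not in the other, the sketch below scores a gene set by the difference in mean pairwise Pearson correlation between the two classes. This coherence measure, the function names, and the toy expression values are all assumptions chosen for illustration; the abstract does not say which measure the paper uses.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def coherence(profiles):
    """Mean pairwise correlation over all gene pairs (an assumed measure)."""
    pairs = list(combinations(profiles, 2))
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)


def differential_coherence(case_profiles, control_profiles):
    """Coherence in the case class minus coherence in the control class."""
    return coherence(case_profiles) - coherence(control_profiles)


# Hypothetical toy data: one row per gene, one column per sample.
case = [[1, 2, 3, 4], [2, 4, 6, 8], [1.5, 2.5, 3.5, 4.5]]   # genes move together
control = [[1, 2, 3, 4], [1, -1, -1, 1], [1, -3, 3, -1]]    # mutually uncorrelated

score = differential_coherence(case, control)
```

A score near 1 means the set is tightly co-expressed in cases and not in controls; a score near 0 means no difference between the classes under this particular measure.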
Matching of partial fingerprints has important applications in both biometrics and forensics. It is well known that the accuracy of minutiae-based matching algorithms decreases dramatically as the number of available minutiae decreases. When singular structures such as core and delta are unavailable, general ridges can be utilized. Some existing highly …
The genome has often been called the operating system (OS) for a living organism. A computer OS is described by a regulatory control network termed the call graph, which is analogous to the transcriptional regulatory network in a cell. To apply our firsthand knowledge of the architecture of software systems to understand cellular design principles, we …