Primate-specific segmental duplications are considered important in human disease and evolution. The inability to distinguish between allelic and duplication sequence overlap has hampered their characterization as well as assembly and annotation of our genome. We developed a method whereby each public sequence is analyzed at the clone level for ...
We introduce a simple, broadly applicable method for obtaining estimates of nucleotide diversity from genomic shotgun sequencing data. The method takes into account the special nature of these data: random sampling of genomic segments from one or more individuals and a relatively high error rate for individual reads. Applying this method to data from the ...
The high degree of similarity between the mouse and human genomes is demonstrated through analysis of the sequence of mouse chromosome 16 (Mmu 16), which was obtained as part of a whole-genome shotgun assembly of the mouse genome. The mouse genome is about 10% smaller than the human genome, owing to a lower repetitive DNA content. Comparison of the ...
MicroRNAs play a role in regulating diverse biological processes and have considerable utility as molecular markers for diagnosis and monitoring of human disease. Several technologies are available commercially for measuring microRNA expression. However, cross-platform comparisons do not necessarily correlate well, making it difficult to determine which ...
DNA sequence and annotation of the entire human chromosome 7, encompassing nearly 158 million nucleotides of DNA and 1917 gene structures, are presented. To generate a higher order description, additional structural features such as imprinted genes, fragile sites, and segmental duplications were integrated at the level of the DNA sequence with medical ...
The current 'isolate, inactivate, inject' vaccine development strategy has served the field of vaccinology well, and such empirical vaccine candidate development has even led to the eradication of smallpox. However, this approach has limitations and, being empirical, does not fully utilize our knowledge of immunology and genetics. A ...
Type-2 Diabetes Mellitus is a growing epidemic that often leads to severe complications. Effective preventive measures exist and identifying patients at high risk of diabetes is a major health-care need. The use of association rule mining (ARM) is advantageous, as it was specifically developed to identify associations between risk factors in an ...
Lung adenocarcinomas from never smokers account for approximately 15 to 20% of all lung cancers, and these tumors often carry genetic alterations that are responsive to targeted therapy. Here we examined mutation status in 10 oncogenes among 89 lung adenocarcinomas from never smokers. We also screened for oncogene fusion transcripts in 20 of the 89 tumors by ...
Early detection of patients with elevated risk of developing diabetes mellitus is critical to the improved prevention and overall clinical management of these patients. We aim to apply association rule mining to electronic medical records (EMR) to discover sets of risk factors and their corresponding subpopulations that represent patients at particularly ...
Associative classification is a predictive modeling technique that constructs a classifier based on class association rules (also known as predictive association rules; PARs). PARs are association rules where the consequent of the rule is a class label. Associative classification has gained substantial research attention because it successfully joins the ...