• Publications
Multifactor-dimensionality reduction reveals high-order interactions among estrogen-metabolism genes in sporadic breast cancer.
One of the greatest challenges facing human geneticists is the identification and characterization of susceptibility genes for common complex multifactorial human diseases. This challenge is partly ...
  • 1,699
  • 143
Multifactor dimensionality reduction software for detecting gene-gene and gene-environment interactions
We have developed a multifactor dimensionality reduction (MDR) method for collapsing high-dimensional genetic data into a single dimension, thus permitting interactions to be detected in relatively small sample sizes.
  • 1,041
  • 69
Chapter 11: Genome-Wide Association Studies
We review the key concepts underlying GWAS, including the architecture of common diseases, structure of common human genetic variation, technologies for capturing genetic information, study designs, and the statistical methods used for data analysis.
  • 771
  • 52
Missing heritability and strategies for finding the underlying causes of complex disease
Although recent genome-wide studies have provided valuable insights into the genetic basis of human disease, they have explained relatively little of the heritability of most complex traits, and the ...
  • 1,447
  • 50
A flexible computational framework for detecting, characterizing, and interpreting statistical patterns of epistasis in genetic studies of human disease susceptibility.
Detecting, characterizing, and interpreting gene-gene interactions or epistasis in studies of human disease susceptibility is a mathematical and a computational challenge.
  • 569
  • 41
Power of multifactor dimensionality reduction for detecting gene‐gene interactions in the presence of genotyping error, missing data, phenocopy, and genetic heterogeneity
The identification and characterization of genes that influence the risk of common, complex multifactorial diseases, primarily through interactions with other genes and other environmental factors, ...
  • 518
  • 26
Evaluation of a Tree-based Pipeline Optimization Tool for Automating Data Science
We introduce the concept of tree-based pipeline optimization for automating one of the most tedious parts of machine learning: pipeline design.
  • 176
  • 26
A high-density admixture map for disease gene discovery in African Americans.
Admixture mapping (also known as "mapping by admixture linkage disequilibrium," or MALD) provides a way of localizing genes that cause disease, in admixed ethnic groups such as African Americans, ...
  • 441
  • 23
Bioinformatics challenges for genome-wide association studies
We argue here that bioinformatics has an important role to play in addressing the complexity of the underlying genetic basis of common human diseases.
  • 487
  • 23
TPOT: A Tree-based Pipeline Optimization Tool for Automating Machine Learning
We benchmarked the Tree-based Pipeline Optimization Tool (TPOT) v0.3 on 150 supervised classification datasets and found that it discovers machine learning pipelines that can outperform a basic machine learning analysis on several benchmarks.
  • 150
  • 23