An overview of recent developments in genomics and associated statistical methods

@article{Bickel2009AnOO,
  title={An overview of recent developments in genomics and associated statistical methods},
  author={Peter J. Bickel and James B. Brown and Haiyan Huang and Qunhua Li},
  journal={Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences},
  year={2009},
  volume={367},
  pages={4313--4337}
}
  • Peter J. Bickel, James B. Brown, Haiyan Huang, Qunhua Li
  • Published 13 November 2009
  • Medicine, Biology
  • Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
The landscape of genomics has changed drastically in the last two decades. Increasingly inexpensive sequencing has shifted the primary focus from the acquisition of biological sequences to the study of biological function. Assays have been developed to study many intricacies of biological systems, and publicly available databases have given rise to integrative analyses that combine information from many sources to draw complex conclusions. Such research was the focus of the recent workshop at… 

Fast assessment of the correlation between coverage-like genomic features and its statistical significance
TLDR
A fast method for calculation of kerneled correlation between two numeric annotations of the genome, where the kernel represents the mutual position of related features; e.g., a Gaussian shape corresponds to 'somewhere around', etc.
Genome-wide study of correlations between genomic features and their relationship with the regulation of gene expression
TLDR
Criteria for the assessment of genome track inhomogeneity and correlations between two genome tracks were developed, and a software package, Genome Track Analyzer, was developed based on this theory and applied to the study of correlations between CpG islands and transcription start sites in the Homo sapiens genome.
Exploring Massive, Genome Scale Datasets with the GenometriCorr Package
We have created a statistically grounded tool for determining the correlation of genomewide data with other datasets or known biological features, intended to guide biological exploration of
Biological pathway selection through nonlinear dimension reduction.
TLDR
This article proposes a 2-step procedure for identifying pathways that are related to and influence the clinical phenotype, and proposes a nonlinear dimension reduction method, which permits flexible within-pathway gene interactions as well as nonlinear pathway effects on the response.
Bioinformatics applied to gene transcription regulation.
  • G. Altobelli
  • Biology, Medicine
    Journal of molecular endocrinology
  • 2012
TLDR
This mini-review is intended as an orientation for multidisciplinary professionals, introducing a streamlined workflow in gene transcription regulation with emphasis on sequence analysis, and provides an outlook on tools and methods, selected from a host of bioinformatics resources available today.
Emerging tools and approaches to biotechnology in the omics era
TLDR
The study of the transcriptome, termed transcriptomics, is an emerging field that covers the total set of transcripts in an organism along with the set of all ribonucleic acid (RNA) molecules.
Computational Problems in Multi-tissue Models of Health and Disease
TLDR
It is shown how methods harnessing this integrative potential to address multi-tissue problems ranging from correlation/causal network inference to graph algorithms are ushering in an era of integrated, whole-system modeling of life processes.
VariFunNet, an integrated multiscale modeling framework to study the effects of rare non-coding variants in genome-wide association studies: Applied to Alzheimer's disease
TLDR
This work has applied VariFunNet to investigating the causal effect of rare non-coding variants on Alzheimer's disease, and suggests that multiscale modeling is a potentially powerful approach to studying causal genotype-phenotype associations.
Statistical challenges of high-dimensional data
  • I. Johnstone, D. Titterington
  • Mathematics, Medicine
    Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
  • 2009
TLDR
The difficulties that arise with high-dimensional data in the context of the very familiar linear statistical model are introduced, and a taste is given of what can nevertheless be achieved when the parameter vector of interest is sparse, that is, contains many zero elements.

References

SHOWING 1-10 OF 146 REFERENCES
A statistical framework for genomic data fusion
TLDR
This paper describes a computational framework for integrating and drawing inferences from a collection of genome-wide measurements represented via a kernel function, which defines generalized similarity relationships between pairs of entities, such as genes or proteins.
Statistical significance for genomewide studies
TLDR
This work proposes an approach to measuring statistical significance in genomewide studies based on the concept of the false discovery rate, which offers a sensible balance between the number of true and false positives that is automatically calibrated and easily interpreted.
Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project
TLDR
Functional data from multiple, diverse experiments performed on a targeted 1% of the human genome as part of the pilot phase of the ENCODE Project are reported, providing convincing evidence that the genome is pervasively transcribed, such that the majority of its bases can be found in primary transcripts.
The impact of next-generation sequencing technology on genetics.
  • E. Mardis
  • Biology, Medicine
    Trends in genetics : TIG
  • 2008
Advanced sequencing technologies and their wider impact in microbiology
  • N. Hall
  • Medicine, Biology
    Journal of Experimental Biology
  • 2007
TLDR
While this review will concentrate on microorganisms, many of the important arguments about the need to measure and understand variation at the species, population and ecosystem level will hold true for many other biological systems.
Interrelating different types of genomic data, from proteome to secretome: 'oming in on function.
TLDR
The term "translatome" is suggested to describe the members of the proteome weighted by their abundance, and the "functome" to describe all the functions carried out by these in the cellular contents of the genome.
Computation and analysis of genomic multi-sequence alignments.
  • M. Blanchette
  • Biology, Medicine
    Annual review of genomics and human genetics
  • 2007
TLDR
The key algorithmic ideas in use today are introduced, and publicly available resources for computing, accessing, and visualizing genomic alignments are identified and directions for future improvements are suggested.
Whole-genome re-sequencing.
  • D. Bentley
  • Medicine, Biology
    Current opinion in genetics & development
  • 2006
Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids
TLDR
This book gives a unified, up-to-date and self-contained account, with a Bayesian slant, of such methods, and more generally to probabilistic methods of sequence analysis.
Reverse engineering gene networks: Integrating genetic perturbations with dynamical modeling
TLDR
This work shows how the perturbation of carefully chosen genes in a microarray experiment can be used in conjunction with a reverse engineering algorithm to reveal the architecture of an underlying gene regulatory network.