• Corpus ID: 10446869

Scanning a Poisson Random Field for Local Signals

Nancy Ruonan Zhang, Benjamin Yakir, Charlie Xia, David O. Siegmund. arXiv: Applications.
The detection of local genomic signals using high-throughput DNA sequencing data can be cast as a problem of scanning a Poisson random field for local changes in the rate of the process. We propose a likelihood-based framework for such scans, and derive formulas for false positive rate control and power calculations. The framework can also accommodate mixtures of Poisson processes to deal with over-dispersion. As a specific, detailed example, we consider the detection of insertions and… 
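
To make the idea of a likelihood-based scan concrete, here is a minimal sketch in Python. It is not the authors' procedure: it assumes binned read counts, a known baseline expectation per bin, and a single fixed window width, and it omits the false positive rate and power formulas the abstract mentions. The names `poisson_llr` and `scan_fixed_width` are hypothetical, chosen only for illustration.

```python
import math


def poisson_llr(count, expected):
    """Generalized log-likelihood ratio for a Poisson count.

    Compares the null rate `expected` (assumed known) against the
    maximum-likelihood rate `count`. The value is always >= 0.
    """
    if expected <= 0:
        raise ValueError("expected must be positive")
    if count == 0:
        # Limit of count*log(count/expected) - (count - expected) as count -> 0.
        return expected
    return count * math.log(count / expected) - (count - expected)


def scan_fixed_width(counts, baseline, width):
    """Slide a window of `width` bins and return (best_llr, best_start).

    `counts` are observed counts per bin; `baseline` are the assumed
    null expectations per bin (an assumption of this sketch).
    """
    if len(counts) != len(baseline):
        raise ValueError("counts and baseline must have the same length")
    if width <= 0 or width > len(counts):
        raise ValueError("width must be between 1 and len(counts)")

    observed = sum(counts[:width])
    expected = sum(baseline[:width])
    best_llr, best_start = -math.inf, -1

    for start in range(len(counts) - width + 1):
        if start > 0:
            # Update running sums: add the entering bin, drop the leaving bin.
            observed += counts[start + width - 1] - counts[start - 1]
            expected += baseline[start + width - 1] - baseline[start - 1]
        llr = poisson_llr(observed, expected)
        if llr > best_llr:
            best_llr, best_start = llr, start

    return best_llr, best_start


# Toy data: a local rate increase in bins 4..6 against a flat baseline of 2.
counts = [2, 1, 3, 2, 9, 8, 10, 2, 1, 2]
baseline = [2.0] * len(counts)
best_llr, best_start = scan_fixed_width(counts, baseline, width=3)
```

In this toy input the maximizing window starts at bin 4, where the elevated counts sit. In practice the threshold for calling a window significant would have to come from a false positive rate approximation such as the one the paper derives, which this sketch does not attempt.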


Detecting Changes in Dynamic Events Over Networks

This paper derives the likelihood ratios for point processes, which are computed efficiently via an expectation-maximization (EM)-like algorithm that is parameter-free and can be run in a distributed manner, and derives a highly accurate theoretical characterization of the false-alarm rate.

Sequential (Quickest) Change Detection: Classical Results and New Directions

Some new dimensions that emerge at the intersection of sequential change detection with other areas are discussed, along with a selection of modern applications and remarks on open questions.

Detecting weak changes in dynamic events over networks

A novel change-point detection framework for multi-dimensional event data over networks is proposed, and the likelihood ratios for point processes are derived via an EM-like algorithm that is parameter-free and can be computed in a distributed fashion.



Change-point model on nonhomogeneous Poisson processes with application in copy number profiling by next-generation DNA sequencing

A flexible change-point model for inhomogeneous Poisson Processes, which arise naturally from next-generation DNA sequencing, is proposed, and score and generalized likelihood statistics for shifts in intensity functions are derived.

Scan Statistics With Weighted Observations

We examine scan statistics for one-dimensional marked Poisson processes. Such statistics tabulate the maximum weighted count of event occurrences within a window of predetermined width over all

Detecting simultaneous variant intervals in aligned sequences

This work derives an analytic approximation for the false positive error probability of a scan, which is shown by simulations to be reasonably accurate and to be robust with respect to the assumed fraction of carriers of the changes.

High-resolution mapping of copy-number alterations with massively parallel sequencing

A collection of ∼14 million aligned sequence reads from human cell lines has comparable power to detect events as the current generation of DNA microarrays and has over twofold better precision for localizing breakpoints (typically, to within ∼1 kilobase).

Computational methods for discovering structural variation with next-generation sequencing

A new generation of methods is being developed to tackle the challenges of short reads, while taking advantage of the high coverage the new sequencing technologies provide.

Summarizing and correcting the GC content bias in high-throughput sequencing

Empirical evidence strengthens the hypothesis that PCR is the most important cause of the GC bias, and a model is proposed that produces predictions at the base pair level, allowing strand-specific GC-effect correction regardless of the downstream smoothing or binning.

Mapping quantitative trait loci in oligogenic models.

A standard variance components model and a parametrization of the genetic effects in which the 'segregation' parameters are locally orthogonal to the 'linkage' parameters allow simple explicit expressions for the expectation of the score statistic, which is used to compare the power of different strategies.

Gaussian models for genetic linkage analysis using complete high-resolution maps of identity by descent.

The sample sizes required to detect linkage by using different classes of affected relative pairs are compared, and the problem of combining data from different classes of relatives is discussed.

Identification of somatically acquired rearrangements in cancer using genome-wide massively parallel paired-end sequencing

The results demonstrate the feasibility of systematic, genome-wide characterization of rearrangements in complex human cancer genomes, raising the prospect of a new harvest of genes associated with cancer using this strategy.