Exact asymptotic results for the Bernoulli matching model of sequence alignment.
@article{Majumdar2005ExactAR, title={Exact asymptotic results for the Bernoulli matching model of sequence alignment.}, author={Satya N. Majumdar and Sergei Nechaev}, journal={Physical review. E, Statistical, nonlinear, and soft matter physics}, year={2005}, volume={72 2 Pt 1}, pages={020901} }
Finding analytically the statistics of the longest common subsequence (LCS) of a pair of random sequences drawn from c alphabets is a challenging problem in computational evolutionary biology. We present exact asymptotic results for the distribution of the LCS in a simpler, yet nontrivial, variant of the original model called the Bernoulli matching (BM) model. We show that in the BM model, for all c, the distribution of the asymptotic length of the LCS, suitably scaled, is identical to the…
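The relation between the LCS problem and the BM model described in the abstract can be illustrated with a minimal sketch. It assumes the standard dynamic-programming recursion for the LCS length, and it assumes the usual description of the BM model, in which the match indicator for each pair of positions is replaced by an independent Bernoulli variable with success probability 1/c. The function names `lcs_length` and `bm_length` and the parameter `rng` are hypothetical, chosen for illustration only, and are not taken from the paper.

```python
import random


def lcs_length(x, y):
    """LCS length of sequences x and y via the standard DP recursion."""
    n, m = len(x), len(y)
    # table[i][j] holds the LCS length of x[:i] and y[:j]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # match indicator: 1 if the two letters coincide, else 0
            eta = 1 if x[i - 1] == y[j - 1] else 0
            table[i][j] = max(table[i - 1][j],
                              table[i][j - 1],
                              table[i - 1][j - 1] + eta)
    return table[n][m]


def bm_length(n, m, c, rng):
    """Same recursion, with each match indicator drawn independently.

    Assumption: eta is Bernoulli with success probability 1/c,
    independent across all pairs (i, j).
    """
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            eta = 1 if rng.random() < 1.0 / c else 0
            table[i][j] = max(table[i - 1][j],
                              table[i][j - 1],
                              table[i - 1][j - 1] + eta)
    return table[n][m]


if __name__ == "__main__":
    rng = random.Random(0)
    c, n = 4, 200
    x = [rng.randrange(c) for _ in range(n)]
    y = [rng.randrange(c) for _ in range(n)]
    print("LCS length of two random sequences:", lcs_length(x, y))
    print("BM model length:", bm_length(n, n, c, rng))
```

In the original LCS model the indicators for different position pairs are correlated, because they are built from the same letters; the sketch drops exactly that correlation in `bm_length`, which is what makes the variant simpler to analyse.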
43 Citations
Exact solution of the Bernoulli matching model of sequence alignment
- Mathematics
- 2008
Through a series of exact mappings we reinterpret the Bernoulli model of sequence alignment in terms of the discrete-time totally asymmetric exclusion process with backward sequential update and step…
Sparse long blocks and the variance of the LCS
- Mathematics
- 2012
Consider two random strings having the same length and generated by two mutually independent iid sequences taking values uniformly in a common finite alphabet. We study the order of the variance of…
Bethe Ansatz in the Bernoulli matching model of random sequence alignment.
- Mathematics, Computer Science
- Physical review. E, Statistical, nonlinear, and soft matter physics
- 2008
Considering the terracelike representation of the sequence alignment problem, the Bethe Ansatz technique is applied via an exact mapping to the five-vertex model on a square lattice to reproduce the results for the averaged length of the longest common subsequence in the Bernoulli approximation.
Deviation from mean in sequence comparison with a periodic sequence
- Mathematics
- 2007
Let Ln denote the length of the longest common subsequence of two sequences of length n. We draw one of the sequences i.i.d., but the other is non-random and periodic. We prove that VAR(Ln) = Θ(n).…
Bethe Ansatz Solution of the Finite Bernoulli Matching Model of Sequence Alignment
- Mathematics
- 2011
We map the Bernoulli matching model of sequence alignment to the discrete-time totally asymmetric exclusion process with backward sequential update and step function initial condition. The Bethe…
On the Order of the Central Moments of the Length of the Longest Common Subsequences in Random Words
- Mathematics, Computer Science
- 2012
We investigate the order of the r-th, 1 ≤ r < +∞, central moment of the length of the longest common subsequences of two independent random words of size n whose letters are identically distributed…
A Central Limit Theorem for the Length of the Longest Common Subsequence in Random Words
- Mathematics
- 2019
Let (Xk)k≥1 and (Yk)k≥1 be two independent sequences of independent identically distributed random variables having the same law and taking their values in a finite alphabet. Let LCn be the length of…
Large deviations of the top eigenvalue of large Cauchy random matrices
- Mathematics
- 2013
We compute analytically the large deviation tails of the probability density function (pdf) of the top eigenvalue λ_max in rotationally invariant and heavy-tailed Cauchy ensembles of N × N matrices…
A Central Limit Theorem for the Length of the Longest Common Subsequences in Random Words
- Mathematics
- 2014
Let $(X_i)_{i \geq 1}$ and $(Y_i)_{i\geq1}$ be two independent sequences of independent identically distributed random variables taking their values in a common finite alphabet and having the same…
A simple derivation of the Tracy-Widom distribution of the maximal eigenvalue of a Gaussian unitary random matrix
- Mathematics
- 2011
In this paper, we first briefly review some recent results on the distribution of the maximal eigenvalue of an (N × N) random matrix drawn from Gaussian ensembles. Next we focus on the Gaussian…
References
Extensive simulations for longest common subsequences. Finite size scaling, a cavity solution, and configuration space properties
- Physics
- 1998
Given two strings X and Y of N and M characters respectively, the Longest Common Subsequence (LCS) Problem asks for the longest sequence of (non-contiguous) matches between X and Y. Using extensive…
The Rate of Convergence of the Mean Length of the Longest Common Subsequence
- Mathematics
- 1994
Given two i.i.d. sequences of n letters from a finite alphabet, one can consider the length Ln of the longest sequence which is a subsequence of both the given sequences. It is known that ELn grows…
Mean-Field Approximations to the Longest Common Subsequence Problem
- Computer Science
- 1998
This work describes a systematic way of incorporating correlations among the matches of two real sequences in the calculation, and obtains closer and closer approximations to the LCS problem.
Longest common subsequences of two random sequences
- Mathematics
- Advances in Applied Probability
- 1975
Given two random k-ary sequences of length n, what is f(n,k), the expected length of their longest common subsequence? This problem arises in the study of molecular evolution. We calculate f(n,k) for…
The longest common subsequence problem revisited
- Computer Science
- Algorithmica
- 2005
This paper re-examines, in a unified framework, two classic approaches to the problem of finding a longest common subsequence (LCS) of two strings, and proposes faster implementations for both. Letl…
Long Common Subsequences and the Proximity of two Random Strings.
- Mathematics
- 1982
Let $(x_1, x_2, \cdots, x_n)$ and $(x'_1, x'_2, \cdots, x'_n)$ be two strings from an alphabet $\mathcal{A}$, and let $L_n$ denote their longest common subsequence. The probabilistic behavior…
Alignment of molecular sequences seen as random path analysis.
- Computer Science
- Journal of theoretical biology
- 1995
This work focuses on deriving a mathematically rigorous solution to RPA both in its combinatorial form and in its graphical representation, which puts DP in logical perspective under a more general conceptual framework.
Biological sequence analysis
- Biology
- 2003
This talk will review a little over a decade's research on applying certain stochastic models to biological sequence analysis, and introduce the motif models in stages, beginning from very simple, non-stochastic versions, progressively becoming more complex, until they reach modern profile HMMs for motifs.