A new method of sequence analysis, using a weight array method (WAM), which generalizes the traditional Staden weight matrix method (WMM), is proposed. With the help of a statistical mechanical model, the discriminant function is identified with the energy function describing macromolecular interactions. The method is applied to the study of 5'-splice…

In this paper, we review the literature on statistical long-range correlation in DNA sequences. We examine the current evidence for these correlations, and conclude that a mixture of many length scales including some relatively long ones in DNA sequences is responsible for the observed 1/f-like spectral component. We note the complexity of the correlation…

- T Mizukami, W I Chang, I Garkavtsev, N Kaplan, D Lombardi, T Matsumoto +4 others
- Cell
- 1993

We present the application of a nonrandom sequence-tagged site (STS) content detection method in mapping an entire genome, that of fission yeast. The novelty of our strategy is in the use of STS probes made from both ends of cosmid clones, selected on the basis of "sample without replacement" (only library clones that show no previous positive hybridization…

High throughput sequencing methods are widely used in analyses of microbial diversity, but are generally applied to small numbers of samples, which precludes characterization of patterns of microbial diversity across space and time. We have designed a primer-tagging approach that allows pooling and subsequent sorting of numerous samples, which is directed…

As part of our effort to construct a physical map of the genome of the fission yeast Schizosaccharomyces pombe, we have made theoretical predictions for the progress expected, as measured by the expected length fraction of island coverage and by the expected properties of the anchored islands such as the number and the size of islands. Our experimental…

A common practice among researchers performing linkage studies is the use of equal allele frequencies as input when reporting p-values from computer linkage programs such as S.A.G.E. SIBPAL. Our results, using 5,000 sets from a uniform-prior distribution of allele frequencies, showed that such input may be problematic. Further, we found that the S.A.G.E.…

