We describe and assess the performance of the gene-finding program Pretty Handy Annotation Tool (Phat) on sequence from the malaria parasite Plasmodium falciparum. Phat is based on a generalized hidden Markov model (GHMM) similar to the models used in GENSCAN, Genie and HMMgene. In a test set of 44 confirmed gene structures, Phat achieves nucleotide-level ...
Improved search algorithms and scoring functions are required before the identification of peptide tandem MS data can be considered to be fully reliable and automatable. The development of models that can accurately predict product ion spectra from a peptide sequence would certainly help achieve this goal, but this firstly requires a better understanding of ...
A total of 83 HIV seroconversions occurred between 1984 and 1989 in three San Francisco cohorts of homosexual and bisexual men. A nested case-control analysis was performed to assess the risk of seroconversion associated with sexual practices. Strong associations were found with total number of intercourse partners and receptive anal intercourse. Weaker, ...
With regard to the theorem in the paper, the second part is, in general, false, and the proof, given in Section 4.2, is in error. Dr K. W. Ng and Professor A. P. Dawid have pointed out the following simple counterexample for two binary random variables x_1, density is uniquely specified by P(x_1 | x_2) and P(x_1), in contradiction to the second part of the ...
"The 1990 [U.S.] Post-Enumeration Survey (PES) stratified the population into 1,392 subpopulations called post-strata based on location, race, tenure, sex and age, in the hope that these subpopulations were homogeneous in relation to factors affecting the Census coverage.... With block-level data from the PES for sites around Detroit and Texas, we are able ...
This talk will review a little over a decade's research on applying certain stochastic models to biological sequence analysis. The models themselves have a longer history, going back over 30 years, although many novel variants have arisen since that time. The function of the models in biological sequence analysis is to summarize the information concerning ...
