Florencia G. Leonardi

Learn More
MOTIVATION A central problem in genomics is to determine the function of a protein using the information contained in its amino acid sequence. Variable length Markov chains (VLMC) are a promising class of models that can effectively classify proteins into families and they can be estimated in linear time and space. RESULTS We introduce a new algorithm,(More)
Efficient family classification of newly discovered protein sequences is a central problem in bioinformatics. We present a new algorithm , using Probabilistic Suffix Trees, which identifies equivalences between the amino acids in different positions of a motif for each family. We also show that better classification can be achieved identifying(More)
  • 1