André Yoshiaki Kashiwabara

EGene is a generic, flexible and modular pipeline generation system that makes pipeline construction a modular job. EGene allows third-party programs to be used and integrated according to the needs of distinct projects, without requiring any previous programming or formal language experience. EGene comes with CoEd, a visual tool to…
This paper presents a novel approach to the problem of splice site prediction, by applying stochastic grammar inference. We used four grammar inference algorithms to infer 1465 grammars, and used 10-fold cross-validation to select the best grammar for each algorithm. The corresponding grammars were embedded into a classifier and used to run splice site…
This study reports the development and characterization of 151 sequence characterized amplified region (SCAR) markers for the seven Eimeria species that infect the domestic fowl. From this set, 84 markers are species-specific and 67 present partial specificity. The complete nucleotide sequence was derived for all markers, revealing the presence of micro-…
Background: A large number of probabilistic models used in sequence analysis assign non-zero probability values to most input sequences. The most common way to decide when a given probability is sufficient is Bayesian binary classification, where the probability of the model characterizing the sequence family of interest is compared to that of an alternative…
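The comparison described in this abstract can be illustrated with a log-odds test. The following is only a minimal sketch, not the method of the paper: it assumes both the family model and the alternative are simple independent-symbol models, and the names `log_likelihood`, `classify`, `family`, and `background`, as well as the probability values, are hypothetical.

```python
import math

def log_likelihood(seq, probs):
    """Log-probability of seq under an independent-symbol model (an assumption)."""
    return sum(math.log(probs[ch]) for ch in seq)

def classify(seq, family, background, threshold=0.0):
    """Accept seq when its log-odds (family vs. alternative) exceeds threshold."""
    log_odds = log_likelihood(seq, family) - log_likelihood(seq, background)
    return log_odds > threshold, log_odds

# Hypothetical GC-rich family model and uniform alternative model.
family = {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1}
background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

accepted, score = classify("GCGCGC", family, background)
print(accepted, round(score, 3))
```

A threshold of 0 corresponds to equal prior odds for the two models; raising it makes the classifier more conservative about accepting a sequence as a family member.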
Discrete Markovian models can be used to characterize patterns in sequences of values and have many applications in biological sequence analysis, including gene prediction, CpG island detection, alignment, and protein profiling. We present ToPS, a computational framework that can be used to implement different applications in bioinformatics analysis by…
  • A Gruber, Ahagon, +25 authors D J Gapped
  • 2003
DNA reads generated by large-scale sequencing projects have to be processed before further analyses in order to perform vector/primer masking, low-quality trimming and contaminant removal. This sequential processing involves several steps and the use of different computer programs, each one following its own calling convention and input/output formats. As a…
The identification of transcription factor binding sites (TFBS) – also called motifs – in DNA sequences is the first step to understanding how gene regulation works. Recognizing these patterns in the promoter regions of co-expressed genes is key to this. Although there are several algorithms for this purpose, the problem is…
The development of new genomic sequencing techniques generates a huge volume of biological data. In this context, it is important to develop new pattern recognition methods and improve their accuracy in order to support the analysis of this huge volume of data. In particular, valuable information in genomic sequences is their nucleotides…