ApiEST-DB: analyzing clustered EST data of the apicomplexan parasites

Li Li, Jonathan Crabtree, Steve Fischer, Deborah F. Pinney, Christian J. Stoeckert, L. David Sibley, David S. Roos
Nucleic Acids Research, volume 32 (Database issue)

ApiEST-DB (http://www.cbil.upenn.edu/paradbs-servlet/) provides integrated access to publicly available EST data from protozoan parasites in the phylum Apicomplexa. The database currently incorporates a total of nearly 100,000 ESTs from several parasite species of clinical and/or veterinary interest, including Eimeria tenella, Neospora caninum, Plasmodium falciparum, Sarcocystis neurona and Toxoplasma gondii. To facilitate analysis of these data, EST sequences were clustered and assembled to… 


ApiDB: integrated resources for the apicomplexan bioinformatics resource center

ApiDB represents a unified entry point for the NIH-funded Apicomplexan Bioinformatics Resource Center (BRC) that integrates numerous database resources and multiple data types. The phylum

TcruziDB: an integrated, post-genomics community resource for Trypanosoma cruzi

TcruziDB houses the recently published assembled genomic contigs and annotation provided by the genome consortium in a relational database supported by the Genomics Unified Schema (GUS) architecture.

Workflow-based systematic design of high throughput genome annotation

A comprehensive annotation system named “WAGA” (Workflow-based Automatically Genome Annotation) was built and applied to the E. tenella genome, the causative agent of human malaria, which has been extensively annotated.

PlasmoDB: The Plasmodium Genome Resource

The Plasmodium Genome Database provides the user with a variety of analysis tools for examining and extracting information from the genome and predicted proteome, using BLAST, electronic PCRs, defined motif searches, and tools for the analysis of microarray and proteomics data.

TBestDB: a taxonomically broad database of expressed sequence tags (ESTs)

The TBestDB database is opened to the research community for free processing, annotation, interspecies comparisons and GenBank submission of EST data generated in individual laboratories.


This dissertation exploits phylogenomic approaches to identify genes and gene families likely to be important in the biology of apicomplexan parasites, including Plasmodium and Toxoplasma, and explores the significance of lateral gene transfer and gene duplication as sources of evolutionary novelty.



CryptoDB: a Cryptosporidium bioinformatics resource update

The database, CryptoDB, is a community bioinformatics resource for the AIDS-related apicomplexan parasite Cryptosporidium. CryptoDB integrates whole genome sequence and annotation with expressed

Prospects for elucidating the phylogeny of the Apicomplexa.

Five important criteria that need to be met before radical taxonomic changes are made are considered in relation to phylogenetic analyses of the Apicomplexa; at least four of these criteria indicate that the prospects for elucidating the phylogeny and taxonomy of the Apicomplexa are not good.

Composite genome map and recombination parameters derived from three archetypal lineages of Toxoplasma gondii

A high frequency of closely adjacent, apparent double crossover events that may represent gene conversions is detected, along with large regions of genetic homogeneity among the archetypal clonal lineages, reflecting the relatively few genetic outbreeding events that have occurred since their recent origin.



Gene discovery in the apicomplexa as revealed by EST sequencing and assembly of a comparative gene database.

An interesting class of genes confined to members of this phylum, and not shared by plants, animals, or fungi, was identified; these genes likely mediate the novel biological features of members of the Apicomplexa.

PlasmoDB: the Plasmodium genome resource. A database integrating experimental and computational data

The goal of PlasmoDB is to facilitate utilization of the vast quantities of genomic-scale data produced by the global malaria research community.

CryptoDB: the Cryptosporidium genome resource

The CryptoDB database, like other apicomplexan parasite databases, has been built using the PlasmoDB model and contains approximately 19 million bases of genome sequence for the H and IOWA strains and approximately 24 million additional bases of GSS and EST sequence obtained from other sources.

ToxoDB: accessing the Toxoplasma gondii genome

ToxoDB was designed to provide a central point of access for all available T. gondii data, and a variety of data mining tools useful for the analysis of unfinished, un-annotated draft sequence during the early phases of the genome project.

ESTAnnotator: a tool for high throughput EST annotation

In high throughput sequence analysis, it is often necessary to combine the results of contemporary bioinformatics tools, because no individual tool alone computes all the requested information.

CDD: a curated Entrez database of conserved domain alignments

The Conserved Domain Database (CDD), which mirrors the publicly available domain alignment collections SMART and PFAM, and now also contains alignment models curated at NCBI, is now indexed as a separate database within the Entrez system and linked to other Entrez databases such as MEDLINE(R).

ProDom: Automated Clustering of Homologous Domains

The ProDom database is a comprehensive set of protein domain families automatically generated from the SWISS-PROT and TrEMBL sequence databases that makes it particularly useful to help sustain the growth of InterPro.

Progress in taxonomy of the Apicomplexan protozoa.

The numbers of named species and genera of apicomplexan protozoa of each group known in 1850, 1875, 1900, 1925, 1950, 1975, and 1987 are given.

Kingdom protozoa and its 18 phyla.

The demarcation of protist kingdoms is reviewed, a complete revised classification down to the level of subclass is provided for the kingdoms Protozoa, Archezoa, and Chromista, and the phylogenetic

ESTAP-an automated system for the analysis of EST data

The EST Analysis Pipeline (ESTAP) is a set of analytical procedures that automatically verify, cleanse, store and analyze ESTs generated on high-throughput platforms. It uses a relational database to