Gclust: Genome-Wide Clustering of Protein Sequences for Identification of Photosynthesis-Related Genes Resulting from Massive Horizontal Gene Transfer

Abstract

The use of whole-genome data in genome phylogeny is an alternative to conventional sequence-based phylogeny. Sequence-based phylogeny relies on single-copy genes with well-defined orthologues in every species. This excludes many physiologically important genes that belong to multigene families, as well as genes that are present in only a limited range of taxa. Using all sequence data in genome comparison is expected to alleviate the bias caused by such limited data. There are different approaches to whole-genome comparison. One uses orthologues that are defined, for example, by bidirectional best hit. Another uses homologue groups [2]. At a previous GIW meeting, I reported an attempt at whole-genome comparison of six species of cyanobacteria and a green plant, as well as non-photosynthetic organisms [3].
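The bidirectional best hit criterion mentioned above can be illustrated with a minimal sketch. It is not the Gclust method itself. It assumes that pairwise similarity scores between the proteins of two genomes have already been computed (for example, by a sequence search tool), and all function names, protein identifiers, and score values below are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed input format: (query protein, subject protein) -> similarity score.
Scores = Dict[Tuple[str, str], float]


def best_hits(scores: Scores) -> Dict[str, str]:
    """Map each query protein to its highest-scoring subject.

    Ties are resolved in favour of the pair seen first (an arbitrary choice).
    """
    best: Dict[str, Tuple[str, float]] = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b: Scores, b_vs_a: Scores) -> List[Tuple[str, str]]:
    """Return pairs (a, b) where b is the best hit of a AND a is the best hit of b."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical scores: a2 and a3 both hit b2 best, but b2 hits a3 best.
a_vs_b = {("a1", "b1"): 250.0, ("a1", "b2"): 90.0,
          ("a2", "b2"): 180.0, ("a3", "b2"): 200.0}
b_vs_a = {("b1", "a1"): 245.0, ("b2", "a3"): 198.0, ("b2", "a2"): 175.0}

pairs = bidirectional_best_hits(a_vs_b, b_vs_a)
print(pairs)
```

In this toy input, a2 receives no orthologue because its best hit b2 prefers a3. This shows how the criterion drops members of multigene families, which is the limitation that motivates the use of homologue groups.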

Cite this paper

@inproceedings{Sato2003GclustGC, title={Gclust: Genome-Wide Clustering of Protein Sequences for Identification of Photosynthesis-Related Genes Resulting from Massive Horizontal Gene Transfer}, author={Naoki Sato}, year={2003} }