Parasitism as the main factor shaping peptide vocabularies in current organisms

SUMMARY Self/non-self discrimination by vertebrate immune systems is based on recognizing peptides in the proteins of a parasite that are not contained in the proteins of the host. Therefore, reducing the number of ‘words’ in their own peptide vocabularies could be an efficient evolutionary strategy for parasites to escape recognition. Here, we compared peptide vocabularies of 30 endoparasitic and 17 free-living unicellular organisms and also eight multicellular parasitic…
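As a rough illustration of what comparing "peptide vocabularies" could involve, the sketch below counts the distinct fixed-length peptides in a set of protein sequences and lists those present in a parasite but absent from a host. It is not the authors' method: the word length `k = 5`, the function names, and the toy sequences are all assumptions made for the example.

```python
from typing import Iterable, Set


def peptide_vocabulary(proteins: Iterable[str], k: int = 5) -> Set[str]:
    """Return the set of distinct length-k peptides ("words") in the sequences.

    k = 5 is an assumed word length chosen for illustration only.
    """
    words: Set[str] = set()
    for seq in proteins:
        seq = seq.upper()
        # Slide a window of width k along the sequence; sequences shorter
        # than k contribute no words because the range is empty.
        for i in range(len(seq) - k + 1):
            words.add(seq[i:i + k])
    return words


def nonself_words(parasite: Iterable[str], host: Iterable[str], k: int = 5) -> Set[str]:
    """Peptides found in the parasite's proteins but not in the host's."""
    return peptide_vocabulary(parasite, k) - peptide_vocabulary(host, k)


if __name__ == "__main__":
    # Hypothetical toy sequences, not real proteins.
    host = ["MKTAYIAKQR", "GSHMKTAYLL"]
    parasite = ["MKTAYIAKQW"]
    print(len(peptide_vocabulary(parasite)), "words in the parasite vocabulary")
    print(sorted(nonself_words(parasite, host)), "are absent from the host")
```

Under this reading, a smaller vocabulary gives a parasite fewer candidate peptides that could differ from the host's, which is the intuition behind the hypothesis stated in the summary.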
Thus spoke peptides: SARS-CoV-2 spike gene evolved in humans and then shortly in rats while the rest of its genome in horseshoe bats and then in treeshrews
A new method is used to search for the original host of the ancestor of the SARS-CoV-2 virus and for the donor of its gene for the spike protein, the molecule responsible for binding to and entering human cells, suggesting that the ancestral coronavirus adapted to bats, but the spike gene donor was adapted to humans.
Search for Human-Specific Proteins Based on Availability Scores of Short Constituent Sequences: Identification of a WRWSH Protein in Human Testis
A practical application of SCS-based methods for protein searches is highlighted and possible contributions of SNP variants and alternative splicing of FAM75 to human evolution are suggested.
Nonself Mutations in the Spike Protein Suggest an Increase in the Antigenicity and a Decrease in the Virulence of the Omicron Variant of SARS-CoV-2
The present results suggest that the Omicron variant has evolved to have higher antigenicity and less virulence in humans despite increased infectivity and transmissibility.
Possible Critical Role of Latent Chronic Toxoplasma Gondii Infection in Triggering, Development and Persistence of Autoimmune Diseases
T. gondii tachyzoites infect almost all nucleated cells and their intracellular multiplication and lifelong persistence in the host cells play an important role in triggering and development of autoimmune diseases (ADs).


Peptide Vocabulary Analysis Reveals Ultra-Conservation and Homonymity in Protein Sequences
Different species are found to have qualitatively different major peptide vocabularies, e.g. some are dominated by large gene families, while others are rich in simple repeats or dominated by internally repetitive proteins, suggesting the possibility of a peptide vocabulary signature, analogous to genome signatures in DNA.
Two distinct proteolytic processes in the generation of a major histocompatibility complex class I-presented peptide.
Two different proteolytic steps in the generation of a chicken ovalbumin-presented peptide can be distinguished, and distinct peptidase(s) in the cytosol or endoplasmic reticulum may generate the appropriate N terminus from extended peptides.
Cell biology of antigen processing in vitro and in vivo.
This review concentrates on the properties of antigen-presenting cells, especially those aspects of their overall organization, regulation, and intracellular transport that both facilitate and modulate the processing of protein antigens.
The Revised Classification of Eukaryotes
This revision of the classification of eukaryotes retains an emphasis on the protists and incorporates changes since 2005 that have resolved nodes and branches in phylogenetic trees.
The parasitophorous vacuole membrane surrounding Plasmodium and Toxoplasma: an unusual compartment in infected cells.
It is concluded that most differences between the organisms primarily reflect the different biosynthetic capacities of the host cells they invade.
Complexity: an internet resource for analysis of DNA sequence complexity
Several numerical measures of textual complexity, including combinatorial and linguistic ones, together with complexity estimation using a modified Lempel-Ziv algorithm, have been implemented in a software tool called 'Complexity' (
One common structural feature of “words” in protein sequences and human texts
Analysis of the vocabularies shows that in both types of texts (human languages and proteins) the alternating words are dominant or highly preferred, thus strengthening the analogy between these two types of texts.
Language-like behavior of protein length distribution in proteomes
The results showed that the protein length distribution in the complete set of proteomic proteins, or at least in a wide range for each proteome, can be described reasonably well using the distribution model without considering any complex underlying mechanisms.