Telling apart Felidae and Ursidae from the distribution of nucleotides in mitochondrial DNA.

  • Andrij A. Rovenchak
  • Published 7 February 2018
  • Biology
  • arXiv: Other Quantitative Biology
Rank–frequency distributions of nucleotide sequences in mitochondrial DNA are defined in a way analogous to the linguistic approach, with the most frequent nucleobase serving as a whitespace. For such sequences, entropy and mean length are calculated. These parameters are shown to discriminate the species of the Felidae (cats) and Ursidae (bears) families. From purely numerical values we are able to see, in particular, that giant pandas are bears while koalas are not. The observed linear…
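
The procedure outlined in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the toy sequence, the choice of the natural logarithm, the dropping of empty segments between adjacent separators, and the averaging of length over all word occurrences are assumptions made here for illustration only.

```python
from collections import Counter
from math import log


def rank_frequency(seq):
    """Split seq at its most frequent symbol and rank the resulting 'words'.

    Returns the separator symbol and a list of (word, count) pairs
    sorted by decreasing count.
    """
    seq = seq.upper()
    # The most frequent nucleobase plays the role of whitespace.
    # Ties are resolved by first occurrence (behaviour of Counter.most_common).
    separator = Counter(seq).most_common(1)[0][0]
    # Assumption: empty segments (two adjacent separators) are dropped.
    words = [w for w in seq.split(separator) if w]
    return separator, Counter(words).most_common()


def entropy(ranked):
    """Shannon entropy (natural log, an assumption) of the word frequencies."""
    total = sum(count for _, count in ranked)
    return -sum((count / total) * log(count / total) for _, count in ranked)


def mean_length(ranked):
    """Mean word length, averaged over all word occurrences (an assumption)."""
    total = sum(count for _, count in ranked)
    return sum(len(word) * count for word, count in ranked) / total


# Hypothetical toy sequence, not real mitochondrial DNA.
toy = "ACGATAACGATTAGA"
separator, ranked = rank_frequency(toy)
print(separator, ranked)
print(entropy(ranked), mean_length(ranked))
```

For a real genome, the toy string would be replaced by the full mtDNA sequence, and each species would then be placed as a point on the plane spanned by the two parameters.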


On the Verge of Life: Distribution of Nucleotide Sequences in Viral RNAs

It is observed that the proximity of viruses on planes spanned by various pairs of parameters corresponds to related species in certain cases, and the approach is thus proposed for expanding the set of parameters used in the classification of viruses.

Approaches to the classification of complex systems: Words, texts, and more

The chapter discusses entropy as one of the parameters that can be easily computed from rank–frequency dependences; although it is a discriminating parameter in some problems of classification of complex systems, it can be given a proper interpretation only in a limited class of problems.

Quantitation and Comparison of Phenotypic Heterogeneity Among Single Cells of Monoclonal Microbial Populations

Two widely applicable indices for quantitation of heterogeneity were developed and the HC was found to provide a more accurate and precise measure of heterogeneity, being at the same time consistent with the coefficient of variation (CV) applied so far.

Counting Stylometric Properties of Sonnets: A Case Study of Machar's Letní sonety

A sample of the Czech sonnet production – Letní sonety (1890–91) by Josef Svatopluk Machar, a poet of the 1890s generation – will be analysed and various stylistic indicators will be calculated.



Analysis of complete mitochondrial genome sequences increases phylogenetic resolution of bears (Ursidae), a mammalian family that experienced rapid speciation

This study revisits the contentious relationships within Ursidae by analyzing complete mt genome sequences and evaluating the performance of both entire mt genomes and constituent mtDNA genes in recovering a phylogeny of extremely recent speciation events, providing strong evidence that the spectacled bear diverged first.

Animal mitochondrial DNA: structure and evolution.

Evolutionary dynamics of selfish DNA explains the abundance distribution of genomic subsequences

A model of selfish DNA expansion is developed that finds that selfish DNA elements, such as those belonging to the Alu family of repeats, dominate the power-law tail.

Language-like behavior of protein length distribution in proteomes

The results showed that the protein length distribution in the complete set of proteomic proteins, or at least in a wide range for each proteome, can be described reasonably well using the distribution model without considering any complex underlying mechanisms.

Self-organization of genic and intergenic sequence lengths in genomes: Statistical properties and linguistic coherence

This study provides a general picture of the large-scale self-organization of coding, noncoding, and total constituent lengths in genomes by adopting a linguistic distribution model and a structural analogy between linguistic and genomic constructs.

On the similarity of symbol frequency distributions with heavy tails

Using a complete $\alpha$-spectrum of measures, it is found that frequent words change more slowly than less frequent words and that $\alpha=2$ provides the most robust measure to quantify language change.

Part-of-Speech Sequences in Literary Text: Evidence From Ukrainian

It is shown that Zipf’s law holds for parts-of-speech sequences in Ukrainian texts by Ivan Franko, and it is expected that further studies of the proposed PoSW units both in Ukrainian and other languages can reveal new features of texts on the sentence and supra-sentence levels.


New methods to classify life into Archaea, Bacteria and Eucarya are found, based on the correlation analysis and spectral analysis of protein length distributions, showing that rich evolutionary information is stored in the fluctuations of protein length distributions.

Red panda: biology and conservation of the first panda

A broad-based overview of the biology of the red panda, Ailurus fulgens, which discusses the status of the species in the wild, examines how human activities impact on their habitat, and develops projections to translate this in terms of overall panda numbers.