Tatyana G. Popova

In several recent papers, new gene-detection algorithms were proposed for detecting protein-coding regions without requiring a learning dataset of already known genes. The fact that unsupervised gene detection is possible is closely connected to the existence of a cluster structure in oligomer frequency distributions. In this paper we study the cluster...
Coding information is the main source of heterogeneity (non-randomness) in the sequences of microbial genomes. The heterogeneity corresponds to a cluster structure in triplet distributions of relatively short genomic fragments (200-400 bp). We found a universal 7-cluster structure in microbial genomic sequences and explained its properties. We show that...
Three results are presented. First, we prove the existence of a universal 7-cluster structure in all 143 completely sequenced bacterial genomes available in GenBank in August 2004, and explain its properties. The 7-cluster structure is responsible for the main part of sequence heterogeneity in bacterial genomes. In this sense, our 7 clusters are the basic...
Statistical parameters of nucleotide sequences of mature human RNAs and those of human viruses were compared. The redundancy values of the corresponding genes were compared. The redundancy of virus genes was shown to be, on average, less than that of human genes. The distribution of human genes according to redundancy values is bimodal, and that of human...
The investigation evaluated the effect of various volatile anesthetics on cerebral blood volume and oxygen status in sick children during anesthesia induction. Ninety-two children were distributed into 3 groups: Groups 1 (n = 36) and 2 (n = 24) underwent stepwise induction with halothane and enflurane, respectively. Group 3 (n = 32) had vital...
A self-training technique for automated gene recognition, both in entire genomes and in unassembled ones, is proposed. It is based on a simple measure (namely, the vector of frequencies of non-overlapping triplets in a sliding window) and needs neither predetermined information nor preliminary learning. The sliding window length is the only tuning...
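The measure named in this abstract, a vector of frequencies of non-overlapping triplets in a sliding window, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the default window of 300 bp, and the step of 3 bp are assumptions made for illustration, and the step that would turn these vectors into gene predictions is not shown because the abstract does not describe it.

```python
from itertools import product

# The 64 possible triplets over the DNA alphabet, in a fixed order.
TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]
INDEX = {t: i for i, t in enumerate(TRIPLETS)}


def triplet_frequencies(fragment):
    """Return a 64-component vector of non-overlapping triplet frequencies.

    Triplets are read from position 0 of the fragment in steps of 3.
    """
    counts = [0] * 64
    total = 0
    for i in range(0, len(fragment) - 2, 3):
        idx = INDEX.get(fragment[i:i + 3])
        if idx is not None:  # skip triplets with ambiguous symbols such as N
            counts[idx] += 1
            total += 1
    if total == 0:
        return [0.0] * 64
    return [c / total for c in counts]


def sliding_window_vectors(sequence, window=300, step=3):
    """Return (start, frequency vector) pairs for each window position.

    window and step are illustrative defaults, not values from the paper.
    """
    sequence = sequence.upper()
    return [
        (start, triplet_frequencies(sequence[start:start + window]))
        for start in range(0, len(sequence) - window + 1, step)
    ]
```

Each window yields one point in a 64-dimensional space, so a genome becomes a cloud of such points that can then be examined for structure.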
Coding information is the main source of heterogeneity (non-randomness) in the sequences of bacterial genomes. This information can be naturally modeled by analysing cluster structures in the “in-phase” triplet distributions of relatively short genomic fragments (200-400 bp). We found a universal 7-cluster structure in all 143 completely sequenced bacterial...
A new microtechnique was used to determine the nucleic acid content of Chironomus polytene chromosomes. The method, based on UV-microspectrophotometric measurement of alkali-digested chromosome samples before and after gel filtration through Sephadex microcolumns, permits the simultaneous estimation of both the DNA and the RNA amount of single chromosomes and...
Motivation: What are proteins, the working parts of the living cell's protein machines, made from? To answer this question, we need a technology to disassemble proteins into elementary functional details and to prepare a lumped description of such details. This lumped description might have multiple material realizations (in amino acids). Our hypothesis is...
This paper is devoted to the comparative study of the redundancy of genetic texts of various organisms and viruses. To determine the redundancy of a gene, we introduce a strict measure of it. The measure of a text's redundancy is the length of restriction of the Frequency/Correlation Dictionary of a given genetic text. Frequency/Correlation...