Remarkable Interkingdom Conservation of Intron Positions and Massive, Lineage-Specific Intron Loss and Gain in Eukaryotic Evolution
Sequencing of eukaryotic genomes allows one to address major evolutionary problems, such as the evolution of gene structure. We compared the intron positions in 684 orthologous gene sets from 8 …
  • 369
  • 33
Algorithms for computing parsimonious evolutionary scenarios for genome evolution, the last universal common ancestor and dominance of horizontal gene transfer in the evolution of prokaryotes
Background: Comparative analysis of sequenced genomes reveals numerous instances of apparent horizontal gene transfer (HGT), at least in prokaryotes, and indicates that lineage-specific gene loss might …
  • 357
  • 22
Ancestral paralogs and pseudoparalogs and their role in the emergence of the eukaryotic cell
Gene duplication is a crucial mechanism of evolutionary innovation. A substantial fraction of eukaryotic genomes consists of paralogous gene families. We assess the extent of ancestral paralogy, …
  • 153
  • 11
Intelligent Choice of the Number of Clusters in K-Means Clustering: An Experimental Study with Different Cluster Spreads
The issue of determining “the right number of clusters” in K-Means has attracted considerable interest, especially in the recent years. Cluster intermix appears to be a factor most affecting the …
  • 208
  • 6
Additive clustering and qualitative factor analysis methods for similarity matrices
We review methods of qualitative factor analysis (QFA) developed by the author and his collaborators over the last decade and discuss the use of QFA methods for the additive clustering problem. The …
  • 19
  • 6
Algorithms for additive clustering of rectangular data tables
The overlapping additive clustering model or principal cluster model is a model for two-way two-mode object by variable data that implies an overlapping clustering of the objects and a set of …
  • 20
  • 4
Reinterpreting the Category Utility Function
  • B. Mirkin
  • Computer Science, Mathematics
  • Machine Learning
  • 18 October 2001
The category utility function is a partition quality scoring function applied in some clustering programs of machine learning. We reinterpret this function in terms of the data variance explained by …
  • 76
  • 3
Triadic Formal Concept Analysis and triclustering: searching for optimal patterns
This paper presents several definitions of “optimal patterns” in triadic data and results of experimental comparison of five triclustering algorithms on real-world and synthetic datasets. The …
  • 48
  • 3
Choosing the number of clusters
  • B. Mirkin
  • Computer Science
  • Wiley Interdiscip. Rev. Data Min. Knowl. Discov.
  • 1 May 2011
The issue of determining ‘the right number of clusters’ is attracting ever growing interest. The paper reviews published work on the issue with respect to mixture of distributions, partition, …
  • 56
  • 3