The new arboretum of Indo-European “trees”. Can new algorithms reveal the phylogeny and even prehistory of Indo-European?

  • Hans J. Holm
  • Published 1 August 2007
  • Geography
  • Journal of Quantitative Linguistics, pages 167-214
Abstract: Specialization in linguistics vs. biological informatics leads to widespread misunderstandings and false results, caused by poor knowledge of the essential conditions of the respective methods and data. These conditions are analyzed, and the insights are used to assess the recent glut of attempts to employ methods from biological informatics in establishing new phylogenies of Indo-European languages.
Tutorial on Computational Linguistic Phylogeny
This tutorial surveys the different methods and different types of linguistic data that have been used to estimate phylogenies, explains the scientific and mathematical foundations of phylogenetic estimation, and presents methodologies for evaluating a phylogeny estimation method.
Using ancestral state reconstruction methods for onomasiological reconstruction in multilingual word lists
A pilot study exploring how well automatic methods for ancestral state reconstruction perform in the task of onomasiological reconstruction in multilingual word lists finds that Maximum Likelihood largely outperforms the other methods, but the general performance was disappointingly low.
Detecting non-tree-like signal using multiple tree topologies
It is shown that the multiple topologies method is a useful tool to study the dynamics of language evolution and suggested that reticulation may arise from a number of processes, including dialect chain break-up, borrowing, and characteristics of lexical datasets.
Austronesian language phylogenies: myths and misconceptions about Bayesian computational methods
Phylogenetic analyses of structural features have revealed historical signals in Papuan and reflected a settlement pattern through Island South-East Asia, New Guinea and then into Oceania, consistent with the ‘Out of Taiwan’ scenario.
Annotating Cognates in Phylogenetic Studies of South-East Asian Languages
Compounding and derivation are frequent in many language families. As a consequence, words in different languages are often only partially cognate, sharing only a few but not all morphemes.
The Distribution of Data in Word Lists and its Impact on the Subgrouping of Languages
This work reveals the reason for the bias in the separation levels computed for natural languages with only a small amount of residues. It shows that the Anatolian languages were not the first to split off, and thereby refutes the Indo-Hittite hypothesis.
Challenges of annotation and analysis in computer-assisted language comparison: A case study on Burmish languages
This paper illustrates specific challenges which both computational and classical approaches encounter when studying South-East Asian languages and points to the challenges resulting from missing annotation standards and insufficient methods for analysis within a computer-assisted framework.
Networks of lexical borrowing and lateral gene transfer in language and genome evolution
Network approaches that were originally designed to study lateral gene transfer may provide more realistic insights into the complexities of language evolution.
Beautiful Trees on Unstable Ground
The main problems of lexicostatistics and glottochronology, the translation of basic concepts into individual languages and the execution of cognate judgments, are still so grave that no reliable results can be drawn from these methods.


Indo‐European and Computational Cladistics
An attempt to recover the first-order subgrouping of the Indo-European family using a new computational method devised by the authors and based on a ‘perfect phylogeny’ algorithm is reported.
A comparison of phylogenetic reconstruction methods on an Indo‐European dataset
Researchers interested in the history of the Indo-European family of languages have used a variety of methods to estimate the phylogeny of the family, and have obtained widely differing results.
Language-tree divergence times support the Anatolian theory of Indo-European origin
An analysis of a matrix of 87 languages with 2,449 lexical items produced an estimated age range for the initial Indo-European divergence of between 7,800 and 9,800 years BP, in striking agreement with the Anatolian hypothesis.
Genealogy of the Main Indo-European Branches Applying the Separation Base Method
The customary black and white hypotheses, e.g., pro or contra an Italo-Celtic relationship, cannot do justice to the real developments and must give way to this more differentiated overall view.
Toward a phylogenetic chronology of ancient Gaulish, Celtic, and Indo-European
  • P. Forster, A. Toth
  • Biology
  • Proceedings of the National Academy of Sciences of the United States of America
  • 2003
The phylogenetic network reveals an early split of Celtic within Indo-European, and suggests that the Celtic language arrived in the British Isles as a single wave (and then differentiated locally), rather than in the traditional two-wave scenario.
Cladistic analysis of languages: Indo‐European classification based on lexicostatistical data
The results suggest a predominantly branching pattern of the basic vocabulary phylogeny and little borrowing of individual words.
Bayesian Inference of Phylogeny and Its Impact on Evolutionary Biology
Bayesian inference of phylogeny brings a new perspective to a number of outstanding issues in evolutionary biology, including the analysis of large phylogenetic trees and complex evolutionary models and the detection of the footprint of natural selection in DNA sequences.
This paper shows how suitable evolutionary models can be constructed and applied objectively and how the type of data will affect both the method of treatment and the validity of the results.