Inferring the Most Likely Geographical Origin of mtDNA Sequence Profiles

@article{Egeland2004InferringTM,
  title={Inferring the Most Likely Geographical Origin of mtDNA Sequence Profiles},
  author={Thore Egeland and Hege M. B{\o}velstad and Geir Storvik and Antonio Salas},
  journal={Annals of Human Genetics},
  year={2004},
  volume={68}
}
In a number of practical cases it is important to determine the likely geographical origin of an individual or a biological sample. A dead body, old bones or a sample of semen may be available. Information on where the sample might come from can assist investigation or research. The first part of this paper is independent of specific data structure. We formulate the problem as a classification problem. Bayes' theorem allows different sources of information or data to be reconciled conveniently… 
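The classification view in the abstract can be illustrated with a minimal sketch of Bayes' theorem applied to candidate populations. This is not the authors' model: the function name `posterior_origin`, the pseudocount smoothing, and the toy reference data are all assumptions made for illustration. Given a prior over populations and a reference sample of haplotypes from each, it returns the posterior probability of each population for an observed profile.

```python
from collections import Counter


def posterior_origin(profile, reference, prior=None, pseudocount=1.0):
    """Posterior P(population | profile) by Bayes' theorem.

    `reference` maps a population name to a list of observed haplotype
    labels (hypothetical data). Likelihoods are smoothed relative
    frequencies, an assumption chosen so unseen haplotypes do not get
    probability zero.
    """
    populations = list(reference)
    if prior is None:
        # Flat prior when no other information is available.
        prior = {pop: 1.0 / len(populations) for pop in populations}

    # Distinct haplotypes seen anywhere, plus one slot for "unseen".
    n_types = len({h for haps in reference.values() for h in haps}) + 1

    unnormalised = {}
    for pop in populations:
        counts = Counter(reference[pop])
        likelihood = (counts[profile] + pseudocount) / (
            len(reference[pop]) + pseudocount * n_types
        )
        unnormalised[pop] = prior[pop] * likelihood

    total = sum(unnormalised.values())
    return {pop: value / total for pop, value in unnormalised.items()}


# Toy reference databases (made up for this example).
reference = {
    "A": ["h1", "h1", "h2", "h3"],
    "B": ["h2", "h2", "h2", "h4"],
}
post_flat = posterior_origin("h2", reference)
post_informed = posterior_origin("h2", reference, prior={"A": 0.8, "B": 0.2})
```

With a flat prior the profile `h2` favours population B, while a prior that strongly favours A (for example from non-genetic case information) shifts the posterior towards A. This is the sense in which different sources of information can be combined.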
Inferring ethnicity from mitochondrial DNA sequence
TLDR
Support vector machines can be used to infer coarse ethnicity from a small region of mitochondrial DNA sequence with surprisingly high accuracy and are likely to also be useful in other DNA sequence classification applications.
Estimating Haplotype Frequency and Coverage of Databases
TLDR
This paper proposes different approaches to the problem based on classical methods as well as new applications of Principal Component Analysis (PCA) and discusses previous proposals based on saturation curves.
The application of machine learning to predict genetic relatedness using human mtDNA hypervariable region I sequences
TLDR
The ability of ML algorithms to predict genetic relatedness using hypervariable region I sequences retrieved from the GenBank database for three race groups, namely African, Asian and Caucasian is investigated, providing evidence that ML can be utilized as a supplementary tool for forensic genetics casework analysis.
Statistical Evaluation of Haploid Genetic Evidence
TLDR
The bootstrap method is chosen, which assumes access to the relevant databases sampled from populations that the perpetrator may conceivably come from, and the ability of the proposed method to account for population stratification when computing the LR is discussed.
A Statistical Framework for the Interpretation of mtDNA Mixtures: Forensic and Medical Applications
TLDR
It is argued that clinical and forensic scientists should give greater consideration to mtDNA for mixture interpretation and show that the analysis of mtDNA mixtures contributes substantially to forensic casework and may also clarify erroneous claims made in clinical genetics regarding tumorigenesis.
Fine-Scale Estimation of Location of Birth from Genome-Wide Single-Nucleotide Polymorphism Data
TLDR
This work uses data from the Northern Finland Birth Cohort 1966 to investigate the characteristics of genetic structure within a population and develops a method for inferring location to a finer scale, which is potentially valuable in population genetics and forensics.
Population inference based on mitochondrial DNA control region data by the nearest neighbors algorithm
TLDR
The KNN algorithm and the K-weighted-nearest neighbors (KWNN) algorithm, weighted by genetic distance, are used to classify individuals into continental populations and into subpopulations within the same continent.

References

Inference of population structure using multilocus genotype data.
TLDR
A model-based clustering method for using multilocus genotype data to infer population structure and assign individuals to populations that can be applied to most of the commonly used genetic markers, provided that they are not closely linked.
An annotated mtDNA database
TLDR
A geographic information system which searches for closest matches to a given mtDNA control region sequence and displays them on a geographic map is implemented and it is suggested that the geographic area with the highest frequency of closely related mtDNA sequence types may be used to define a reference population.
The fingerprint of phantom mutations in mitochondrial DNA data.
TLDR
To identify the telltale patterns of a particular phantom mutation process, one first filters out the well-established frequent mutations (inferred from various data sets with additional coding region information), to avoid errors that otherwise would go into print and could lead to erroneous evolutionary interpretations.
Tracing European founder lineages in the Near Eastern mtDNA pool.
mtDNA and the origin of the Icelanders: deciphering signals of recent population history.
TLDR
The findings indicate that European populations contain a large number of closely related mitochondrial lineages, many of which have not yet been sampled in the current comparative data set, and substantial increases in sample sizes will be needed to obtain valid estimates of the diverse ancestral mixtures that ultimately gave rise to contemporary populations.
Patterns of human diversity, within and among continents, inferred from biallelic DNA polymorphisms.
TLDR
There is little evidence, if any, of a clear subdivision of humans into biologically defined groups at random biallelic loci, according to a range of statistical methods.
Informativeness of genetic markers for inference of ancestry.
TLDR
In a worldwide human microsatellite data set, a general measure, the informativeness for assignment (I_n), is introduced, applicable to any number of potential source populations, for determining the amount of information that multiallelic markers provide about individual ancestry.
Assessing ethnicity from human mitochondrial DNA types determined by hybridization with sequence-specific oligonucleotides.
TLDR
A logistic regression model was developed to predict ethnic group from mitochondrial DNA types determined by hybridization with sequence-specific oligonucleotide probes of the two hypervariable segments of the mtDNA control region and correctly predicted the ethnic group of 65.3% of the overall sample; however, the success rate varied substantially among ethnic groups.
Inferring ethnic origin by means of an STR profile.
The results of an mtDNA study of 1200 inhabitants of a German village in comparison to other Caucasian databases and its relevance for forensic casework
TLDR
The number of different haplotypes and the haplotype diversity were calculated for four short amplicons of HV1 in order to establish the most variable section with a high efficiency for forensic casework.