Thorfinn Korneliussen

Learn More
We present a statistical framework for estimation and application of sample allele frequency spectra from New-Generation Sequencing (NGS) data. In this method, we first estimate the allele frequency spectrum using maximum likelihood. In contrast to previous methods, the likelihood function is calculated using a dynamic programming algorithm and numerically(More)
High-throughput DNA sequencing technologies are generating vast amounts of data. Fast, flexible and memory efficient implementations are needed in order to facilitate analyses of thousands of samples simultaneously. We present a multithreaded program suite called ANGSD. This program can calculate various summary statistics, and perform association mapping(More)
Targeted capture combined with massively parallel exome sequencing is a promising approach to identify genetic variants implicated in human traits. We report exome sequencing of 200 individuals from Denmark with targeted capture of 18,654 coding genes and sequence coverage of each individual exome at an average depth of 12-fold. On average, about 95% of the(More)
The rich fossil record of equids has made them a model for evolutionary processes. Here we present a 1.12-times coverage draft genome from a horse bone recovered from permafrost dated to approximately 560-780 thousand years before present (kyr BP). Our data represent the oldest full genome sequence determined so far by almost an order of magnitude. For(More)
BACKGROUND Estimation of allele frequency is of fundamental importance in population genetic analyses and in association mapping. In most studies using next-generation sequencing, a cost effective approach is to use medium or low-coverage data (e.g., < 15X). However, SNP calling and allele frequency estimation in such studies is associated with substantial(More)
A number of different statistics are used for detecting natural selection using DNA sequencing data, including statistics that are summaries of the frequency spectrum, such as Tajima’s D. These statistics are now often being applied in the analysis of Next Generation Sequencing (NGS) data. However, estimates of frequency spectra from NGS data are strongly(More)
A major question in evolutionary biology is how natural selection has shaped patterns of genetic variation across the human genome. Previous work has documented a reduction in genetic diversity in regions of the genome with low recombination rates. However, it is unclear whether other summaries of genetic variation, like allele frequencies, are also(More)
Clovis, with its distinctive biface, blade and osseous technologies, is the oldest widespread archaeological complex defined in North America, dating from 11,100 to 10,700 (14)C years before present (bp) (13,000 to 12,600 calendar years bp). Nearly 50 years of archaeological research point to the Clovis complex as having developed south of the North(More)
Estimates of relatedness have several applications such as the identification of relatives or in identifying disease related genes through identity by descent (IBD) mapping. Here we present a new method for identifying IBD tracts among individuals from genome-wide single nucleotide polymorphisms data. We use a continuous time Markov model where the hidden(More)
We present a new haplotype-based statistic (nSL) for detecting both soft and hard sweeps in population genomic data from a single population. We compare our new method with classic single-population haplotype and site frequency spectrum (SFS)-based methods and show that it is more robust, particularly to recombination rate variation. However, all statistics(More)