Sebastian Schönherr

The MapReduce framework enables scalable processing and analysis of large datasets by distributing the computational load across connected computer nodes, referred to as a cluster. In bioinformatics, MapReduce has already been adopted for various use cases such as mapping next-generation sequencing data to a reference genome, finding SNPs from short read…
BACKGROUND Oral squamous cell carcinoma (OSCC) is mainly caused by smoking and alcohol abuse and shows a five-year survival rate of ~50%. We aimed to explore the variation of somatic mitochondrial DNA (mtDNA) mutations in primary oral tumors, recurrences and metastases. METHODS We performed an in-depth validation of mtDNA next-generation sequencing (NGS)…
BACKGROUND Genome-wide association studies (GWAS) based on single nucleotide polymorphisms (SNPs) revolutionized our perception of the genetic regulation of complex traits and diseases. Copy number variations (CNVs) promise to shed additional light on the genetic basis of monogenic as well as complex diseases and phenotypes. Indeed, the number of detected…
Myanmar is the largest country in mainland Southeast Asia with a population of 55 million people subdivided into more than 100 ethnic groups. Ruled by changing kingdoms and dynasties and lying on the trade route between India and China, Myanmar was influenced by numerous cultures. Since its independence from British occupation, tensions between the ruling…
High-throughput technologies, such as next-generation sequencing, have turned molecular biology into a data-intensive discipline, requiring bioinformaticians to use high-performance computing resources and carry out data management and analysis tasks on a large scale. Workflow systems can be useful to simplify the construction of analysis pipelines that automate…
BACKGROUND Mitochondrial DNA (mtDNA) is widely used for population genetics, forensic DNA fingerprinting and clinical disease association studies. The recent past has uncovered severe problems with mtDNA genotyping, not only due to the genotyping method itself, but mainly due to the post-lab transcription, storage and reporting of mtDNA genotypes. …
BACKGROUND High-throughput genotyping and phenotyping projects of large epidemiological study populations require sophisticated laboratory information management systems. Most epidemiological studies include subject-related personal information, which needs to be handled with care by following data privacy protection guidelines. In addition, genotyping core…
Today, the filesystems of large companies are both huge and distributed across the world. They contain huge sets of metadata but are not optimized for analyzing them. In contrast, if metadata is stored in a database system and updated synchronously, it could be analyzed and processed in a much easier and more straightforward way. Then even adding new attributes, not…