For 10,000 years, pigs and humans have shared a close and complex relationship. From domestication to modern breeding practices, humans have shaped the genomes of domestic pigs. Here we present the assembly and analysis of the genome sequence of a female domestic Duroc pig (Sus scrofa) and a comparison with the genomes of wild and domestic pigs from Europe…
Data-intensive flow computing allows efficient processing of large volumes of data that would otherwise be unapproachable. This paper introduces a new semantic-driven data-intensive flow infrastructure which: (1) provides a robust and transparent scalable solution from a laptop to large-scale clusters, (2) creates a unified solution for batch and interactive tasks in…
The genome organizations of eight phylogenetically distinct species from five mammalian orders were compared in order to address fundamental questions relating to mammalian chromosomal evolution. Rates of chromosome evolution within mammalian orders were found to increase since the Cretaceous-Tertiary boundary. Nearly 20% of chromosome breakpoint regions…
This paper addresses the problem of making text mining results more comprehensible to humanities scholars, journalists, intelligence analysts, and other researchers, in order to support the analysis of text collections. Our system, FeatureLens, visualizes a text collection at several levels of granularity and enables users to explore interesting…
Real-time surveillance systems, network and telecommunication systems, and other dynamic processes often generate a tremendous (potentially infinite) volume of stream data. Effective analysis of such stream data poses great challenges to database and data mining researchers, due to its unique features, such as single-scan algorithm, multi-dimensional online…
This paper describes a system to support humanities scholars in their interpretation of literary work. It presents a user interface and web architecture that integrates text mining, a graphical user interface, and visualization, while attempting to remain easy to use by non-specialists. Users can interactively read and rate documents found in a digital…
The Tibetan antelope (Pantholops hodgsonii) is endemic to the extremely inhospitable high-altitude environment of the Qinghai-Tibetan Plateau, a region that has a low partial pressure of oxygen and high ultraviolet radiation. Here we generate a draft genome of this artiodactyl and use it to detect the potential genetic bases of highland adaptation. Compared…
PURPOSE: Looping patterns rich in laminin are present in tissue samples of primary aggressive human uveal melanomas and their metastases. Because these extravascular patterns connect to blood vessels and transmit fluid in vitro and in vivo, the three-dimensional configuration of these patterns has been the subject of considerable speculation. In the current…
To mine large digital libraries in humanistically meaningful ways, we need to divide them by genre. This is a task that classification algorithms are well suited to assist, but they need adjustment to address the specific challenges of this domain. Digital libraries pose two problems of scale not usually found in the article datasets used to test these…
Mass digitization of historical documents is a challenging problem for optical character recognition (OCR) tools. Issues include noisy backgrounds and faded text due to aging, border/marginal noise, bleed-through, skewing, warping, as well as irregular fonts and page layouts. As a result, OCR tools often produce a large number of spurious bounding boxes…