  • Helen Skaletsky, Tomoko Kuroda-Kawaguchi, Patrick J Minx, Holland S Cordum, LaDeana Hillier, Laura G Brown +34 others
  • 2003
The male-specific region of the Y chromosome, the MSY, differentiates the sexes and comprises 95% of the chromosome's length. Here, we report that the MSY is a mosaic of heterochromatic sequences and three classes of euchromatic sequences: X-transposed, X-degenerate and ampliconic. These classes contain all 156 known transcription units, which include 78 …
Newly emerged event-based online social services, such as Meetup and Plancast, have experienced increased popularity and rapid growth. From these services, we observed a new type of social network, the event-based social network (EBSN). An EBSN does not only contain online social interactions as in other conventional online social networks, but also …
To meet the challenge of processing rapidly growing graph and network data created by modern applications, a number of distributed graph processing systems have emerged, such as Pregel and GraphLab. All these systems divide input graphs into partitions, and employ a "think like a vertex" programming model to support iterative graph computation. This …
The Starburst project, at IBM's Almaden Research Center, is improving the design of relational database management systems and enhancing their performance, while building an extensible system to better support nontraditional applications (such as engineering, geographic, office, etc.), and to serve as a testbed for future improvements in database …
Many modern enterprises are collecting data at the most detailed level possible, creating data repositories ranging from terabytes to petabytes in size. The ability to apply sophisticated statistical analysis methods to this data is becoming essential for marketplace competitiveness. This need to perform deep analysis over huge data repositories poses a …
Hadoop has become an attractive platform for large-scale data analytics. In this paper, we identify a major performance bottleneck of Hadoop: its lack of ability to colocate related data on the same set of nodes. To overcome this bottleneck, we introduce CoHadoop, a lightweight extension of Hadoop that allows applications to control where data are stored. …
A database management system architecture is described that facilitates the implementation of data management extensions for relational database systems. The architecture defines two classes of data management extensions: alternative ways of storing relations called relation "storage methods", and access paths, integrity constraints, or triggers …
Ion channel mutations are an important cause of rare Mendelian disorders affecting brain, heart, and other tissues. We performed parallel exome sequencing of 237 channel genes in a well-characterized human sample, comparing variant profiles of unaffected individuals to those with the most common neuronal excitability disorder, sporadic idiopathic epilepsy. …
Modern document collections often contain groups of documents with overlapping or shared content. However, most information retrieval systems process each document separately, causing shared content to be indexed multiple times. In this paper, we describe a new document representation model where related documents are organized as a tree, allowing shared …