Multiscale methods are becoming increasingly promising as a way to characterize the dynamics of large protein systems on biologically relevant time-scales. The underlying assumption in multiscale simulations is that it is possible to move reliably between different resolutions. We present a method that efficiently generates realistic all-atom protein...
MOTIVATION: Finding novel or non-standard metabolic pathways, possibly spanning multiple species, has important applications in fields such as metabolic engineering, metabolic network analysis and metabolic network reconstruction. Traditionally, this has been a manual process, but the large volume of metabolic data now available has created a need for...
The virulence of Mycobacterium tuberculosis depends on the ability of the bacilli to switch between replicative (growth) and non-replicative (dormancy) states in response to host immunity. However, the gene regulatory events associated with transition to dormancy are largely unknown. To address this question, we have assembled the largest M. tuberculosis...
Any given Web search engine may provide higher quality results than others for certain queries. Therefore, it is in users' best interest to utilize multiple search engines. In this paper, we propose and evaluate a framework that maximizes users' search effectiveness by directing them to the engine that yields the best results for the current query. In...
Systems biology is a broad field that incorporates both computational and experimental approaches to provide a system-level understanding of biological function. Initial forays into computational systems biology have focused on a variety of biological networks such as protein–protein interaction, signaling, transcription and metabolic networks. In this...
BACKGROUND: As large genomics and phenotypic datasets are becoming more common, it is increasingly difficult for most researchers to access, manage, and analyze them. One possible approach is to provide the research community with several petabyte-scale cloud-based computing platforms containing these data, along with tools and resources to analyze them.
In this paper we describe the design and implementation of the Open Science Data Cloud, or OSDC. The goal of the OSDC is to provide petabyte-scale data cloud infrastructure and related services for scientists working with large quantities of data. Currently, the OSDC consists of more than 2000 cores and 2 PB of storage distributed across four data centers...
Hadoop has emerged as an important platform for data-intensive computing. The shuffle and sort phases of a MapReduce computation often saturate top-of-rack switches, as well as switches that aggregate multiple racks. In addition, MapReduce computations often have "hot spots" in which the computation is lengthened due to inadequate bandwidth to some of...
The activity of most drugs is regulated by the binding of one molecule (the ligand) to a pocket of another, usually larger, molecule, which is commonly a protein. This report describes a new approach to creating low-energy structures of flexible proteins to which ligands can be docked. The flexibility of molecules is encoded with thousands of parameters...