For the past year, we have been assembling requirements from a collection of scientific data base users from astronomy, particle physics, fusion, remote sensing, oceanography, and biology. The intent has been to specify a common set of requirements for a new science data base system, which we call SciDB. In addition, we have discovered that very complex ...
In CIDR 2009, we presented a collection of requirements for SciDB, a DBMS that would meet the needs of scientific users. These included a nested-array data model, science-specific operations such as regrid, and support for uncertainty, lineage, and named versions. In this paper, we present an overview of SciDB’s key features and outline a demonstration of ...
Industrial and scientific datasets have been growing enormously in size and complexity in recent years. The largest transactional databases and data warehouses can no longer be hosted cost-effectively in off-the-shelf commercial database management systems. There are other forums for discussing databases and data warehouses, but they typically deal with ...
The 3.2-gigapixel LSST camera will produce approximately half a petabyte of archive images every month. These data need to be reduced in under a minute to produce real-time transient alerts, and then added to the cumulative catalog for further analysis. The catalog is expected to grow by about three hundred terabytes per year. The data volume, the real-time ...
Developers and users of high-performance distributed systems often observe performance problems, the reasons for which are rarely obvious. Bottlenecks can occur in any of the components along the paths through which the data flows: the applications, the operating systems, the hosts, or the network. We have developed a methodology, known as NetLogger, for ...
The LSST project will provide public access to a database catalog that, in its final year, is estimated to include 26 billion stars and galaxies in dozens of trillions of detections, occupying multiple petabytes. Because we are not aware of an existing open-source database implementation that has been demonstrated to efficiently satisfy astronomers' spatial ...
The BABAR database, based upon the Objectivity OO database management system, has been in production since early 1999. It has met its initial design requirements, which were to accommodate a 100 Hz event rate from the experiment at a scale of 200 TB per year. However, with increased luminosity and changes in the physics requirements, these requirements have ...
The Large Synoptic Survey Telescope (LSST) will catalog billions of astronomical objects and trillions of sources, all of which will be stored and managed by a database management system. One of the main challenges is real-time alert generation. To generate alerts, up to 100K new difference detections have to be cross-correlated with the huge historical ...
The amount of data collected and stored by the average business doubles each year. Many commercial databases are already approaching hundreds of terabytes, and at this rate, will soon be managing petabytes. More data enables new functionality and capability, but the larger scale reveals new problems and issues hidden in “smaller” terascale environments. ...