AlphaSort: a RISC machine sort
TLDR
A new sort algorithm, called AlphaSort, demonstrates that commodity processors and disks can handle commercial batch workloads. The paper also proposes two new benchmarks: MinuteSort (how much can you sort in a minute) and DollarSort (how much can you sort for a dollar).
Microsoft TerraServer: a spatial data warehouse
TLDR
Terabytes of "Internet unfriendly" geospatial images were scrubbed and edited into hundreds of millions of "Internet friendly" image tiles and loaded into a SQL data warehouse, demonstrating that general-purpose relational database technology can manage large-scale image repositories and that web browsers can be a good geospatial image presentation system.
Alphasort: A cache-sensitive parallel external sort
TLDR
A new sort algorithm, called AlphaSort, demonstrates that commodity processors and disks can handle commercial batch workloads and argues that modern architectures require algorithm designers to re-examine their use of the memory hierarchy.
TeraScale SneakerNet: Using Inexpensive Disks for Backup, Archiving, and Data Exchange
TLDR
This article describes how the Sloan Digital Sky Survey ships terabyte-scale datasets both within the US and to Europe and Asia, and discusses some of the software issues that doing so raises.
Loading databases using dataflow parallelism
TLDR
This paper describes the optimizer's cost-based hierarchical optimization strategy in some detail; preliminary measurements indicate that this design will give excellent scaleups.
TerraService.NET: An Introduction to Web Services
TLDR
The article presents the design of two USDA applications that interoperate with database and web service resources in Fort Collins, Colorado, and with the TerraService web service located in Tukwila, Washington.
Designing and Building TerraService
A few simple rules guide the design of Web services such as TerraService, a geospatial service added to Microsoft's popular TerraServer database. By sticking to standards-based tools, the authors
TerraServer Bricks - A High Availability Cluster Alternative
TLDR
The hardware and software components of the TerraServer Bricks are described, along with the experience of configuring and operating this environment for its first year. The aim is to operate the popular TerraServer web site with the same or higher availability than the TerraServer SAN at a fraction of the system and operations cost.
...
...