Charles L. Viles

We compare the performance of two database selection algorithms reported in the literature. Their performance is compared using a common testbed designed specifically for database selection techniques. The testbed is a decomposition of the TREC/TIPSTER data into subcollections. The databases from our testbed were ranked using both the gGlOSS and CORI ...
We describe a testbed for database selection techniques and an experiment conducted using this testbed. The testbed is a decomposition of the TREC/TIPSTER data that allows analysis of the data along multiple dimensions, including collection-based and temporal-based analysis. We characterize the subcollections in this testbed in terms of number of documents, ...
This paper describes an algorithm for calculating the biovolume of cells with simple shapes, such as bacteria, flagellates, and simple ciliates, from a 2-dimensional digital image. The method can be adapted to any image analysis system which allows access to the binary cell image (i.e., the pixels, or (x,y) points, composing the cell). The cell image is ...
We find that dissemination of collection-wide information (CWI) in a distributed collection of documents is needed to achieve retrieval effectiveness comparable to a centralized collection. Complete dissemination is unnecessary. The required dissemination level depends upon how documents are allocated among sites. Low dissemination is needed for random ...
The proliferation of online information resources increases the importance of effective and efficient distributed searching. Distributed searching is cast in three parts: database selection, query processing, and results merging. In this paper we examine the effect of database selection on retrieval performance. We look at retrieval performance in ...
Using the vector space information retrieval model, we show that the update of term weights under document insertions is computationally expensive for weighting schemes that use collection statistics and normalization by document vector lengths. In the dynamic setting, we argue that strict adherence to such schemes is impractical and unnecessary as long as ...
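The cost described in this abstract can be illustrated with a small sketch. It assumes one common weighting, tf × log(1 + N/df) with cosine length normalization; this is an illustrative choice and not necessarily the scheme studied in the paper, and the function names and toy documents are hypothetical. Inserting a single document changes N and some df values, so the stored vectors of existing documents go stale.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one length-normalized tf-idf vector (as a dict) per document.

    Assumed weighting: tf * log(1 + N / df), then cosine normalization.
    """
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        weights = {t: tf[t] * math.log(1 + n / df[t]) for t in tf}
        norm = math.sqrt(sum(w * w for w in weights.values()))
        vectors.append({t: w / norm for t, w in weights.items()})
    return vectors


def count_changed(before, after, tol=1e-12):
    """Count pre-existing documents whose vector differs after an insertion."""
    changed = 0
    for old, new in zip(before, after):
        if any(abs(old[t] - new[t]) > tol for t in old):
            changed += 1
    return changed


# Toy collection (hypothetical data).
docs = [
    ["database", "selection", "testbed"],
    ["database", "retrieval", "retrieval"],
    ["cell", "image", "biovolume"],
]

before = tfidf_vectors(docs)
after = tfidf_vectors(docs + [["database", "image"]])
changed = count_changed(before, after)
print(f"{changed} of {len(docs)} existing vectors changed after one insertion")
```

In this toy collection every existing vector changes after one insertion, which is why exact maintenance of such weights means touching stored documents on each update.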
During a 90-day period in 1994, we measured the availability and connection latency of HTTP (hypertext transfer protocol) information servers. These measurements were made from a site in the Eastern United States. The list of servers included 189 servers from Europe and 324 servers from North America. Our measurements indicate that on average, 5.0 percent ...
We find that dissemination of collection-wide information (CWI) in a distributed collection of documents is needed to achieve retrieval effectiveness comparable to that of a centralized collection. Complete ...
Accurate measurement of the biomass and size distribution of picoplankton cells (0.2 to 2.0 microns) is paramount in characterizing their contribution to the oceanic food web and global biogeochemical cycling. Image-analyzed fluorescence microscopy, usually based on video camera technology, allows detailed measurements of individual cells to be taken. The ...
This paper describes the design and implementation of the Legion run-time library (LRTL), focusing specifically on facilities that enable extensibility and configurability. These facilities include management of heterogeneous communication, an event-based mechanism for intercomponent communication, and automated memory management. The paper provides several ...