The paper provides theoretical justification for the "3-periodicity property" observed in protein coding regions within genomic DNA sequences. We propose a new classification criterion that improves upon traditional frequency-based approaches for identifying coding regions. Experimental studies indicate superior performance compared with other algorithms …
While the well-known Transmission Control Protocol (TCP) is a de facto standard for reliable communication on the Internet and performs well in practice, the question "how good is the TCP/IP congestion control algorithm?" is not completely resolved. In this paper, we provide some answers to this question using the competitive analysis framework. First, …
Flow cytometry (FC) is a powerful technology for rapid multivariate analysis and functional discrimination of cells. Current FC platforms generate large, high-dimensional datasets, which pose a significant challenge for traditional manual bivariate analysis. Automated multivariate clustering, though highly desirable, is also stymied by the critical …
Existing discrete Fourier transform (DFT)-based algorithms for identifying protein coding regions in DNA sequences [9, 2, 3, 7] exploit the empirical observation that the spectrum of a protein coding region of length N nucleotides has a peak at frequency k = N/3. In this paper, we prove this and several other empirical observations attributed …
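The peak at k = N/3 can be illustrated with a minimal sketch. It assumes the common formulation in which each base gets a 0/1 indicator sequence and the spectral content at index k is the sum of the squared DFT magnitudes of the four indicators; the abstract does not spell out its exact definition, so this choice, the function names `spectral_content` and `peak_index`, and the toy sequence are all illustrative assumptions rather than the paper's method.

```python
import cmath

BASES = "ACGT"

def spectral_content(seq, k):
    """Sum over the four bases of |DFT of that base's 0/1 indicator|^2 at index k.

    Assumed definition for illustration; not taken from the paper.
    """
    n = len(seq)
    total = 0.0
    for base in BASES:
        # DFT coefficient of the indicator sequence: only positions holding
        # this base contribute a term.
        coeff = sum(
            cmath.exp(-2j * cmath.pi * k * pos / n)
            for pos, ch in enumerate(seq)
            if ch == base
        )
        total += abs(coeff) ** 2
    return total

def peak_index(seq):
    """Index k in 1..N/2 with the largest spectral content."""
    n = len(seq)
    return max(range(1, n // 2 + 1), key=lambda k: spectral_content(seq, k))

# Hypothetical toy input: a perfectly 3-periodic sequence of length N = 90.
toy = "ATG" * 30
print(peak_index(toy), len(toy) // 3)
```

A perfectly periodic toy string is an extreme case; real coding regions show only a statistical bias across the three codon positions, so the peak is weaker and is usually compared against the background spectrum.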
We present a model-based clustering method, SWIFT (Scalable Weighted Iterative Flow-clustering Technique), for digesting large, high-dimensional datasets obtained via modern flow cytometry into more compact representations that are well suited for further automated or manual analysis. Key attributes of the method include the following: (a) the analysis …
Side effect machines produce features for classifiers that distinguish different types of DNA sequences. They also have the as-yet-unexploited potential to give insight into biological features of the sequences. We introduce several innovations to the production and use of side effect machine sequence features. We compare the results of using consensus …
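How such a machine turns a sequence into features can be sketched as follows. The sketch assumes the usual formulation in which a side effect machine is a finite state machine over the DNA alphabet with a counter on each state, and the vector of visit counts is the feature vector. The abstract does not define the machine, so that reading, the function name `sem_features`, and the two-state transition table `TOY_MACHINE` are hypothetical.

```python
def sem_features(seq, transitions, n_states, start=0):
    """Run a finite state machine over seq and count visits to each state.

    transitions[state][base] gives the next state. The start state itself is
    not counted; only states entered after reading a base are (an assumption).
    """
    counts = [0] * n_states
    state = start
    for base in seq:
        state = transitions[state][base]
        counts[state] += 1
    return counts

# Hypothetical toy machine: state 1 is entered on C or G, state 0 on A or T,
# so the feature vector reduces to [AT count, GC count].
TOY_MACHINE = {
    0: {"A": 0, "C": 1, "G": 1, "T": 0},
    1: {"A": 0, "C": 1, "G": 1, "T": 0},
}

print(sem_features("ACGT", TOY_MACHINE, n_states=2))
```

The toy table is deliberately trivial so the counts are easy to check by hand; a machine whose transitions depend on the current state would capture order-dependent patterns that plain base counts miss.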
P2P networks have been proposed as a scalable, inexpensive solution to the problem of distributing multimedia content over the Internet. Since real P2P systems exhibit considerable heterogeneity in hardware, software, and network connections, the design of P2P streaming networks must factor in this variation. There are two different sources of heterogeneity …