Thomas E. Potok

Fast and high-quality document clustering algorithms play an important role in effectively navigating, summarizing, and organizing information. Recent studies have shown that partitional clustering algorithms are more suitable for clustering large datasets. However, the K-means algorithm, the most commonly used partitional clustering algorithm, can only ...
In this paper, we propose a new term weighting scheme called term frequency-inverse corpus frequency (TF-ICF). It does not require term frequency information from other documents within the document collection and thus, it enables us to generate the document vectors of N streaming documents in linear time. In the context of a machine learning application, ...
Social animals or insects in nature often exhibit a form of emergent collective behavior known as flocking. In this paper, we present a novel Flocking based approach for document clustering analysis. Our Flocking clustering algorithm uses stochastic and heuristic principles discovered from observing bird flocks or fish schools. Unlike other partition ...
In the real world, we have to frequently deal with searching for and tracking an optimal solution in a dynamic environment. This demands that the algorithm not only find the optimal solution but also track the trajectory of the solution in a dynamic environment. Particle swarm optimization (PSO) is a population-based stochastic optimization technique, which ...
One of the approaches used to improve the accuracy and relevancy in information retrieval is cluster analysis. Clustering methods determine relationships among text documents, and allow the determination of similar groups or clusters of documents. These methods are computationally expensive, thereby limiting their use to a relatively small set of documents. ...
We describe a method for indexing and retrieving high-resolution image regions in large geospatial data libraries. An automated feature extraction method is used that generates a unique and specific structural description of each segment of a tessellated input image file. These tessellated regions are then merged into similar groups, or sub-regions, and ...
How to organize and classify large amounts of heterogeneous information accessible over the Internet is a major problem faced by industry, government, and military organizations. XML is clearly a potential solution to this problem [1,2]; however, a significant challenge is how to automatically convert information currently expressed in a standard HTML ...
We assess the novelty and maturity of software (SW) agent-based systems (ABS) for the Future Combat System (FCS) concept. The concept consists of troops, vehicles, communications, and weapon systems viewed as a “system of systems” [including net-centric command and control (C2) capabilities]. In contrast to a centralized, or platform-based architecture, FCS ...
As software development cycles shorten, and software markets become more competitive, improved software development productivity continues to be a major concern in the software industry. Many believe that object-oriented technology provides a breakthrough solution to this problem, but there is little quantitative evidence for this belief. Furthermore, most ...