I-Jen Chiang

This paper presents a novel approach to document clustering based on a geometric structure from combinatorial topology. Given a set of documents, the associations among terms that frequently co-occur in those documents naturally form a simplicial complex. Our general thesis is that each connected component of this simplicial complex represents a concept in the ...
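The idea in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, whose details are not given here. It assumes whitespace tokenization, a hypothetical support threshold `min_support`, and a hypothetical cap `max_dim` on simplex dimension. The function names `frequent_simplices` and `connected_components` are likewise made up for illustration. Term sets that co-occur in at least `min_support` documents are treated as simplices, and a union-find pass groups terms into connected components, each standing in for one candidate concept.

```python
from collections import Counter
from itertools import combinations


def frequent_simplices(docs, min_support=2, max_dim=2):
    """Return term tuples that co-occur in at least min_support documents.

    A tuple of k+1 terms is treated as a k-simplex. Support is monotone,
    so every face of a frequent simplex is also frequent, which is what
    makes the result a simplicial complex. All parameters are assumptions.
    """
    counts = Counter()
    for doc in docs:
        terms = sorted(set(doc.lower().split()))  # naive tokenization
        for size in range(1, max_dim + 2):
            for combo in combinations(terms, size):
                counts[combo] += 1
    return {s for s, c in counts.items() if c >= min_support}


def connected_components(simplices):
    """Group terms into components: terms sharing a simplex are merged."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for simplex in simplices:
        root = find(simplex[0])
        for term in simplex[1:]:
            parent[find(term)] = root

    groups = {}
    for term in parent:
        groups.setdefault(find(term), set()).add(term)
    return sorted(groups.values(), key=lambda g: sorted(g))
```

Documents could then be assigned to whichever component contains most of their frequent terms, although that assignment rule is also an assumption rather than something stated in the abstract.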
Overly generalized predictions are a serious problem in concept classification. In particular, the boundaries among classes are not always clearly defined. For example, there are usually uncertainties in diagnoses based on data from biochemical laboratory examinations. Such uncertainties make prediction more difficult than it is with noise-free data. To avoid such ...