Chidanand Apté

We describe the results of extensive experiments using optimized rule-based induction methods on large document collections. The goal of these methods is to automatically discover classification patterns that can be used for general document categorization or personalized filtering of free text. Previous reports indicate that human-engineered rule-based …
With the advent of centralized data warehouses, where data might be stored as electronic documents or as text fields in databases, text mining has increased in importance and economic value. One important goal in text mining is automatic classification of electronic documents. Computer programs scan text in a document and apply a model that assigns the …
This paper describes the use of decision tree and rule induction in data mining applications. Of methods for classification and regression that have been developed in the fields of pattern recognition, statistics, and machine learning, these are of particular interest for data mining since they utilize symbolic and interpretable representations. Symbolic …
Predictive algorithms play a crucial role in systems management by alerting the user to potential failures. We report on three case studies dealing with the prediction of failures in computer systems: (1) long-term prediction of performance variables (e.g., disk utilization), (2) short-term prediction of abnormal behavior (e.g., threshold violations), and …
Cross-channel integration and customer lifetime value modeling are two of the most important topics in customer relationship management (CRM) today. In the present paper, we describe and evaluate a novel solution that treats these two issues in a unified framework of Markov Decision Processes (MDP). In particular, we report …
Our experiments with capital markets data suggest that the domain can be effectively modeled by classification rules induced from available historical data for the purpose of making gainful predictions for equity investments. New classification techniques developed at IBM Research, including minimal rule generation (R-MINI) and contextual feature analysis, …
The UPA (Underwriting Profitability Analysis) application embodies a new approach to mining Property & Casualty (P&C) insurance policy and claims data for the purpose of constructing predictive models for insurance risks. UPA utilizes the ProbE (Probabilistic Estimation) predictive modeling data mining kernel to discover risk characterization rules by …
Fingerhut Business Intelligence (BI) has a long and successful history of building statistical models to predict consumer behavior. The models constructed are typically segmentation-based models in which the target audience is split into subpopulations (i.e., customer segments) and individually tailored statistical models are then developed for each …