In this article we present the application of transformation-based learning (TBL) [1] to the task of assigning tags to postings in online chat conversations. We define a list of posting tags that have proven useful in chat-conversation analysis. We describe the templates used for posting act tagging in the context of template selection. We extend…
The ephemeral nature of human communication via networks today poses interesting and challenging problems for information technologists. The Intelink intelligence network, for example, has a need to monitor chat-room conversations to ensure the integrity of sensitive data being transmitted via the network. However, the sheer volume of communication in…
In this article we present a semi-supervised active learning algorithm for pattern discovery in information extraction from textual data. The patterns are reduced regular expressions composed of various characteristics of features useful in information extraction. Our major contribution is a semi-supervised learning algorithm that extracts information from…
The burgeoning amount of textual data in distributed sources combined with the obstacles involved in creating and maintaining central repositories motivates the need for effective distributed information extraction and mining techniques. Recently, as the need to mine patterns across distributed databases has grown, Distributed Association Rule Mining…
The ephemeral nature of human communication via networks today poses interesting and challenging problems for information technologists. The sheer volume of communication in venues such as email, newsgroups, and chat precludes manual techniques of information management. Currently, no systematic mechanisms exist for accumulating these artifacts of…
In this article we present a semi-supervised algorithm for pattern discovery in information extraction from textual data. The patterns that are discovered take the form of regular expressions that generate regular languages. We term our approach ‘semi-supervised’ because it requires significantly less effort to develop a training set than other approaches.
In this article we present a supervised learning algorithm for the discovery of finite state automata in the form of regular expressions in textual data. The automata generate languages that consist of various representations of features useful in information extraction. We have successfully applied this learning technique in the extraction of textual…
The burgeoning amount of textual data in distributed sources combined with the obstacles involved in creating and maintaining central repositories motivates the need for effective distributed information extraction and mining techniques. Recently, as the need to mine patterns across distributed databases has grown, Distributed Association Rule Mining…