Boris Dadachev

Unusual behaviour detection and information extraction in streams of short documents and files (emails, news, tweets, log files, messages, etc.) are important problems in security applications. In [1], [2], a new approach to rapid change detection and automatic summarization of large documents was introduced. This approach is based on a theory of social…
Automatic text segmentation, the task of breaking a text into topically consistent segments, is a fundamental problem in Natural Language Processing, Document Classification and Information Retrieval. Text segmentation can significantly improve the performance of various text mining algorithms by splitting heterogeneous documents into homogeneous…
The majority of text mining systems rely on bag-of-words approaches, which represent textual documents as multisets of their constituent words. Combined with term weighting mechanisms, this simple representation makes it possible to derive features that can be used as input by many different algorithms and for a variety of applications, including document classification,…
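As a rough illustration of the bag-of-words representation described in the last abstract, the sketch below builds a multiset of words per document and applies one common term weighting scheme, TF-IDF. The abstract does not say which weighting is used, so TF-IDF, the whitespace tokenizer, and the helper names `bag_of_words` and `tf_idf` are assumptions made for this example only.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a document as a multiset (Counter) of lowercase tokens.

    Whitespace tokenization is an assumption made for brevity.
    """
    return Counter(text.lower().split())


def tf_idf(bags):
    """Weight each term by relative frequency times log inverse document frequency.

    This is one common weighting scheme, not necessarily the one used in the paper.
    """
    n_docs = len(bags)
    doc_freq = Counter()
    for bag in bags:
        doc_freq.update(bag.keys())  # count each term once per document

    weighted = []
    for bag in bags:
        total = sum(bag.values())
        weighted.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in bag.items()
        })
    return weighted


docs = ["the cat sat on the mat", "the dog sat on the log"]
bags = [bag_of_words(d) for d in docs]
weights = tf_idf(bags)
```

Under this weighting, words that occur in every document (such as "the" here) receive a weight of zero, while words specific to one document ("cat", "dog") receive positive weights that a downstream classifier could use as features.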