The amount of textual information published on the Internet is estimated at billions of web pages, blog posts, comments, social media updates and other items. Analyzing such quantities of data requires a high level of distribution of both data and computation. This is especially true for the complex algorithms often used in text mining tasks. The paper…
The paper presents a graph-based approach to modeling document contents, driven by shallow semantic analysis. It makes it possible to extract additional information about the meaning of the text, which results in improved document classification. Its performance is compared against the “legacy” bag-of-words and Schenker et al. approaches with k-NN classification…
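For readers unfamiliar with the baseline mentioned above, the following is a minimal sketch of bag-of-words k-NN classification. It is not the paper's implementation and does not cover the graph-based model or the Schenker et al. approach. The function names, the whitespace tokenizer, the choice of cosine similarity, and the toy training documents are all assumptions made for illustration.

```python
from collections import Counter
import math


def bag_of_words(text):
    """Map a document to a term-frequency vector (a Counter of lowercase tokens)."""
    # Assumption: naive whitespace tokenization, no stemming or stop-word removal.
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(query, training, k=3):
    """Label `query` by majority vote among its k most similar training documents."""
    query_vec = bag_of_words(query)
    scored = sorted(
        ((cosine_similarity(query_vec, bag_of_words(text)), label)
         for text, label in training),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy corpus, invented for this example only.
training = [
    ("the match ended with a late goal", "sports"),
    ("the team won the league match", "sports"),
    ("the striker scored a goal", "sports"),
    ("the bank raised interest rates", "finance"),
    ("stock markets fell as rates rose", "finance"),
    ("the bank reported strong profit", "finance"),
]

print(knn_classify("a goal in the final match", training, k=3))  # prints "sports"
```

Because a bag-of-words vector discards word order and relations between terms, two documents with the same vocabulary but different meaning look identical to this baseline, which is the kind of limitation a graph-based representation aims to address.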
An agent-based framework dedicated to acquiring and processing heterogeneous data collected from various Internet sources is presented. It is built upon Age, a hierarchical, distributed computation system that has already been used successfully for various optimization and classification tasks.