Real Time Micro-blog Summarization Based on Hadoop/HBase

Abstract

A micro-blog is a communication medium that lets users exchange short posts. Micro-blogs have attracted much interest as a social medium for spreading information broadly, since content can be delivered in real time. However, because posts are sorted by time rather than relevance, users must read through them manually to understand a specific topic. In this paper, we present a real-time application that summarizes posts by relevance while taking into account the time at which they were written. We set up a Hadoop environment with HBase because the application needs to be scalable and fault-tolerant. The summaries the application produces are evaluated with the ROUGE metric, a well-known summary evaluation method. The evaluation results indicate that the summaries produced by the application score better than summaries generated by a traditional summarization method.
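For readers unfamiliar with ROUGE, the following is a minimal sketch of ROUGE-N recall, the fraction of reference n-grams that also appear in a candidate summary. It is an illustration only, not the evaluation code used in the paper: the function names `ngrams` and `rouge_n_recall`, the whitespace tokenization, and the lowercasing are all assumptions made for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: clipped n-gram overlap divided by reference n-gram count.

    Assumption: plain whitespace tokenization and lowercasing; real
    evaluations typically also apply stemming and stop-word options.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
print(rouge_n_recall(candidate, reference, n=1))
print(rouge_n_recall(candidate, reference, n=2))
```

Higher recall means the candidate summary covers more of the reference summary's wording; comparing such scores across systems is the kind of evaluation the abstract describes.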

DOI: 10.1109/WI-IAT.2013.148


4 Figures and Tables

Cite this paper

@article{Lee2013RealTM, title={Real Time Micro-blog Summarization Based on Hadoop/HBase}, author={Sanghoon Lee and Sunny Shakya and Rajshekhar Sunderraman and Saeid Belkasim}, journal={2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT)}, year={2013}, volume={3}, pages={46-49} }