Detecting a Tweet's Topic within a Large Number of Portuguese Twitter Trends

Abstract

This paper addresses Twitter topic detection in the presence of a large number of trending topics. We use a new technique, Twitter Topic Fuzzy Fingerprints, and compare it with two popular text classification techniques, Support Vector Machines (SVM) and k-Nearest Neighbours (kNN). Preliminary results show that it outperforms both while also being much faster, an essential property when processing large volumes of streaming data. We focus on a data set of Portuguese-language tweets and the corresponding top trends as indicated by Twitter.

1998 ACM Subject Classification: I.2.7 Natural Language Processing, H.2.8 Database Applications, I.5.4 Applications

DOI: 10.4230/OASIcs.SLATE.2014.185

Cite this paper

@inproceedings{Rosa2014DetectingAT,
  title={Detecting a Tweet's Topic within a Large Number of Portuguese Twitter Trends},
  author={Hugo Rosa and Jo{\~a}o Paulo Carvalho and Fernando Batista},
  booktitle={SLATE},
  year={2014}
}