• Corpus ID: 212595535

Tweet Analysis: Twitter Data processing Using Apache Hadoop

@inproceedings{Danthala2015TweetAT,
  title={Tweet Analysis: Twitter Data processing Using Apache Hadoop},
  author={Manoj Kumar Danthala},
  year={2015}
}
‘BIG DATA’ has gained much importance in different industries over the last year or two, as they now generate large amounts of data every day. Big Data is a term applied to data sets so large that traditional databases cannot process them in a reasonable amount of time. It has tremendous potential to transform business and power in several ways. Here the challenge is not only storing the data, but also accessing and analyzing the required data in…

Opinion Mining of Twitter Data using Hadoop and Apache Pig
TLDR
This paper provides an efficient mechanism for opinion mining by building an end-to-end pipeline with Apache Flume, Apache HDFS, and Apache Pig.
REAL-TIME OPINION MINING OF TWITTER DATA USING FLUME AND HADOOP
TLDR
This paper proposes Hadoop, an open-source framework used to store and process huge amounts of structured and unstructured data, and also proposes various Hadoop ecosystem tools that are used to fetch real-time tweets and analyze the data efficiently.
SOCIAL NETWORKS BASED DISEASE ANALYSIS USING HADOOP
TLDR
This paper aims to analyse social media (Twitter, etc.) big data to identify how widespread certain disease-related keywords are at various locations, such as India, Tamil Nadu, and the world.
Twitter data analysis using hadoop ecosystems and apache zeppelin
TLDR
The location from which tweets are posted and the language in which they are written can be effectively analysed using Hadoop, a tool used to analyze distributed big data, streaming data, timestamp data and text data.
Location based Analysis of Twitter Data using Apache Hive
TLDR
This paper discusses how to use Flume to extract Twitter data and store it in HDFS, and then how to use Apache Hive to analyse that data.
Business Improvement Approach Based on Sentiment Twitter Analysis: Case Study
TLDR
The reason why big data technologies need to be adopted is demonstrated, and the different steps required to collect, store, process and analyse Twitter data at large scale using different big data platforms and software are presented.
An adaptive clustering and classification algorithm for Twitter data streaming in Apache Spark
TLDR
The presented adaptive clustering and classification algorithm is used for data streaming in Apache Spark to overcome existing problems, and it is shown to outperform existing methods in terms of precision, recall, F-score, convergence, ROC curve and accuracy.
Opinion Mining of Twitter Data using Hive
TLDR
This paper provides an efficient mechanism for opinion mining by building an end-to-end pipeline with Apache Flume, Apache HDFS, and Apache Hive.
Studi Perbandingan Performa Algoritma Penjadwalan untuk Real Time Data Twitter pada Hadoop
TLDR
This study compares the performance of two schedulers on Twitter data and shows that the Hadoop Fair Sojourn Protocol Scheduler performs better than the Fair Scheduler in terms of both average completion time and job throughput.
Survey: Sentiment Analysis of Twitter Data for Stock Market Prediction
TLDR
Different techniques that can be used to classify sentiment scores are discussed, along with technology that speeds up the computation and thereby improves performance.

References

Driving big data with big compute
TLDR
The LLGrid team has developed and deployed a number of technologies that aim to provide the best of both worlds, including LLGrid MapReduce, which allows the map/reduce parallel programming model to be used quickly and efficiently in any language on any compute cluster.
The Hadoop Distributed File System
TLDR
The architecture of HDFS is described and experience using HDFS to manage 25 petabytes of enterprise data at Yahoo! is reported on.
Understanding big data
The Hadoop Distributed Filesystem
  • Hadoop: The Definitive Guide, pp. 41-73, Sebastopol: O’Reilly Media, Inc., 2010.
Effective Sentiment Analysis on Twitter Data using: Apache Flume and Hive
  • GSuresh Babu. A, Computer Science and Engineering Dept., JNTUACEP, Pulivendula, Vol. 1, Issue 8, October 2014.
Understanding the Big Data problems and their solutions using Hadoop MapReduce
  • Mr. Swapnil A. Kale, Prof. Sangram S. Dandge
Zikopoulos, Chris Eaton, Dirk deRoos, “Understanding Big Data”