Corpus ID: 236428147

IRLCov19: A Large COVID-19 Multilingual Twitter Dataset of Indian Regional Languages

  • D. Uniyal, Amit Agarwal
  • Published 2021
  • Computer Science
  • ArXiv
Having emerged in Wuhan, China, in December 2019, COVID-19 continues to spread rapidly across the world even though authorities have made a number of vaccines available. Although the coronavirus has been around for a significant period of time, people and authorities still feel the need for awareness because the virus mutates, so its symptoms and prevention strategies vary. People and authorities rely most on social media platforms to share awareness information and voice…

NAIST COVID: Multilingual COVID-19 Twitter and Weibo Dataset
This paper releases a multilingual dataset of social media posts related to COVID-19, consisting of microblogs in English and Japanese from Twitter and those in Chinese from Weibo, and provides a quantitative as well as qualitative analysis of these datasets by creating daily word clouds as an example of text-mining analysis.
Large Arabic Twitter Dataset on COVID-19
This work describes the first Arabic tweets dataset on COVID-19, which the authors have been collecting since January 1st, 2020, and which would help researchers and policy makers in studying different societal issues related to the pandemic.
A First Instagram Dataset on COVID-19
A multilingual coronavirus (COVID-19) Instagram dataset that has been continuously collected since March 30, 2020 is provided to help the community better understand the dynamics behind this phenomenon on Instagram, one of the major social media platforms.
Dense Vector Embedding Based Approach to Identify Prominent Disseminators From Twitter Data Amid COVID-19 Outbreak
It is concluded that data generated broadly fall into information and prevention categories, whereas the print media, politicians, and health organizations are the forerunners of the selected prominent disseminators.
Dataset on dynamics of Coronavirus on Twitter
A dataset of 8,982,694 Twitter posts about the global coronavirus health crisis, which includes a new variable, called the “type” of tweet, created from four other variables; it is useful for showing the diversity of tweets and the dynamics of users on Twitter.
Coronavirus Goes Viral: Quantifying the COVID-19 Misinformation Epidemic on Twitter
An early quantification of the magnitude of misinformation spread is provided and the importance of early interventions in order to curb this phenomenon that endangers public safety at a time when awareness and appropriate preventive actions are paramount is highlighted.
Using Social Media to Mine and Analyze Public Opinion Related to COVID-19 in China
Public opinion in the early stages of COVID-19 in China is explored by analyzing Sina-Weibo texts in terms of space, time, and content to better understand the public opinion and sentiments towards COVID-19, to accelerate emergency responses, and to support post-disaster management.
GeoCoV19: A Dataset of Hundreds of Millions of Multilingual COVID-19 Tweets with Location Information
GeoCoV19, a large-scale Twitter dataset containing more than 524 million multilingual tweets posted over a period of 90 days since February 1, 2020, is presented, and it is postulated that this large-scale, multilingual, geolocated social media data can empower the research communities to evaluate how societies are collectively coping with this unprecedented global crisis.
Social media influence in the COVID-19 Pandemic
The most relevant information on the influence, advantages, and disadvantages of the use of social networks during the COVID-19 pandemic is summarized.
Characterizing the Propagation of Situational Information in Social Media During COVID-19 Epidemic: A Case Study on Weibo
This article sought to fill the gap by harnessing Weibo data and natural language processing techniques to classify the COVID-19-related information into seven types of situational information and found specific features in predicting the reposted amount of each type of information.