Corpus ID: 233169058

HumAID: Human-Annotated Disaster Incidents Data from Twitter with Deep Learning Benchmarks

@inproceedings{Alam2021HumAIDHD,
  title={HumAID: Human-Annotated Disaster Incidents Data from Twitter with Deep Learning Benchmarks},
  author={Firoj Alam and Umair Yaqub Qazi and Muhammad Imran and Ferda Ofli},
  booktitle={ICWSM},
  year={2021}
}
Social networks are widely used for information consumption and dissemination, especially during time-critical events such as natural disasters. Despite its large volume, social media content is often too noisy for direct use in any application. Therefore, it is important to filter, categorize, and concisely summarize the available content to facilitate effective consumption and decision-making. To address such issues, automatic classification systems have been developed using… 
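
To make the classification task concrete, below is a minimal sketch of fine-tuning a BERT-style classifier on HumAID-style tweet data. The file path, the `tweet_text`/`class_label` column names, the base model, and the hyperparameters are illustrative assumptions, not details taken from the paper.

```python
# Hedged sketch: fine-tuning a BERT-style classifier on HumAID-style tweet data.
# Column names ("tweet_text", "class_label") and the file path are assumptions;
# adjust them to the released data format.
import pandas as pd
import torch
from torch.utils.data import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

class TweetDataset(Dataset):
    def __init__(self, texts, labels, tokenizer, max_len=128):
        self.enc = tokenizer(list(texts), truncation=True, padding=True,
                             max_length=max_len)
        self.labels = list(labels)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        item = {k: torch.tensor(v[i]) for k, v in self.enc.items()}
        item["labels"] = torch.tensor(self.labels[i])
        return item

train_df = pd.read_csv("humaid_train.tsv", sep="\t")   # assumed path/format
label_names = sorted(train_df["class_label"].unique())
label2id = {name: i for i, name in enumerate(label_names)}

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(label_names))

train_ds = TweetDataset(train_df["tweet_text"],
                        train_df["class_label"].map(label2id), tokenizer)

args = TrainingArguments(output_dir="humaid_bert", num_train_epochs=3,
                         per_device_train_batch_size=32, learning_rate=2e-5)
Trainer(model=model, args=args, train_dataset=train_ds).train()
```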

Citations

CrisisBench: Benchmarking Crisis-related Social Media Datasets for Humanitarian Information Processing
TLDR
This work combines eight human-annotated datasets and provides benchmarks for both binary and multiclass classification tasks using several deep learning architectures, including CNN, fastText, and transformers, to help train more sophisticated models.
Extraction and analysis of natural disaster-related VGI from social media: review, opportunities and challenges
TLDR
This review summarizes eight common tasks and their solutions in social media content analysis for natural disasters, and groups and analyzes studies that make further use of the extracted information, either standalone or in combination with other sources.
EnDSUM: Entropy and Diversity based Disaster Tweet Summarization
TLDR
This paper proposes an entropy- and diversity-based summarizer, termed EnDSUM, specifically for disaster tweet summarization; a comprehensive analysis on 6 datasets indicates its effectiveness and highlights the scope for further improving EnDSUM.
OntoRealSumm : Ontology based Real-Time Tweet Summarization
TLDR
Proposes OntoRealSumm, an ontology-based real-time tweet summarization approach that generates summaries of disaster-related tweets with minimal human intervention; comparison with state-of-the-art techniques on 10 disaster datasets validates its effectiveness.
A Sentiment-Aware Contextual Model for Real-Time Disaster Prediction Using Twitter Data
TLDR
A sentiment-aware contextual model named SentiBERT-BiLSTM-CNN for disaster detection using tweets demonstrates superior performance in F1 score, making it a competitive model for tweet-based disaster prediction.
Continual Distributed Learning for Crisis Management
TLDR
This work uses regularisation to alleviate catastrophic forgetting in the target neural networks while taking a distributed approach to enable learning on resource-constrained devices, and employs federated learning for distributed training and aggregation of the central model for continual deployment.
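
For readers unfamiliar with the federated aggregation step mentioned above, here is a minimal illustrative sketch (not the authors' implementation) of FedAvg-style averaging of client weights into a central model:

```python
# Illustrative sketch only: FedAvg-style aggregation of client model weights,
# i.e., the central-model aggregation step a federated setup relies on.
import torch

def fedavg(client_state_dicts):
    """Average a list of PyTorch state_dicts with identical keys and shapes."""
    avg = {}
    for key in client_state_dicts[0]:
        avg[key] = torch.stack(
            [sd[key].float() for sd in client_state_dicts]).mean(dim=0)
    return avg
```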

References

Showing 1-10 of 52 references
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
TLDR
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
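
The "one additional output layer" idea is easy to picture in code. The sketch below is an assumption about implementation details, not the paper's exact code: a single linear head placed over BERT's [CLS] representation.

```python
# Minimal sketch of fine-tuning BERT with "just one additional output layer":
# a linear classification head over the [CLS] token representation.
import torch.nn as nn
from transformers import AutoModel

class BertClassifier(nn.Module):
    def __init__(self, num_labels, model_name="bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        self.head = nn.Linear(self.encoder.config.hidden_size, num_labels)

    def forward(self, input_ids, attention_mask):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        return self.head(out.last_hidden_state[:, 0])  # logits from [CLS]
```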
Sequential Minimal Optimization: A Fast Algorithm for Training Support Vector Machines
This paper proposes a new algorithm for training support vector machines: Sequential Minimal Optimization, or SMO. Training a support vector machine requires the solution of a very large quadratic programming (QP) optimization problem, which SMO breaks into a series of small QP subproblems that are solved analytically.
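
As a usage-level illustration (not Platt's original implementation), scikit-learn's SVC delegates to LIBSVM, which trains with an SMO-style decomposition of the large QP into tiny, analytically solvable subproblems:

```python
# Sketch only: train an SVM via scikit-learn's SVC, whose LIBSVM backend uses
# an SMO-style decomposition instead of solving one large QP directly.
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
clf = SVC(kernel="rbf", C=1.0).fit(X, y)
print("support vectors per class:", clf.n_support_)
```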
CrisisBench: Benchmarking Crisis-related Social Media Datasets for Humanitarian Information Processing
TLDR
This work combines eight human-annotated datasets and provides benchmarks for both binary and multiclass classification tasks using several deep learning architectures, including CNN, fastText, and transformers, to help train more sophisticated models.
Social Media Images Classification Models for Real-time Disaster Response
TLDR
Investigates ten different architectures for four different tasks using the largest publicly available datasets for these tasks: detecting crisis incidents, filtering irrelevant images, classifying images into specific humanitarian categories, and assessing damage severity.
AIDR: artificial intelligence for disaster response
TLDR
AIDR has been successfully tested to classify informative vs. non-informative tweets posted during the 2013 Pakistan Earthquake and achieved a classification quality (measured using AUC) of 80%.
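
The AUC figure quoted above is the standard ranking metric for such binary classifiers; a toy computation with scikit-learn (illustrative labels and scores, not AIDR's data) looks like this:

```python
# Toy illustration of the AUC metric AIDR reports (values here are made up).
from sklearn.metrics import roc_auc_score

y_true  = [1, 0, 1, 1, 0, 0, 1, 0]            # 1 = informative, 0 = not informative
y_score = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1, 0.8, 0.3]
print(roc_auc_score(y_true, y_score))          # 1.0 on this toy data; AIDR reported ~0.80
```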
RoBERTa: A Robustly Optimized BERT Pretraining Approach
TLDR
It is found that BERT was significantly undertrained and can match or exceed the performance of every model published after it; the best model achieves state-of-the-art results on GLUE, RACE, and SQuAD.
Big (Crisis) Data
The measurement of observer agreement for categorical data.
TLDR
A general statistical methodology for the analysis of multivariate categorical data arising from observer reliability studies is presented; tests for interobserver bias are given in terms of first-order marginal homogeneity, and measures of interobserver agreement are developed as generalized kappa-type statistics.
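
As a concrete two-rater special case of these kappa-type agreement statistics, Cohen's kappa is kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance; a quick computation with scikit-learn (made-up annotations):

```python
# Two-rater kappa on illustrative annotations; the paper develops generalized,
# multi-rater extensions of this statistic.
from sklearn.metrics import cohen_kappa_score

rater_a = ["informative", "informative", "not", "informative", "not", "not"]
rater_b = ["informative", "not", "not", "informative", "not", "informative"]
print(cohen_kappa_score(rater_a, rater_b))  # kappa = (p_o - p_e) / (1 - p_e)
```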
Bag of Tricks for Efficient Text Classification
TLDR
A simple and efficient baseline for text classification is explored, showing that the fastText classifier is often on par with deep learning classifiers in terms of accuracy while being many orders of magnitude faster for training and evaluation.
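
A minimal sketch of the supervised fastText classifier described above, assuming the data has already been written in fastText's `__label__` text format (the path and hyperparameters are illustrative):

```python
# Sketch: train and query a supervised fastText classifier.
# Each line of train.txt is assumed to look like:
#   __label__rescue_volunteering some tweet text ...
import fasttext

model = fasttext.train_supervised(input="train.txt", epoch=5, wordNgrams=2)
labels, probs = model.predict("people trapped after the earthquake need help")
print(labels, probs)
```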
CrisisLex: A Lexicon for Collecting and Filtering Microblogged Communications in Crises
TLDR
Presents CrisisLex, a lexicon of crisis-related terms for collecting and filtering microblogged communications during crises.
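
In the same spirit, lexicon-based filtering can be sketched in a few lines; the terms below are illustrative examples, not actual CrisisLex entries:

```python
# Toy illustration of lexicon-based tweet filtering: keep tweets that contain
# at least one crisis-related term. These terms are examples only.
CRISIS_TERMS = {"earthquake", "flood", "evacuation", "damage", "casualties"}

def is_crisis_related(tweet: str) -> bool:
    tokens = {t.strip("#@.,!?").lower() for t in tweet.split()}
    return bool(tokens & CRISIS_TERMS)

print(is_crisis_related("Major #earthquake reported, buildings damaged"))  # True
```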