AIR-JPMC@SMM4H’22: Classifying Self-Reported Intimate Partner Violence in Tweets with Multiple BERT-based Models

@inproceedings{Candidato2022AIRJPMCSMM4H22CS,
  title={AIR-JPMC@SMM4H’22: Classifying Self-Reported Intimate Partner Violence in Tweets with Multiple BERT-based Models},
  author={Alec Candidato and Akshat Gupta and Xiaomo Liu and Sameena Shah},
  booktitle={SMM4H},
  year={2022}
}
This paper presents our submission for the SMM4H 2022 Shared Task on the classification of self-reported intimate partner violence on Twitter (in English). The goal of this task was to accurately determine whether a given tweet describes the author's own experience with intimate partner violence. The submitted system is an ensemble of five RoBERTa models, each weighted by its F1-score on the validation dataset. This system performed 13% better than the…
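
A minimal sketch of the F1-weighted ensembling described in the abstract, assuming five already fine-tuned RoBERTa classifiers and a binary label set (the checkpoint paths, F1 values, and helper function below are hypothetical placeholders, not taken from the paper):

# F1-weighted ensemble of five fine-tuned RoBERTa classifiers (illustrative sketch).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_DIRS = [f"roberta-ipv-fold{i}" for i in range(5)]   # hypothetical checkpoint paths
VAL_F1 = [0.81, 0.79, 0.83, 0.80, 0.82]                   # hypothetical validation F1-scores

tokenizer = AutoTokenizer.from_pretrained("roberta-base")

def ensemble_predict(tweet: str) -> int:
    """Weight each model's class probabilities by its validation F1 and average them."""
    inputs = tokenizer(tweet, return_tensors="pt", truncation=True)
    weights = torch.tensor(VAL_F1) / sum(VAL_F1)          # normalise the F1 weights
    combined = torch.zeros(2)                             # binary task: IPV report vs. not
    for w, path in zip(weights, MODEL_DIRS):
        model = AutoModelForSequenceClassification.from_pretrained(path)
        model.eval()
        with torch.no_grad():
            probs = torch.softmax(model(**inputs).logits, dim=-1).squeeze(0)
        combined += w * probs
    return int(combined.argmax())                         # 1 = self-reported IPV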
1 Citation

Overview of the Seventh Social Media Mining for Health Applications (#SMM4H) Shared Tasks at COLING 2022

An overview of the shared tasks and participants’ systems is provided to promote the community-driven development and evaluation of advanced natural language processing systems to detect, extract, and normalize health-related information in public, user-generated content.

References

Natural Language Model for Automatic Identification of Intimate Partner Violence Reports from Twitter

An effective NLP model can be an essential component for providing proactive, social media-based intervention and support for victims, and may also be used for population-level surveillance and large-scale cohort studies.

Overview of the Sixth Social Media Mining for Health Applications (#SMM4H) Shared Tasks at NAACL 2021

In its sixth iteration, the Social Media Mining for Health Applications (#SMM4H) shared tasks sought to advance the use of social media texts such as Twitter for pharmacovigilance, disease tracking, and patient-centered outcomes.

SJ_AJ@DravidianLangTech-EACL2021: Task-Adaptive Pre-Training of Multilingual BERT models for Offensive Language Identification

The final system is an ensemble of mBERT and XLM-RoBERTa models that leverages task-adaptive pre-training of multilingual BERT models with a masked language modeling objective; it was ranked 1st for Kannada, 2nd for Malayalam, and 3rd for Tamil.
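
A rough sketch of the task-adaptive pre-training with a masked language modeling objective mentioned in this reference; the in-domain texts, output directory, and hyperparameters are illustrative assumptions, not values from that paper:

# Continued MLM pre-training of multilingual BERT on unlabeled in-domain text (sketch).
from transformers import (AutoTokenizer, AutoModelForMaskedLM,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import Dataset

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModelForMaskedLM.from_pretrained("bert-base-multilingual-cased")

# Unlabeled in-domain text (e.g. the task's own raw posts) stands in for the corpus.
texts = ["example in-domain sentence one", "example in-domain sentence two"]
dataset = Dataset.from_dict({"text": texts}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True, remove_columns=["text"])

collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="mbert-tapt", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=dataset,
    data_collator=collator)
trainer.train()  # the adapted encoder is then fine-tuned on the labeled task data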

Automatic Identification of Narrative Diegesis and Point of View

This work develops automatic classifiers for point of view and diegesis in text, compares the performance of different feature sets for both, and applies the classifiers to nearly 40,000 news texts across five corpora comprising multiple genres.

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

BERT is a new language representation model designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; it can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
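
A minimal sketch of the fine-tuning recipe this reference describes, i.e. attaching a single classification layer to a pre-trained encoder; the example texts, labels, output directory, and hyperparameters below are assumptions for illustration only:

# Fine-tuning a pre-trained encoder with one added output layer for binary tweet classification (sketch).
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)
from datasets import Dataset

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# num_labels=2 attaches a single linear classification layer on top of the encoder.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

data = Dataset.from_dict({"text": ["tweet one", "tweet two"], "label": [0, 1]}).map(
    lambda batch: tokenizer(batch["text"], truncation=True, padding="max_length",
                            max_length=128),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert-binary-clf", num_train_epochs=3,
                           per_device_train_batch_size=16),
    train_dataset=data)
trainer.train()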

RoBERTa: A Robustly Optimized BERT Pretraining Approach

It is found that BERT was significantly undertrained and, with improved pretraining, can match or exceed the performance of every model published after it; the best model achieves state-of-the-art results on GLUE, RACE, and SQuAD.