Lifelong Learning Natural Language Processing Approach for Multilingual Data Classification

  title={Lifelong Learning Natural Language Processing Approach for Multilingual Data Classification},
  author={Jędrzej Kozal and Michał Leś and Paweł Zyblewski and Paweł Ksieniewicz and Michał Woźniak},
The abundance of information in digital media, which in today's world is the main source of knowledge about current events for the masses, makes it possible to spread disinformation on a larger scale than ever before. Consequently, there is a need to develop novel fake news detection approaches capable of adapting to changing factual contexts and generalizing previously or concurrently acquired knowledge. To deal with this problem, we propose a lifelong learning-inspired approach, which allows for…

Supervised Learning of Universal Sentence Representations from Natural Language Inference Data

It is shown how universal sentence representations trained using the supervised data of the Stanford Natural Language Inference datasets can consistently outperform unsupervised methods like SkipThought vectors on a wide range of transfer tasks.

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.

Fighting post-truth using natural language processing: A review and open challenges

Automating fake news detection system using multi-level voting model

Three feature extraction techniques, Term Frequency-Inverse Document Frequency (TF-IDF), Count Vectorizer (CV), and Hashing Vectorizer (HV), are compared to find out which classification model identifies phony features most accurately, and a novel multi-level voting ensemble model is proposed.
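The multi-level voting idea above can be sketched in plain Python. The two-level structure and the grouping of base classifiers by feature extractor are illustrative assumptions, not the paper's exact configuration:

```python
from collections import Counter

def majority_vote(predictions):
    """Return the most frequent label among per-classifier predictions."""
    return Counter(predictions).most_common(1)[0][0]

def multi_level_vote(level_one_groups):
    """Two-level voting: each group of base classifiers votes first,
    then the group winners are combined by a final majority vote.
    `level_one_groups` is a list of lists of predicted labels."""
    group_winners = [majority_vote(group) for group in level_one_groups]
    return majority_vote(group_winners)

# Hypothetical predictions from three groups of base classifiers,
# e.g. one group per feature extractor (TF-IDF, CV, HV)
groups = [
    ["fake", "real", "fake"],
    ["fake", "fake", "real"],
    ["real", "real", "fake"],
]
print(multi_level_vote(groups))  # "fake": two of three group winners say fake
```

In practice the base predictions would come from trained classifiers over the extracted features; the voting logic itself stays this simple.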

An Empirical Analysis of Classification Models for Detection of Fake News Articles

Investigates whether automatic computational approaches in NLP and machine learning can be used to detect falsehoods in written text, and analyses the impact of changes in feature extraction parameters on classifier performance.

Tree LSTMs with Convolution Units to Predict Stance and Rumor Veracity in Social Media Conversations

A new way to represent social-media conversations as binarized constituency trees is proposed, which allows features in source posts and their replies to be compared effectively, along with the use of convolution units in Tree LSTMs that are better at learning patterns in features obtained from the source and reply posts.

PHEME dataset of rumours and non-rumours

A novel approach to rumour detection that learns from the sequential dynamics of reporting during breaking news in social media to detect rumours in new stories; it achieves competitive performance with improved precision and recall, beating a state-of-the-art classifier that relies on querying tweets and outperforming the best baseline.

Credibility Detection in Twitter Using Word N-gram Analysis and Supervised Machine Learning Techniques

A classification model based on supervised machine learning techniques and word-based N-gram analysis to classify Twitter messages automatically into credible and not credible and experiments show that the proposed model achieved an improvement when compared to two models existing in the literature.
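Word-based N-gram features of the kind used in the model above can be illustrated in a few lines of Python. This is a minimal sketch; a real credibility classifier would add proper tokenization, normalization, and a supervised learner on top of these features:

```python
def word_ngrams(text, n):
    """Extract contiguous word n-grams from whitespace-tokenized text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tweet = "Breaking news confirmed by official sources"
print(word_ngrams(tweet, 2))
# bigrams: "breaking news", "news confirmed", "confirmed by", ...
```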

Content Based Fake News Detection Using N-Gram Models

This paper proposes a fake news detection system that considers the content of online news articles and investigates two machine learning algorithms using word n-gram and character n-gram analysis.
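Character n-grams, which the paper above contrasts with word n-grams, slide over the raw string rather than the token sequence, making them robust to misspellings and inflection. A minimal sketch of a count-based character n-gram representation (the feature layout is illustrative, not the paper's exact pipeline):

```python
from collections import Counter

def char_ngrams(text, n):
    """Extract overlapping character n-grams from a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

def bag_of_char_ngrams(text, n):
    """Count-based feature representation over character n-grams."""
    return Counter(char_ngrams(text.lower(), n))

features = bag_of_char_ngrams("Fake news!", 3)
# e.g. features["fak"] == 1 and features["ews"] == 1
```

Such count dictionaries would typically be vectorized and fed to a standard classifier, exactly as with word n-grams.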