BertGCN: Transductive Text Classification by Combining GNN and BERT

@article{Lin2021BertGCNTT,
  title={BertGCN: Transductive Text Classification by Combining GNN and BERT},
  author={Yuxiao Lin and Yuxian Meng and Xiaofei Sun and Qinghong Han and Kun Kuang and Jiwei Li and Fei Wu},
  journal={ArXiv},
  year={2021},
  volume={abs/2105.05727}
}
In this work, we propose BertGCN, a model that combines large-scale pretraining and transductive learning for text classification. BertGCN constructs a heterogeneous graph over the dataset and represents documents as nodes using BERT representations. By jointly training the BERT and GCN modules within BertGCN, the proposed model is able to leverage the advantages of both worlds: large-scale pretraining, which takes advantage of the massive amount of raw data, and transductive learning, which…
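Concretely, BertGCN feeds the BERT [CLS] vectors in as document-node features and interpolates the predictions of the two branches. Below is a minimal PyTorch-style sketch of that joint prediction; the function and argument names are illustrative, the graph propagation is abstracted into a `gcn` module, and `lam` stands in for the interpolation weight λ from the paper:

```python
import torch
import torch.nn.functional as F

def bertgcn_predict(bert_cls, gcn, adj, classifier, lam=0.7):
    """Sketch of BertGCN's joint prediction (names are illustrative).

    bert_cls:   (n_docs, hidden) BERT [CLS] vectors, also used as the
                document-node features of the heterogeneous graph.
    gcn:        graph module mapping (node_features, adj) -> class logits.
    adj:        normalized adjacency of the document/word graph.
    classifier: linear head mapping BERT [CLS] vectors -> class logits.
    lam:        weight trading off the GCN and BERT branches.
    """
    pred_gcn = F.softmax(gcn(bert_cls, adj), dim=-1)     # transductive branch
    pred_bert = F.softmax(classifier(bert_cls), dim=-1)  # pretrained branch
    return lam * pred_gcn + (1 - lam) * pred_bert        # final prediction
```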

Citations

InducT-GCN: Inductive Graph Convolutional Networks for Text Classification

This paper introduces a novel inductive graph-based text classification framework, InducT-GCN (InducTive Graph Convolutional Networks for Text classification), which outperforms state-of-the-art methods that are either transductive in nature or rely on additional pretrained resources.

Chinese short text classification by combining Bert and graph convolutional network

This work proposes using the large-scale pre-trained language model BERT to obtain feature information between the words in a text, while a two-layer Graph Convolutional Network (GCN) captures the dependency relationships between words.

CNN-Trans-Enc: A CNN-Enhanced Transformer-Encoder On Top Of Static BERT representations for Document Classification

This work proposes a CNN-Enhanced Transformer-Encoder model which is trained on top of BERT [CLS] representations from all layers, employing convolutional neural networks to generate QKV feature maps inside the Transformer-Encoder instead of linear projections of the input into the embedding space.
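To make the QKV-via-convolution idea concrete, here is a minimal sketch of a self-attention block whose query/key/value maps are 1-D convolutions over the sequence of layer-wise [CLS] vectors; the class name, kernel size, and single-head simplification are my assumptions, not the authors' code:

```python
import torch
import torch.nn as nn

class ConvQKVAttention(nn.Module):
    """Self-attention whose Q/K/V projections are Conv1d layers, so each
    query/key/value mixes information from neighboring positions."""

    def __init__(self, dim, kernel_size=3):
        super().__init__()
        pad = kernel_size // 2  # 'same' padding keeps the sequence length
        self.q = nn.Conv1d(dim, dim, kernel_size, padding=pad)
        self.k = nn.Conv1d(dim, dim, kernel_size, padding=pad)
        self.v = nn.Conv1d(dim, dim, kernel_size, padding=pad)

    def forward(self, x):           # x: (batch, seq_len, dim)
        xc = x.transpose(1, 2)      # Conv1d expects (batch, dim, seq_len)
        q = self.q(xc).transpose(1, 2)
        k = self.k(xc).transpose(1, 2)
        v = self.v(xc).transpose(1, 2)
        attn = torch.softmax(q @ k.transpose(1, 2) / x.size(-1) ** 0.5, dim=-1)
        return attn @ v

# E.g. over the 12 layer-wise [CLS] vectors of a BERT-base document:
# ConvQKVAttention(768)(torch.randn(8, 12, 768)).shape -> (8, 12, 768)
```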

Simplified-Boosting Ensemble Convolutional Network for Text Classification

An ensemble convolutional network is proposed by combining GCN and CNN, in which the GCN captures global information and the CNN extracts local features; it achieves better performance than other state-of-the-art methods with less memory.

BHGAttN: A Feature-Enhanced Hierarchical Graph Attention Network for Sentiment Analysis

A BERT-based hierarchical graph attention network model (BHGAttN), built on a large-scale pretrained model and a graph attention network to model the hierarchical relationships in texts, which exhibits significant competitive advantages over current state-of-the-art baseline models.

ConTextING: Granting Document-Wise Contextual Embeddings to Graph Neural Networks for Inductive Text Classification

This work proposes a simple yet effective unified model, coined ConTextING, with a joint training mechanism to learn from both document embeddings and contextual word interactions simultaneously, which outperforms pure inductive GNNs and BERT-style models.

Text Representation Enrichment Utilizing Graph based Approaches: Stock Market Technical Analysis Case Study

A transductive hybrid approach is proposed, composed of an unsupervised node representation learning model followed by a node classification/edge prediction model; it processes heterogeneous graphs to produce unified node embeddings, which are then used for node classification or link prediction as the downstream task.

BHGAN: A Hierarchical Graph Network Based on Feature Enhancement

A hierarchical graph network model, BHGAN, based on BERT and using a graph attention network to model the hierarchical relationship of texts, is proposed and exhibits competitive results on the dataset.

GNNer: Reducing Overlapping in Span-based NER Using Graph Neural Networks

This work proposes GNNer, a framework that uses graph neural networks to enrich span representations, reducing the number of overlapping spans during prediction while maintaining competitive metric performance.

Active Learning Strategy for COVID-19 Annotated Dataset

A novel discriminative batch-mode active learning method (DS3) is proposed to enable faster and more effective annotation of COVID-19 data; significance testing verifies the effectiveness of DS3 and its superiority over baseline active learning algorithms.
...

References

Showing 1-10 of 48 references

PTE: Predictive Text Embedding through Large-scale Heterogeneous Text Networks

A semi-supervised representation learning method for text data, called Predictive Text Embedding (PTE), which is comparable to or more effective than existing methods, much more efficient, and has fewer parameters to tune.

Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks

Sentence-BERT (SBERT) is presented, a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity.
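In practice this reduces sentence similarity to one embedding pass plus cosine similarity. A short sketch using the sentence-transformers library follows; the checkpoint name is just one common choice, not something the paper prescribes:

```python
from sentence_transformers import SentenceTransformer, util

# Any SBERT checkpoint works here; this one is a common lightweight choice.
model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "BertGCN combines BERT with a graph convolutional network.",
    "Graph networks can be paired with pretrained text encoders.",
    "The weather is nice today.",
]
embeddings = model.encode(sentences, convert_to_tensor=True)

# Semantically meaningful embeddings: compare directly with cosine similarity.
scores = util.cos_sim(embeddings, embeddings)
print(scores)  # the first two sentences score far higher against each other
```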

GILE: A Generalized Input-Label Embedding for Text Classification

This paper proposes a new input-label model that generalizes over previous such models, addresses their limitations, and does not compromise performance on seen labels; it outperforms monolingual and multilingual models that do not leverage label semantics, as well as previous joint input-label space models, in both scenarios.

Uncertainty-aware Self-training for Text Classification with Few Labels

This work proposes an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network, leveraging recent advances in Bayesian deep learning, and proposes acquisition functions that use Monte Carlo (MC) Dropout to select instances from the unlabeled pool.
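As a rough illustration of the MC Dropout ingredient: keep dropout active at inference, average the softmax over several stochastic forward passes, and score each unlabeled instance by the entropy of the mean prediction. This is a generic sketch of that recipe, not the paper's exact acquisition function:

```python
import torch

def mc_dropout_entropy(model, x, n_samples=20):
    """Predictive entropy under Monte Carlo Dropout (higher = more uncertain,
    so a better candidate to pull from the unlabeled pool)."""
    model.train()  # keep dropout on; assumes no batch norm, or batch norm frozen
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=-1) for _ in range(n_samples)]
        ).mean(dim=0)  # mean predictive distribution over stochastic passes
    return -(probs * probs.clamp_min(1e-12).log()).sum(dim=-1)
```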

Enhancing Generalization in Natural Language Inference by Syntax

This work investigates the use of dependency trees to enhance the generalization of BERT in the NLI task, leveraging a graph convolutional network to represent a syntax-based matching graph with heterogeneous matching patterns.

Zero-shot Text Classification via Reinforced Self-training

This paper proposes a reinforcement learning framework to learn a data selection strategy automatically and provide more reliable selection; it significantly outperforms previous methods in zero-shot text classification.

Joint Embedding of Words and Labels for Text Classification

This work proposes to view text classification as a label-word joint embedding problem: each label is embedded in the same space as the word vectors, and an attention framework is introduced that measures the compatibility of embeddings between text sequences and labels.
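A simplified sketch of that compatibility-based attention (the paper additionally smooths the scores with a convolution, omitted here; the function name is illustrative):

```python
import torch
import torch.nn.functional as F

def label_attention_pool(words, labels):
    """Pool word vectors by their cosine compatibility with label embeddings.

    words:  (seq_len, dim) word embeddings of one text sequence.
    labels: (n_classes, dim) label embeddings living in the same space.
    """
    # Cosine compatibility between every word and every label: (seq, classes).
    g = F.normalize(words, dim=-1) @ F.normalize(labels, dim=-1).T
    # Score each word by its best-matching label, then softmax-normalize
    # across the sequence to obtain attention weights.
    beta = torch.softmax(g.max(dim=-1).values, dim=0)   # (seq_len,)
    return (beta.unsqueeze(-1) * words).sum(dim=0)      # (dim,)
```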

Graph Attention Networks

We present graph attention networks (GATs), novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations.
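The core operation scores each edge with a shared attention mechanism and aggregates neighbors by the normalized scores: e_ij = LeakyReLU(a^T [W h_i || W h_j]), alpha_ij = softmax_j(e_ij), h_i' = sum_j alpha_ij W h_j. A single-head, dense-adjacency sketch (multi-head attention and sparse ops omitted):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GATLayer(nn.Module):
    """One graph attention head over a dense 0/1 adjacency matrix."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W = nn.Linear(in_dim, out_dim, bias=False)  # shared projection
        self.a = nn.Linear(2 * out_dim, 1, bias=False)   # attention vector

    def forward(self, h, adj):  # h: (n, in_dim); adj: (n, n), with self-loops
        z = self.W(h)
        n = z.size(0)
        # All (i, j) pairs [W h_i || W h_j], shape (n, n, 2 * out_dim).
        pairs = torch.cat([z.unsqueeze(1).expand(n, n, -1),
                           z.unsqueeze(0).expand(n, n, -1)], dim=-1)
        e = F.leaky_relu(self.a(pairs).squeeze(-1), negative_slope=0.2)
        e = e.masked_fill(adj == 0, float("-inf"))  # masked self-attention
        alpha = torch.softmax(e, dim=-1)            # normalize over neighbors
        return alpha @ z
```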

Graph Convolutional Networks for Text Classification

This work builds a single text graph for a corpus based on word co-occurrence and document-word relations, then learns a Text Graph Convolutional Network (Text GCN) for the corpus, which jointly learns the embeddings for both words and documents, supervised by the known class labels of documents.
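Since BertGCN builds on exactly this graph, a sketch of the construction helps: document-word edges are weighted by TF-IDF and word-word edges by positive PMI over sliding windows. The helper below is a simplified illustration (whitespace tokenization, a dict adjacency, one common TF-IDF variant), not the reference implementation:

```python
import math
from collections import Counter
from itertools import combinations

def build_text_graph(docs, window=20):
    """Text-GCN-style corpus graph: one node per document and per word.

    Doc-word weight = TF-IDF; word-word weight = positive PMI over sliding
    windows; self-loops = 1. Returns ({(i, j): weight}, vocab); documents
    occupy node ids 0..len(docs)-1 and word nodes follow.
    """
    vocab = sorted({w for d in docs for w in d.split()})
    widx = {w: i for i, w in enumerate(vocab)}
    n_docs = len(docs)
    adj = {(i, i): 1.0 for i in range(n_docs + len(vocab))}  # self-loops

    # Document-word edges: TF-IDF.
    df = Counter(w for d in docs for w in set(d.split()))
    for i, d in enumerate(docs):
        tokens = d.split()
        for w, tf in Counter(tokens).items():
            weight = (tf / len(tokens)) * math.log(n_docs / df[w])
            adj[(i, n_docs + widx[w])] = adj[(n_docs + widx[w], i)] = weight

    # Word-word edges: positive PMI over fixed-size sliding windows.
    wins = [d.split()[k:k + window] for d in docs
            for k in range(max(1, len(d.split()) - window + 1))]
    occ, co = Counter(), Counter()
    for win in wins:
        uniq = sorted(set(win))
        occ.update(uniq)
        co.update(combinations(uniq, 2))
    for (w1, w2), c in co.items():
        pmi = math.log(c * len(wins) / (occ[w1] * occ[w2]))
        if pmi > 0:
            a, b = n_docs + widx[w1], n_docs + widx[w2]
            adj[(a, b)] = adj[(b, a)] = pmi
    return adj, vocab
```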

Description Based Text Classification with Reinforcement Learning

A new framework for text classification is proposed in which each category label is associated with a category description; the description forces the model to attend to the most salient parts of the text with respect to the label, which can be regarded as a hard version of attention, leading to better performance.