Categorical Metadata Representation for Customized Text Classification
@article{Kim2019CategoricalMR, title={Categorical Metadata Representation for Customized Text Classification}, author={Jihyeok Kim and Reinald Kim Amplayo and Kyungjae Lee and Sua Sung and Minji Seo and Seung-won Hwang}, journal={Transactions of the Association for Computational Linguistics}, year={2019}, volume={7}, pages={201-215} }
The performance of text classification has improved tremendously using intelligently engineered neural-based models, especially those injecting categorical metadata as additional information, e.g., using user/product information for sentiment classification. This information has been used to modify parts of the model (e.g., word embeddings, attention mechanisms) such that results can be customized according to the metadata. We observe that current representation methods for categorical metadata…
18 Citations
Rethinking Attribute Representation and Injection for Sentiment Classification
- Computer Science, EMNLP
- 2019
This paper proposes to represent attributes as chunk-wise importance weight matrices and consider four locations in the model (i.e., embedding, encoding, attention, classifier) to inject attributes, and shows that these representations transfer well to other tasks.
Efficient Strategies for Hierarchical Text Classification: External Knowledge and Auxiliary Tasks
- Computer Science, ACL
- 2020
The combination of the auxiliary task and the additional input of class definitions significantly enhances the classification accuracy and outperforms previous studies, using a drastically reduced number of parameters, on two well-known English datasets.
Minimally Supervised Categorization of Text with Metadata
- Computer Science, SIGIR
- 2020
MetaCat is proposed, a minimally supervised framework to categorize text with metadata that develops a generative process describing the relationships between words, documents, labels, and metadata and embeds text and metadata into the same semantic space to encode heterogeneous signals.
MATCH: Metadata-Aware Text Classification in A Large Hierarchy
- Computer Science, WWW
- 2021
This paper presents the MATCH solution, an end-to-end framework that leverages both metadata and hierarchy information, and proposes different ways to regularize the parameters and output probability of each child label by its parents.
Efficient Attribute Injection for Pretrained Language Models
- Computer Science, ArXiv
- 2021
This paper proposes a lightweight and memory-efficient method to inject attributes into pretrained language models, and extends adapters, i.e., tiny plug-in feed-forward modules, to include attributes either independently of or jointly with the text.
Hierarchical Metadata-Aware Document Categorization under Weak Supervision
- Computer Science, WSDM
- 2021
This paper proposes a novel joint representation learning module that allows simultaneous modeling of category dependencies, metadata information and textual semantics, and introduces a data augmentation module that hierarchically synthesizes training documents to complement the original, small-scale training set.
Metadata-Induced Contrastive Learning for Zero-Shot Multi-Label Text Classification
- Computer Science, WWW
- 2022
Experimental results show that MICoL significantly outperforms strong zero-shot text classification and contrastive learning baselines and is on par with the state-of-the-art supervised metadata-aware LMTC method trained on 10K–200K labeled documents. It also tends to predict more infrequent labels than supervised methods, thus alleviating the deteriorated performance on long-tailed labels.
Beyond Text: Incorporating Metadata and Label Structure for Multi-Label Document Classification using Heterogeneous Graphs
- Computer Science, EMNLP
- 2021
A novel neural network based approach for multi-label document classification, in which two heterogeneous graphs are constructed and learned using heterogeneous graph transformers, which outperforms several state-of-the-art baselines.
Attribute Injection for Pretrained Language Models: A New Benchmark and an Efficient Method
- Computer Science, COLING
- 2022
A benchmark for evaluating attribute injection models is introduced and a lightweight and memory-efficient method to inject attributes into PLMs is proposed, which outperforms previous attribute injection methods and achieves state-of-the-art performance on all datasets.
Speculative text mining for document-level sentiment classification
- Computer Science, Neurocomputing
- 2020
References
SHOWING 1-10 OF 37 REFERENCES
Learning Semantic Representations of Users and Products for Document Level Sentiment Classification
- Computer Science, ACL
- 2015
By combining evidence at the user, product, and document level in a unified neural framework, the proposed model achieves state-of-the-art performance on IMDB and Yelp datasets.
Neural Sentiment Classification with User and Product Attention
- Computer Science, EMNLP
- 2016
A hierarchical neural network is proposed to incorporate global user and product information into sentiment classification and achieves significant and consistent improvements compared to all state-of-the-art methods.
A General Framework for Personalized Text Classification and Annotation
- Computer Science, AP WEB 2.0@UMAP
- 2009
The PIRATES framework undertakes a novel approach that automates typical manual tasks such as content annotation and tagging, by means of personalized tag recommendations and other forms of textual annotations (e.g., key-phrases).
Text Classification Improved by Integrating Bidirectional LSTM with Two-dimensional Max Pooling
- Computer Science, COLING
- 2016
One of the proposed models achieves the highest accuracy on the Stanford Sentiment Treebank binary classification and fine-grained classification tasks, and also utilizes 2D convolution to sample more meaningful information from the matrix.
Cascading Multiway Attentions for Document-level Sentiment Classification
- Computer Science, IJCNLP
- 2017
A cascading multiway attention (CMA) model is proposed, where multiple ways of using user and product information are cascaded to influence the generation of attentions on the word and sentence layers.
Cold-Start Aware User and Product Attention for Sentiment Classification
- Computer Science, ACL
- 2018
This paper presents the Hybrid Contextualized Sentiment Classifier (HCSC), which contains two modules: a fast word encoder that returns word vectors embedded with short- and long-range dependency features; and Cold-Start Aware Attention (CSAA), an attention mechanism that considers the existence of the cold-start problem when attentively pooling the encoded word vectors.
Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions
- Computer Science, EMNLP
- 2011
A novel machine learning framework based on recursive autoencoders for sentence-level prediction of sentiment label distributions that outperforms other state-of-the-art approaches on commonly used datasets, without using any pre-defined sentiment lexica or polarity shifting rules.
Translations as Additional Contexts for Sentence Classification
- Computer Science, IJCAI
- 2018
This work is the first to use translations as domain-free contexts for sentence classification, and presents multiple context fixing attachment (MCFA), a series of modules attached to multiple sentence vectors to fix the noise in the vectors using the other sentence vectors as context.
Capturing User and Product Information for Document Level Sentiment Analysis with Deep Memory Network
- Computer Science, EMNLP
- 2017
A deep memory network is proposed for document-level sentiment classification which could capture the user and product information at the same time and can achieve better performance than several existing methods.
Parallel Multi-feature Attention on Neural Sentiment Classification
- Computer Science
- 2017
A novel Parallel Multi-feature Attention (PMA) neural network which concentrates on fine-grained information between user- and product-level content features and uses multiple features, including the user's ranking preference, to improve the performance of sentiment classification.