Corpus ID: 212725845

TF-IDFC-RF: A Novel Supervised Term Weighting Scheme

@article{Carvalho2020TFIDFCRFAN,
  title={TF-IDFC-RF: A Novel Supervised Term Weighting Scheme},
  author={Flavio Carvalho and Gustavo Paiva Guedes},
  journal={ArXiv},
  year={2020},
  volume={abs/2003.07193}
}
Sentiment Analysis is a branch of Affective Computing usually considered a binary classification task. In this line of reasoning, Sentiment Analysis can be applied in several contexts to classify the attitude expressed in text samples, for example, movie reviews, sarcasm, among others. A common approach to represent text samples is the use of the Vector Space Model to compute numerical feature vectors consisting of the weight of terms. The most popular term weighting scheme is TF-IDF (Term…
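As a rough illustration of the Vector Space Model mentioned in the abstract, the sketch below builds TF-IDF feature vectors for a toy corpus. It is a minimal example of plain, unsupervised TF-IDF only, not the TF-IDFC-RF scheme the paper proposes. The function name `tf_idf_vectors`, the whitespace tokenization, the raw-count term frequency, and the `log(N / df)` form of idf are all assumptions made here for illustration; published variants differ in smoothing and normalization.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return (vocabulary, vectors) with weight = raw tf * log(N / df).

    Hypothetical helper for illustration; tokenization is a plain
    lowercase whitespace split.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocabulary = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # missing terms count as 0
        vectors.append(
            [tf[term] * math.log(n_docs / df[term]) for term in vocabulary]
        )
    return vocabulary, vectors


# Toy corpus (made up for this example).
docs = ["great movie great cast", "boring movie", "great fun"]
vocabulary, vectors = tf_idf_vectors(docs)
```

Each document becomes one vector with a position per vocabulary term; terms that occur in many documents receive a smaller idf factor and therefore a lower weight.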
A Study on Text Classification: Term Weighting Algorithm Analysis
With the advancement of digital recording and storing technology, plus the huge growth of the World Wide Web, people nowadays use digital texts instead of paper to write and record. In order to realize…
Text based personality prediction from multiple social media data sources using pre-trained language model and model averaging
TLDR
A new prediction approach using a multi-model deep learning architecture combined with multiple pre-trained language models, such as BERT, RoBERTa, and XLNet, as feature extraction methods on social media data sources, producing a predictive model for each trait from bidirectional context features combined with the extraction method.

References

A Comparison of Term Weighting Schemes for Text Classification and Sentiment Analysis with a Supervised Variant of tf.idf
TLDR
A supervised variant of the well-known tf.idf scheme is proposed, where the idf factor is computed without considering documents within the category under analysis, so that terms frequently appearing only within it are not penalized.
A study of supervised term weighting scheme for sentiment analysis
TLDR
This scheme uses ITD and ITS to measure the importance of a term in sentiment analysis, improving the performance of the analysis and outperforming state-of-the-art unsupervised approaches.
A Supervised Term Weighting Scheme for Multi-class Text Categorization
TLDR
This paper proposed a new supervised term weighting scheme for multi-class text categorization that has the best result in classification accuracy compared with other existing methods and has a built-in property to prevent over-weighting in STW.
Turning from TF-IDF to TF-IGM for term weighting in text classification
TLDR
Experimental results show that TF-IGM outperforms the well-known TF-IDF and the state-of-the-art supervised term weighting schemes; some new findings that differ from previous studies are obtained and analyzed in depth in the paper.
On Term Frequency Factor in Supervised Term Weighting Schemes for Text Classification
The performance of text classification can be affected by the choice of appropriate term weighting scheme as well as other parameters. The terminology supervised term weighting scheme has become…
Improved inverse gravity moment term weighting for text classification
TLDR
Two novel term weighting schemes, namely SQRT_TF-IGMimp and TF-IGMimp, derived from the standard inverse gravity moment formula, are proposed to improve the weighting behavior of the existing TF-IGM scheme, especially for some extreme cases.
A Study on Term Weighting for Text Categorization: A Novel Supervised Variant of tf.idf
TLDR
A supervised variant of the tf.idf scheme is proposed, based on computing the usual idf factor without considering documents of the category to be recognized, so that importance of terms appearing only within it is not underestimated.
Supervised and Traditional Term Weighting Methods for Automatic Text Categorization
  • Man Lan, C. Tan, J. Su, Yue Lu
  • Computer Science, Medicine
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • 2009
TLDR
This study investigates several widely-used unsupervised and supervised term weighting methods on benchmark data collections in combination with SVM and kNN algorithms and proposes a new simple supervised term weighting method, tf.rf, to improve the terms' discriminating power for the text categorization task.
Modified frequency-based term weighting schemes for text classification
With the rapid growth of textual content on the Internet, automatic text categorization is a comparatively more effective solution in information organization and knowledge management. Feature…
Supervised term weighting for automated text categorization
TLDR
It is proposed that learning from training data should also affect phase (ii), i.e. that information on the membership of training documents to categories be used to determine term weights; this is called supervised term weighting (STW).