Reducing the Dimensionality of Bag-of-Words Text Representation Used by Learning Algorithms

@inproceedings{Martins2003ReducingTD,
  title={Reducing the Dimensionality of Bag-of-Words Text Representation Used by Learning Algorithms},
  author={Claudia Aparecida Martins},
  year={2003}
}
The attribute-value representation of documents used in Text Mining provides a natural framework for classifying or clustering documents based on their content. Supervised learning algorithms can be applied when the documents have preassigned labels, and unsupervised learning algorithms when the documents are unlabeled. This representation is characterized by very high-dimensional data, since every word in a document may be treated as an attribute. However, the representation of…
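
To make the dimensionality point concrete, here is a minimal sketch of an attribute-value (bag-of-words) table in plain Python. The toy documents, the function names `bag_of_words` and `drop_rare_words`, and the document-frequency cutoff are all assumptions made for illustration; the abstract is truncated, so this is not presented as the reduction method used in the paper.

```python
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, rows); each row holds one count per vocabulary word."""
    tokenized = [doc.lower().split() for doc in docs]
    # Every distinct word becomes one attribute (column).
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([counts[word] for word in vocabulary])
    return vocabulary, rows


def drop_rare_words(vocabulary, rows, min_docs=2):
    """Hypothetical filter: keep words that occur in at least min_docs documents."""
    keep = [
        j for j in range(len(vocabulary))
        if sum(1 for row in rows if row[j] > 0) >= min_docs
    ]
    reduced_vocab = [vocabulary[j] for j in keep]
    reduced_rows = [[row[j] for j in keep] for row in rows]
    return reduced_vocab, reduced_rows


# Toy corpus (assumed, not from the paper).
docs = [
    "text mining classifies documents",
    "text mining clusters documents",
    "learning algorithms use labels",
]
vocabulary, rows = bag_of_words(docs)
reduced_vocab, reduced_rows = drop_rare_words(vocabulary, rows, min_docs=2)
print(len(vocabulary), len(reduced_vocab))
```

Even with three short documents the table has one column per distinct word, which is why the number of attributes grows quickly on real collections. The frequency cutoff is only one simple way to shrink it, and it can leave a document with no remaining attributes, as happens to the third toy document here.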

Citations

Publications citing this paper.
Showing 1-8 of 8 extracted citations

Dimensionality Reduction Using Semantic Analysis

Enas Sedki, Abdelfattah Alzaqah, Arafat Awajan
2015

Mining similar radiology reports using BoW and Fuzzy C-means clustering

2017 International Artificial Intelligence and Data Processing Symposium (IDAP) • 2017

A Comparison Between Supervised and Unsupervised Models for Identify a Large Number of Categories

2016 IEEE 17th International Conference on Information Reuse and Integration (IRI) • 2016

Personification of Bag-of-Features Dataset for Real Time Activity Recognition

2016 3rd International Conference on Soft Computing & Machine Intelligence (ISCMI) • 2016

Hyperspectral image analysis based on BoSW model for rice panicle blast grading

Computers and Electronics in Agriculture • 2015

Vulnerability identification and classification via text mining bug databases

IECON 2014 - 40th Annual Conference of the IEEE Industrial Electronics Society • 2014

References

Publications referenced by this paper.
Showing 1-10 of 20 references

Architecture and implementation description of the computational environment DISCOVER LEARNING ENVIRONMENT - DLE

G. E. A. P. A. Batista, M. C. Monard
Technical Report 187 • 2003

Data preprocessing in supervised machine learning

G. E. A. P. A. Batista
PhD Thesis, ICMC-USP (in Portuguese) • 2003

The design and use of the PreTexT tool for text processing

E. T. Matsubara, C. A. Martins, M. C. Monard
2003

The integration framework of the DISCOVER system

R. C. Prati
Master Thesis, ICMC-USP (in Portuguese) • 2003

Machine learning in automated text categorisation

F. Sebastiani
ACM Computing Surveys • 2002
