Text Classification Using Compression-Based Dissimilarity Measures

Arguably, the most di±cult task in text classi ̄cation is to choose an appropriate set of features that allows machine learning algorithms to provide accurate classi ̄cation. Most state-of-the-art techniques for this task involve careful feature engineering and a pre-processing stage, which may be too expensive in the emerging context of massive collections… CONTINUE READING

6 Figures & Tables



Citations per Year

Citation Velocity: 5

Averaging 5 citations per year over the last 3 years.

Learn more about how we calculate this metric in our FAQ.