Compression and machine learning: a new perspective on feature space vectors

@inproceedings{Sculley2006CompressionAM,
  title={Compression and machine learning: a new perspective on feature space vectors},
  author={D. Sculley and C. Brodley},
  booktitle={Data Compression Conference (DCC'06)},
  year={2006},
  pages={332--341}
}
  • D. Sculley, C. Brodley
  • Published 2006
  • Computer Science
  • Data Compression Conference (DCC'06)
  • The use of compression algorithms in machine learning tasks such as clustering and classification has appeared in a variety of fields, sometimes with the promise of reducing problems of explicit feature selection. [...] To underscore this point, we find theoretical and empirical connections between traditional machine learning vector models and compression, encouraging cross-fertilization in future work.
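
To make the idea of using a compressor for classification concrete, here is a minimal sketch of one widely used compression-based similarity, the normalized compression distance (NCD). It is an illustration only, not necessarily the formulation analyzed in the paper: `zlib` is assumed as a stand-in compressor, and the names `compressed_size`, `ncd`, and `classify` are hypothetical helpers introduced for this example.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Values near 0 suggest the inputs share structure the compressor
    can exploit; values near 1 suggest they share little.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def classify(sample: bytes, labeled: dict) -> str:
    """Return the label whose reference text is closest to `sample` under NCD."""
    return min(labeled, key=lambda label: ncd(sample, labeled[label]))
```

No feature vector is built explicitly here; the compressor's internal model of repeated substrings plays that role, which is the kind of implicit feature space the abstract alludes to.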
    109 Citations
    • An investigation of implicit features in compression-based learning for comparing webpages (1 citation; highly influenced)
    • Text Mining Using Data Compression Models (3 citations)
    • Compression-Based Data Mining (9 citations)
    • Compressive Feature Learning (14 citations)
    • An Efficient Algorithm for Large Scale Compressive Feature Learning (12 citations)
    • Text Classification Using Compression-Based Dissimilarity Measures (15 citations)
    • Text Classification with Compression Algorithms (2 citations)
    • Verification based on Compression-Models
    • PyLZJD: An Easy to Use Tool for Machine Learning (3 citations)
    • Construction of Efficient V-Gram Dictionary for Sequential Data Analysis (1 citation; highly influenced)
