Quasi-orthonormal Encoding for Machine Learning Applications

@article{Lu2020QuasiorthonormalEF,
  title={Quasi-orthonormal Encoding for Machine Learning Applications},
  author={Haw-minn Lu},
  journal={ArXiv},
  year={2020},
  volume={abs/2006.00038}
}
Most machine learning models, especially artificial neural networks, require numerical, not categorical data. We briefly describe the advantages and disadvantages of common encoding schemes. For example, one-hot encoding is commonly used for attributes with a few unrelated categories and word embeddings for attributes with many related categories (e.g., words). Neither is suitable for encoding attributes with many unrelated categories, such as diagnosis codes in healthcare applications…
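The contrast the abstract draws can be sketched in code. The snippet below shows plain one-hot encoding next to an illustrative quasi-orthonormal encoding built from random near-orthogonal unit vectors; this random construction is a stand-in for exposition, not necessarily the paper's exact method (which draws on spherical codes), and all names and dimensions are assumptions:

```python
import numpy as np

def one_hot(categories):
    """One-hot encoding: one dimension per category.
    Works well for a few unrelated categories, but the
    dimension grows linearly with the number of categories."""
    index = {c: i for i, c in enumerate(categories)}
    def encode(c):
        v = np.zeros(len(categories))
        v[index[c]] = 1.0
        return v
    return encode

def quasi_orthonormal(categories, dim, seed=0):
    """Illustrative quasi-orthonormal encoding (an assumption for
    exposition): random unit vectors in a lower-dimensional space are
    nearly orthogonal with high probability, so many unrelated
    categories can share far fewer dimensions than one-hot requires."""
    rng = np.random.default_rng(seed)
    vecs = rng.standard_normal((len(categories), dim))
    vecs /= np.linalg.norm(vecs, axis=1, keepdims=True)  # unit norm
    index = {c: i for i, c in enumerate(categories)}
    return lambda c: vecs[index[c]]
```

For example, 100 diagnosis-code-like categories can be mapped into a 16-dimensional space, versus the 100 dimensions one-hot would need; the trade-off is that category vectors are only approximately, not exactly, orthogonal.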
