Corpus ID: 108299957

@article{Zhelezniak2019DontSF,
  title={Don't Settle for Average, Go for the Max: Fuzzy Sets and Max-Pooled Word Vectors},
  author={V. Zhelezniak and A. Savkov and April Shen and Francesco Moramarco and Jack Flann and N. Hammerla},
  journal={ArXiv},
  year={2019},
  volume={abs/1904.13264},
  abstract={Recent literature suggests that averaged word vectors followed by simple post-processing outperform many deep learning methods on semantic textual similarity tasks. Furthermore, when averaged word vectors are trained supervised on large corpora of paraphrases, they achieve state-of-the-art results on standard STS benchmarks. Inspired by these insights, we push the limits of word embeddings even further. We propose a novel fuzzy bag-of-words (FBoW) representation for text that contains all the…}
}
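The abstract contrasts averaged word vectors with max-pooled word vectors as sentence representations. The following is a minimal sketch of that contrast only, not the paper's full FBoW construction (which derives fuzzy membership degrees from a word-similarity matrix); the toy 3-dimensional vectors and sentence pairs are illustrative assumptions, whereas real setups would use pretrained embeddings such as GloVe or fastText.

```python
import numpy as np

# Toy word vectors (assumption for illustration; not from the paper).
word_vectors = {
    "cat": np.array([0.9, 0.1, 0.0]),
    "sat": np.array([0.2, 0.8, 0.1]),
    "mat": np.array([0.7, 0.2, 0.1]),
}

def avg_pool(words):
    """Averaged word vectors: the common bag-of-words baseline."""
    return np.mean([word_vectors[w] for w in words], axis=0)

def max_pool(words):
    """Max-pooled word vectors: element-wise max over the word set."""
    return np.max([word_vectors[w] for w in words], axis=0)

def cosine(u, v):
    """Cosine similarity, as typically used on STS benchmarks."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

s1, s2 = ["cat", "sat"], ["cat", "mat"]
sim_avg = cosine(avg_pool(s1), avg_pool(s2))
sim_max = cosine(max_pool(s1), max_pool(s2))
```

Both poolings map a variable-length word sequence to a single fixed-size vector; the difference is that averaging blends all coordinates, while max-pooling keeps, per dimension, the strongest activation across the words.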