Parallelizing Word2Vec in Shared and Distributed Memory

  • Shihao Ji, Nadathur Satish, Sheng Li, P. Dubey
  • Published 2019
  • Computer Science, Mathematics
  • IEEE Transactions on Parallel and Distributed Systems
  • Word2vec is a widely used algorithm for extracting low-dimensional vector representations of words. [...] We also explore different techniques to distribute word2vec computation across nodes in a computer cluster, and demonstrate good strong scalability up to 32 nodes. The new algorithm is particularly suitable for modern multi-core/many-core architectures, especially Intel's latest Knights Landing processors, and allows us to scale up the computation near linearly across cores and nodes, and process…