Syntree2Vec - An algorithm to augment syntactic hierarchy into word embeddings
@article{Bhardwaj2018Syntree2VecA,
  title   = {Syntree2Vec - An algorithm to augment syntactic hierarchy into word embeddings},
  author  = {Shubham Bhardwaj},
  journal = {ArXiv},
  year    = {2018},
  volume  = {abs/1808.05907}
}
Word embeddings aim to map the sense of words into a lower-dimensional vector space in order to reason over them. Training embeddings on domain-specific data helps express concepts more relevant to the use case, but comes at a cost in accuracy when data is scarce. Our effort is to minimise this loss by infusing syntactic knowledge into the embeddings. We propose a graph-based embedding algorithm inspired by node2vec. Experimental results have shown that our algorithm improves the syntactic…
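The abstract is truncated here and does not spell out the procedure, so the following is only a rough, hypothetical sketch of the general idea it describes: link words by dependency-parse edges into a graph and sample node2vec-style random walks over it to produce training contexts for a skip-gram model. The edges, function names, and parameters below are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import defaultdict

# Hand-written (head, dependent) edges for "the quick fox jumps over the lazy dog";
# in practice these would come from a dependency parser.
edges = [
    ("jumps", "fox"), ("fox", "quick"), ("fox", "the"),
    ("jumps", "over"), ("over", "dog"), ("dog", "lazy"), ("dog", "the"),
]

# Undirected adjacency list: syntactically related words become graph neighbours.
graph = defaultdict(list)
for head, dep in edges:
    graph[head].append(dep)
    graph[dep].append(head)

def random_walk(start, length=6):
    """Sample a uniform random walk; node2vec biases this step with p/q parameters."""
    walk = [start]
    for _ in range(length - 1):
        walk.append(random.choice(graph[walk[-1]]))
    return walk

# The walks play the role of "sentences" for a downstream skip-gram trainer.
walks = [random_walk(word) for word in graph for _ in range(3)]
print(walks[0])
```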
One Citation
An Unsupervised Approach to Structuring and Analyzing Repetitive Semantic Structures in Free Text of Electronic Medical Records
- Computer Science · Journal of Personalized Medicine
- 2022
This work presents an unsupervised approach to medical data annotation and shows on a validation dataset that the proposed labeling method generates meaningful labels correctly for 92.7% of groups.
References
Showing 1-10 of 14 references
Modeling Order in Neural Word Embeddings at Scale
- Computer Science · ICML
- 2015
A new neural language model incorporating both word order and character order in its embedding is proposed, which produces several vector spaces with meaningful substructure, as evidenced by its performance on a recent word-analogy task.
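For context, the word-analogy task mentioned above is usually scored by vector arithmetic plus cosine similarity. A minimal sketch with made-up 3-dimensional embeddings (the vectors are purely illustrative):

```python
import numpy as np

# Hypothetical embeddings, chosen by hand for illustration only.
emb = {
    "king":  np.array([0.8, 0.6, 0.1]),
    "man":   np.array([0.7, 0.1, 0.1]),
    "woman": np.array([0.7, 0.1, 0.8]),
    "queen": np.array([0.8, 0.6, 0.8]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# "king" - "man" + "woman" should land nearest to "queen".
target = emb["king"] - emb["man"] + emb["woman"]
best = max((w for w in emb if w not in {"king", "man", "woman"}),
           key=lambda w: cosine(emb[w], target))
print(best)  # expected: queen
```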
How to evaluate word embeddings? On importance of data efficiency and simple supervised tasks
- Computer Science · ArXiv
- 2017
It is proposed that word representation evaluation should focus on data efficiency and simple supervised tasks, where the amount of available data is varied and the scores of a supervised model are reported for each subset (as is commonly done in transfer learning).
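A minimal sketch of that protocol, assuming scikit-learn is available and using random stand-in features in place of real embedding-derived features:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Stand-in "embedding" features and a label derived from them; a real study would
# embed a labelled corpus with the word vectors under evaluation.
X = rng.normal(size=(1000, 50))
y = (X[:, 0] > 0).astype(int)

# Data-efficiency curve: report the supervised score at several training-set sizes.
for n in (50, 100, 250, 500, 1000):
    clf = LogisticRegression(max_iter=1000).fit(X[:n], y[:n])
    print(n, round(clf.score(X, y), 3))
```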
Two/Too Simple Adaptations of Word2Vec for Syntax Problems
- Computer Science · NAACL
- 2015
We present two simple modifications to the models in the popular Word2Vec tool, in order to generate embeddings more suited to tasks involving syntax. The main issue with the original models is the…
Evaluation methods for unsupervised word embeddings
- Computer Science · EMNLP
- 2015
A comprehensive study of evaluation methods for unsupervised embedding techniques that obtain meaningful representations of words from text, calling into question the common assumption that there is one single optimal vector representation.
SyntaxNet Models for the CoNLL 2017 Shared Task
- Computer Science · ArXiv
- 2017
A baseline dependency parsing system for the CoNLL 2017 Shared Task, called "ParseySaurus," which uses the DRAGNN framework to combine transition-based recurrent parsing and tagging with character-based word representations.
Distributed Representations of Words and Phrases and their Compositionality
- Computer Science · NIPS
- 2013
This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
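The negative-sampling objective summarized above replaces the full softmax with a handful of binary discriminations against sampled "noise" words. A minimal sketch of the per-pair loss, assuming the negative vectors have already been drawn (names and dimensions are placeholders):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def negative_sampling_loss(v_in, v_out, v_negs):
    """-( log sigma(v_out . v_in) + sum_k log sigma(-v_neg_k . v_in) )."""
    pos = np.log(sigmoid(v_out @ v_in))
    neg = sum(np.log(sigmoid(-v_k @ v_in)) for v_k in v_negs)
    return -(pos + neg)

rng = np.random.default_rng(0)
v_in, v_out = rng.normal(size=50), rng.normal(size=50)
negatives = [rng.normal(size=50) for _ in range(5)]
print(negative_sampling_loss(v_in, v_out, negatives))
```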
node2vec: Scalable Feature Learning for Networks
- Computer Science · KDD
- 2016
In node2vec, an algorithmic framework for learning continuous feature representations for nodes in networks, a flexible notion of a node's network neighborhood is defined and a biased random walk procedure is designed, which efficiently explores diverse neighborhoods.
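The biased walk described above interpolates between breadth-first and depth-first exploration using a return parameter p and an in-out parameter q. A compact sketch of the unnormalised transition weights on a toy graph (graph and parameter values are illustrative):

```python
import random

def biased_step(graph, prev, curr, p=1.0, q=2.0):
    """Pick the next node with node2vec's unnormalised weights:
    1/p to return to `prev`, 1 for common neighbours of `prev`, 1/q otherwise."""
    neighbours = graph[curr]
    weights = []
    for nxt in neighbours:
        if nxt == prev:
            weights.append(1.0 / p)
        elif nxt in graph[prev]:
            weights.append(1.0)
        else:
            weights.append(1.0 / q)
    return random.choices(neighbours, weights=weights, k=1)[0]

# Tiny example graph as adjacency lists.
graph = {"a": ["b", "c"], "b": ["a", "c", "d"], "c": ["a", "b"], "d": ["b"]}
walk = ["a", "b"]
for _ in range(4):
    walk.append(biased_step(graph, walk[-2], walk[-1]))
print(walk)
```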
A Neural Probabilistic Language Model
- Computer Science · J. Mach. Learn. Res.
- 2000
This work proposes to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences.
Evaluating Generative Models for Text Generation
- Computer Science
- 2017
This work aims to extend the evaluation presented for the SeqGAN model in Yu et al. (2016) using two additional datasets and an additional perplexity evaluation metric.
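For reference, perplexity is just the exponential of the negative mean token log-likelihood; a tiny sketch with made-up token probabilities:

```python
import math

def perplexity(log_probs):
    """exp of the negative mean per-token log-likelihood (natural log)."""
    return math.exp(-sum(log_probs) / len(log_probs))

print(perplexity([math.log(0.2), math.log(0.5), math.log(0.1)]))
```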
Character-level Convolutional Networks for Text Classification
- Computer Science · NIPS
- 2015
This article constructs several large-scale datasets to show that character-level convolutional networks can achieve state-of-the-art or competitive results in text classification.