Translating Embeddings for Modeling Multi-relational Data
- Antoine Bordes, Nicolas Usunier, Alberto García-Durán, J. Weston, Oksana Yakhnenko
- Computer Science, NIPS
- 5 December 2013
Proposes TransE, a method that models relationships as translations operating on the low-dimensional embeddings of the entities; extensive experiments show that TransE significantly outperforms state-of-the-art methods in link prediction on two knowledge bases.
Natural Language Processing (Almost) from Scratch
- Ronan Collobert, J. Weston, L. Bottou, Michael Karlen, K. Kavukcuoglu, P. Kuksa
- Computer Science, Journal of Machine Learning Research
- 1 February 2011
We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity…
Learning with Local and Global Consistency
- Dengyong Zhou, O. Bousquet, T. N. Lal, J. Weston, B. Schölkopf
- Computer Science, NIPS
- 9 December 2003
A principled approach to semi-supervised learning is to design a classifying function which is sufficiently smooth with respect to the intrinsic structure collectively revealed by known labeled and unlabeled points.
Gene Selection for Cancer Classification using Support Vector Machines
- I. Guyon, J. Weston, S. Barnhill, V. Vapnik
- Biology, Machine Learning
- 11 March 2002
This paper proposes a new method of gene selection utilizing Support Vector Machine methods based on Recursive Feature Elimination (RFE), and demonstrates experimentally that the genes selected yield better classification performance and are biologically relevant to cancer.
Fisher discriminant analysis with kernels
- S. Mika, Gunnar Rätsch, J. Weston, B. Schölkopf, K.-R. Müller
- Computer Science, Neural Networks for Signal Processing IX…
- 23 August 1999
A non-linear classification technique based on Fisher's discriminant that allows efficient computation of the Fisher discriminant in feature space; large-scale simulations demonstrate the competitiveness of this approach.
Reading Wikipedia to Answer Open-Domain Questions
- Danqi Chen, Adam Fisch, J. Weston, Antoine Bordes
- Computer Science, Annual Meeting of the Association for…
- 31 March 2017
This approach combines a search component based on bigram hashing and TF-IDF matching with a multi-layer recurrent neural network model trained to detect answers in Wikipedia paragraphs; experiments indicate that both modules are highly competitive with respect to existing counterparts.
Curriculum learning
- Yoshua Bengio, J. Louradour, Ronan Collobert, J. Weston
- Education, International Conference on Machine Learning
- 14 June 2009
It is hypothesized that curriculum learning affects both the speed of convergence of the training process to a minimum and the quality of the local minima obtained: curriculum learning can be seen as a particular form of continuation method (a general strategy for global optimization of non-convex functions).
A Neural Attention Model for Abstractive Sentence Summarization
- Alexander M. Rush, S. Chopra, J. Weston
- Computer Science, Conference on Empirical Methods in Natural…
- 2 September 2015
This work proposes a fully data-driven approach to abstractive sentence summarization by utilizing a local attention-based model that generates each word of the summary conditioned on the input sentence.
A unified architecture for natural language processing: deep neural networks with multitask learning
- Ronan Collobert, J. Weston
- Computer Science, International Conference on Machine Learning
- 5 July 2008
We describe a single convolutional neural network architecture that, given a sentence, outputs a host of language processing predictions: part-of-speech tags, chunks, named entity tags, semantic…
End-To-End Memory Networks
- Sainbayar Sukhbaatar, Arthur D. Szlam, J. Weston, R. Fergus
- Computer Science, NIPS
- 31 March 2015
A neural network with a recurrent attention model over a possibly large external memory that is trained end-to-end, and hence requires significantly less supervision during training, making it more generally applicable in realistic settings.