Introduction to information retrieval
This textbook teaches classical and web information retrieval, including web search and the related areas of text classification and text clustering, starting from basic concepts. It is well suited to introductory courses in information retrieval for advanced undergraduates and graduate students in computer science.
Foundations of statistical natural language processing
This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear and provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations.
Automatic Word Sense Discrimination
- Hinrich Schütze
- Computer Science
- Comput. Linguistics
- 1 March 1998
This paper presents context-group discrimination, a disambiguation algorithm based on clustering, and demonstrates its good performance on a sample of natural and artificial ambiguous words.
ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs
This work presents a general Attention-Based Convolutional Neural Network (ABCNN) for modeling a pair of sentences and proposes three attention schemes that integrate mutual influence between sentences into CNNs, so that the representation of each sentence takes its counterpart into consideration.
Exploiting Cloze-Questions for Few-Shot Text Classification and Natural Language Inference
This work introduces Pattern-Exploiting Training (PET), a semi-supervised training procedure that reformulates input examples as cloze-style phrases to help language models understand a given task.
Efficient Higher-Order CRFs for Morphological Tagging
This work presents an approximated conditional random field using coarse-to-fine decoding and early updating that yields fast and accurate morphological taggers across six languages with different morphological properties, and shows that, across languages, higher-order models give significant improvements over first-order models.
Comparative Study of CNN and RNN for Natural Language Processing
This work is the first systematic comparison of CNN and RNN on a wide range of representative NLP tasks, aiming to give basic guidance for DNN selection.
AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes
This work presents AutoExtend, a system to learn embeddings for synsets and lexemes that achieves state-of-the-art performance on word similarity and word sense disambiguation tasks.
It’s Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
This work shows that performance similar to GPT-3 can be obtained with language models that are much “greener” in that their parameter count is several orders of magnitude smaller, and identifies key factors required for successful natural language understanding with small language models.
Automatic Detection of Text Genre
A theory of genres as bundles of facets, which correlate with various surface cues, is proposed, and it is argued that genre detection based on surface cues is as successful as detection based on deeper structural properties.