Corpus ID: 237454642

ELIT: Emory Language and Information Toolkit

@article{He2021ELITEL,
  title={ELIT: Emory Language and Information Toolkit},
  author={Han He and Liyan Xu and Jinho D. Choi},
  journal={ArXiv},
  year={2021},
  volume={abs/2109.03903}
}
We introduce ELIT, the Emory Language and Information Toolkit, which is a comprehensive NLP framework providing transformer-based end-to-end models for core tasks with a special focus on memory efficiency while maintaining state-of-the-art accuracy and speed. Compared to existing toolkits, ELIT features an efficient Multi-Task Learning (MTL) model with many downstream tasks that include lemmatization, part-of-speech tagging, named entity recognition, dependency parsing, constituency parsing… 

Online Coreference Resolution for Dialogue Processing: Improving Mention-Linking on Real-Time Conversations
This paper suggests a direction of coreference resolution for online decoding on actively generated input such as dialogue, where the model accepts an utterance and its past context, then finds

References

Tokenizing, POS Tagging, Lemmatizing and Parsing UD 2.0 with UDPipe
We present an update to UDPipe 1.0 (Straka et al., 2016), a trainable pipeline which performs sentence segmentation, tokenization, POS tagging, lemmatization and dependency parsing. We provide
Stanza: A Python Natural Language Processing Toolkit for Many Human Languages
This work introduces Stanza, an open-source Python natural language processing toolkit supporting 66 human languages that features a language-agnostic fully neural pipeline for text analysis, including tokenization, multi-word token expansion, lemmatization, part-of-speech and morphological feature tagging, dependency parsing, and named entity recognition.
FLAIR: An Easy-to-Use Framework for State-of-the-Art NLP
The core idea of the FLAIR framework is to present a simple, unified interface for conceptually very different types of word and document embeddings, which effectively hides all embedding-specific engineering complexity and allows researchers to “mix and match” various embeddings with little effort.
75 Languages, 1 Model: Parsing Universal Dependencies Universally
It is found that fine-tuning a multilingual BERT self-attention model pretrained on 104 languages can meet or exceed state-of-the-art UPOS, UFeats, Lemmas, (and especially) UAS, and LAS scores, without requiring any recurrent or language-specific components.
Fast and Accurate Neural CRF Constituency Parsing
This work proposes a fast and accurate neural CRF constituency parser that batchifies the inside algorithm for loss computation through direct large tensor operations on GPU, while avoiding the outside algorithm for gradient computation via efficient back-propagation.
Deep Biaffine Attention for Neural Dependency Parsing
This paper uses a larger but more thoroughly regularized parser than other recent BiLSTM-based approaches, with biaffine classifiers to predict arcs and labels, and shows which hyperparameter choices had a significant effect on parsing accuracy, allowing it to achieve large gains over other graph-based approaches.
AllenNLP: A Deep Semantic Natural Language Processing Platform
This paper describes AllenNLP, a library for applying deep learning methods to NLP research that provides easy-to-use command-line tools, declarative configuration-driven experiments, and modular NLP abstractions.
Named Entity Recognition as Dependency Parsing
This work uses ideas from graph-based dependency parsing to give the model a global view of the input via a biaffine model, and shows that the model works well for both nested and flat NER through evaluation on 8 corpora, achieving SoTA performance on all of them.
A Unified Generative Framework for Various NER Subtasks
This work proposes to formulate the NER subtasks as an entity span sequence generation task, which can be solved by a unified sequence-to-sequence (Seq2Seq) framework, and can leverage the pre-trained Seq2Seq model to solve all three kinds of NER subtasks without the special design of the tagging schema or ways to enumerate spans.
Dependency or Span, End-to-End Uniform Semantic Role Labeling
This paper presents an end-to-end model for both dependency and span SRL with a unified argument representation to deal with two different types of argument annotations in a uniform fashion and jointly predict all predicates and arguments.