NeuronBlocks: Building Your NLP DNN Models Like Playing Lego

Ming Gong, Linjun Shou, Wutao Lin, Zhijie Sang, Quanjia Yan, Ze Yang, Daxin Jiang
Deep Neural Networks (DNNs) have been widely employed in industry to address various Natural Language Processing (NLP) tasks. The NeuronBlocks toolkit empowers engineers to build, train, and test various NLP models through simple configuration of JSON files. Experiments on several NLP datasets such as GLUE, WikiQA and CoNLL-2003 demonstrate the effectiveness of NeuronBlocks.
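
The idea of assembling a model from a JSON description can be illustrated with a minimal sketch. Everything below is hypothetical: the block names, the "architecture", "layer" and "conf" keys, and the build_pipeline helper are assumptions made for illustration only and are not the actual NeuronBlocks schema or API. The sketch only shows the general pattern of looking up named blocks in a registry and chaining them in the order the configuration lists them.

```python
import json

# Hypothetical block registry. Each entry maps a block name to a factory that
# takes that block's parameters and returns a callable. These names are
# illustrative only and do not come from NeuronBlocks.
BLOCKS = {
    "Lowercase": lambda params: (lambda tokens: [t.lower() for t in tokens]),
    "Truncate": lambda params: (lambda tokens: tokens[: params["max_len"]]),
    "Join": lambda params: (lambda tokens: params["sep"].join(tokens)),
}

# Hypothetical JSON configuration: an ordered list of blocks and their settings.
CONFIG = """
{
  "architecture": [
    {"layer": "Lowercase"},
    {"layer": "Truncate", "conf": {"max_len": 3}},
    {"layer": "Join", "conf": {"sep": "_"}}
  ]
}
"""


def build_pipeline(config_text):
    """Turn a JSON description into a callable that applies each block in order."""
    spec = json.loads(config_text)
    layers = []
    for entry in spec["architecture"]:
        name = entry["layer"]
        if name not in BLOCKS:
            raise ValueError(f"unknown block: {name}")
        # Blocks without a "conf" entry receive an empty parameter dict.
        layers.append(BLOCKS[name](entry.get("conf", {})))

    def run(inputs):
        for layer in layers:
            inputs = layer(inputs)
        return inputs

    return run


pipeline = build_pipeline(CONFIG)
result = pipeline(["Deep", "Neural", "Networks", "for", "NLP"])
print(result)
```

In this sketch, changing the behaviour means editing the JSON text rather than the Python code, which is the property the abstract attributes to configuration-driven toolkits.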

AutoNLU: An On-demand Cloud-based Natural Language Understanding System for Enterprises
Introduces AutoNLU, an on-demand cloud-based system with an easy-to-use interface that covers all common use cases and steps in developing an NLU model and achieves state-of-the-art results on two public benchmarks.
NeuralVis: Visualizing and Interpreting Deep Learning Models
Presents NeuralVis, an instance-based visualization tool for DNNs that supports software engineers in visualizing and interpreting deep learning models and can assist them in identifying the critical features that determine prediction results.
XML2NN: A Unified Modeling Method Accelerated by Distributed Training with an XML File
This paper introduces XML2NN, a unified modeling method for the computer vision (CV) field with distributed training functions based on multiple deep learning frameworks, which enables easy model construction and quick cross-framework model sharing.
The Machine Learning Bazaar: Harnessing the ML Ecosystem for Effective System Development
Introduces the Machine Learning Bazaar, a new framework for developing machine learning and automated machine learning software systems that provides solutions for a variety of data modalities and problem types and pairs its pipelines with a hierarchy of AutoML strategies: Bayesian optimization and bandit learning.


BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
A new language representation model, BERT, designed to pre-train deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers, which can be fine-tuned with just one additional output layer to create state-of-the-art models for a wide range of tasks.
Attention Is All You Need
Proposes the Transformer, a new simple network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely, which generalizes well to other tasks, as shown by its successful application to English constituency parsing with both large and limited training data.
AllenNLP: A Deep Semantic Natural Language Processing Platform
Describes AllenNLP, a library for applying deep learning methods to NLP research that offers easy-to-use command-line tools, declarative configuration-driven experiments, and modular NLP abstractions.
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
A benchmark of nine diverse NLU tasks, an auxiliary dataset for probing models for understanding of specific linguistic phenomena, and an online platform for evaluating and comparing models, which favors models that can represent linguistic knowledge in a way that facilitates sample-efficient learning and effective knowledge-transfer across tasks.
Get To The Point: Summarization with Pointer-Generator Networks
A novel architecture that augments the standard sequence-to-sequence attentional model in two orthogonal ways, using a hybrid pointer-generator network that can copy words from the source text via pointing, which aids accurate reproduction of information, while retaining the ability to produce novel words through the generator.
Named Entity Recognition with Bidirectional LSTM-CNNs
A novel neural network architecture is presented that automatically detects word- and character-level features using a hybrid bidirectional LSTM and CNN architecture, eliminating the need for most feature engineering.
Distilling the Knowledge in a Neural Network
This work shows that it can significantly improve the acoustic model of a heavily used commercial system by distilling the knowledge in an ensemble of models into a single model and introduces a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse.
Improving Language Understanding by Generative Pre-Training
The general task-agnostic model outperforms discriminatively trained models that use architectures specifically crafted for each task, improving upon the state of the art in 9 out of the 12 tasks studied.
FusionNet: Fusing via Fully-Aware Attention with Application to Machine Comprehension
This paper introduces a new neural structure called FusionNet, which extends existing attention approaches from three perspectives. First, it puts forward a novel concept of "history of word" to
Bidirectional Attention Flow for Machine Comprehension
Introduces the BIDAF network, a multi-stage hierarchical process that represents the context at different levels of granularity and uses a bi-directional attention flow mechanism to obtain a query-aware context representation without early summarization.