Neural semi-Markov CRF for Monolingual Word Alignment

Wuwei Lan, Chao Jiang, Wei Xu
Monolingual word alignment is important for studying fine-grained editing operations (i.e., deletion, addition, and substitution) in text-to-text generation tasks such as paraphrase generation, text simplification, and neutralizing biased language. In this paper, we present a novel neural semi-Markov CRF alignment model, which unifies word and phrase alignments through variable-length spans. We also create a new benchmark with human annotations that cover four different text genres to…

Importance of Data and Controllability in Neural Text Simplification
This paper proposes a novel hybrid approach that leverages linguistically motivated rules for splitting and deletion and couples them with a neural paraphrasing model to produce varied rewriting styles. It also introduces a new data augmentation method to improve the model's paraphrasing capability.
Towards Explainable Evaluation Metrics for Natural Language Generation
This concept paper identifies key properties and proposes key goals of explainable machine translation evaluation metrics, and provides a vision of future approaches to explainable evaluation metrics and their evaluation.


Neural CRF Model for Sentence Alignment in Text Simplification
A novel neural CRF alignment model is proposed which not only leverages the sequential nature of sentences in parallel documents but also utilizes a neural sentence pair model to capture semantic similarity.
Gappy Phrasal Alignment By Agreement
A principled and efficient phrase-to-phrase alignment model, useful in machine translation as well as other related natural language processing problems, that shows substantial improvements in both alignment quality and translation quality over word-based Hidden Markov Models, while maintaining asymptotically equivalent runtime.
A Discriminative Neural Model for Cross-Lingual Word Alignment
A novel discriminative word alignment model is introduced and integrated into a Transformer-based machine translation model. Its performance is evaluated intrinsically on both English-Chinese and English-Arabic alignment, where it achieves major improvements over unsupervised baselines.
A Supervised Word Alignment Method Based on Cross-Language Span Prediction Using Multilingual BERT
The proposed method significantly outperformed previous supervised and unsupervised word alignment methods without using any bitexts for pretraining and greatly improved the word alignment accuracy by adding the context of the token to the question.
Word Alignment Modeling with Context Dependent Deep Neural Network
A novel bilingual word alignment approach based on a deep neural network (DNN), which outperforms the HMM and IBM Model 4 baselines by 2 points in F-score and generates a very compact model with far fewer parameters.
Semi-Markov Phrase-Based Monolingual Alignment
We introduce a novel discriminative model for phrase-based monolingual alignment using a semi-Markov CRF. Our model achieves state-of-the-art alignment accuracy on two phrase-based alignment datasets
Back to Basics for Monolingual Alignment: Exploiting Word Similarity and Contextual Evidence
We present a simple, easy-to-replicate monolingual aligner that demonstrates state-of-the-art performance while relying on almost no supervision and a very small number of external resources. Based
Discriminative Word Alignment with Conditional Random Fields
A novel approach for inducing word alignments from sentence aligned data using a Conditional Random Field, a discriminative model, which is estimated on a small supervised training set, and which has efficient training and decoding processes which both find globally optimal solutions.
A Phrase-Based Alignment Model for Natural Language Inference
The MANLI system is presented, a new NLI aligner designed to address the alignment problem, which uses a phrase-based alignment representation, exploits external lexical resources, and capitalizes on a new set of supervised training data.
A Joint Phrasal and Dependency Model for Paraphrase Alignment
A new model for monolingual alignment is presented in which the score of an alignment decomposes over both the set of aligned phrases and a set of aligned dependency arcs.