DiscoFuse: A Large-Scale Dataset for Discourse-Based Sentence Fusion

@article{Geva2019DiscoFuseAL,
  title={DiscoFuse: A Large-Scale Dataset for Discourse-Based Sentence Fusion},
  author={Mor Geva and Eric Malmi and Idan Szpektor and Jonathan Berant},
  journal={ArXiv},
  year={2019},
  volume={abs/1902.10526}
}
Sentence fusion is the task of joining several independent sentences into a single coherent text. Current datasets for sentence fusion are small and insufficient for training modern neural models. In this paper, we propose a method for automatically generating fusion examples from raw text and present DISCOFUSE, a large-scale dataset for discourse-based sentence fusion. We author a set of rules for identifying a diverse set of discourse phenomena in raw text, and decomposing the text into two…
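
To make the idea of rule-based decomposition concrete, here is a minimal sketch of a single rule that splits a sentence at a discourse connective into two independent sentences. It is an illustration only, not the rule set from the paper: the connective list, the function name `split_on_connective`, and the comma-plus-connective pattern are all assumptions.

```python
import re

# Hypothetical connective list, for illustration only; not the paper's rule set.
CONNECTIVES = ("but", "because", "although")


def split_on_connective(text):
    """Split `text` at the first ', <connective> ' into two sentences.

    Returns (sentence_a, sentence_b, connective), or None if the rule
    does not apply to this text.
    """
    pattern = r",\s+(" + "|".join(CONNECTIVES) + r")\s+"
    match = re.search(pattern, text)
    if match is None:
        return None
    rest = text[match.end():].strip()
    if not rest:
        return None
    # Close the first clause with a period and capitalize the second clause.
    first = text[:match.start()].strip() + "."
    second = rest[0].upper() + rest[1:]
    return first, second, match.group(1)


example = split_on_connective("The model is small, but it performs well.")
```

Under these assumptions, the pair of split sentences would serve as the input of a fusion example and the original sentence as its target, with the removed connective recording which discourse phenomenon was involved.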
