Discourse Segmentation for Sentence Compression

@inproceedings{MolinaVillegas2011DiscourseSF,
  title={Discourse Segmentation for Sentence Compression},
  author={Alejandro Molina-Villegas and Juan-Manuel Torres-Moreno and Eric SanJuan and Iria da Cunha and Gerardo E Sierra and Patricia Vel{\'a}zquez-Morales},
  booktitle={MICAI},
  year={2011}
}
Earlier studies have raised the possibility of summarizing at the level of the sentence. This simplification should help in adapting textual content to a limited space. Therefore, sentence compression is an important resource for automatic summarization systems. However, few studies consider sentence-level discourse segmentation for the compression task; to our knowledge, none in Spanish. In this paper, we study the relationship between discourse segmentation and compression for…
Discursive Sentence Compression
TLDR
Results show that the degree of disagreement in determining the optimal compressed sentence is high and increases with the complexity of the sentence; however, there is some agreement on the decision to delete discourse segments.
Sentence Compression in Spanish driven by Discourse Segmentation and Language Models
TLDR
This work presents a sentence compression approach guided by sentence-level discourse segmentation and probabilistic language models (LM) that is able to generate coherent summaries with grammatical compressed sentences.
Extending Automatic Discourse Segmentation for Texts in Spanish to Catalan
TLDR
This article presents the first discourse segmenter for texts in Catalan, based on Rhetorical Structure Theory for Spanish, and uses lexical and syntactic information to translate rules valid for Spanish into rules for Catalan.
A Turing Test to Evaluate a Complex Summarization Task
TLDR
A novel imitation game is proposed to evaluate Automatic Summarization by Compression (ASC) using the Turing test, and it is shown that a state-of-the-art ASC system can pass such a test and simulate a human summary in 60% of the cases.
On the Effectiveness of using Sentence Compression Models for Query-Focused Multi-Document Summarization
TLDR
Empirical evaluation on the DUC benchmark datasets demonstrates that the overall summary quality can be improved significantly using global optimization with semantically motivated models.
A feature selection approach for automatic e-book classification based on discourse segmentation
TLDR
A novel feature selection approach for automatic text classification of large digital documents (e-books in an online library system) shows that identifying discourse segments and capturing subtopic features leads to better performance in comparison with two conventional feature selection techniques.
Compresión automática de frases: un estudio hacia la generación de resúmenes en español
TLDR
A linear model is proposed that predicts the removal of intra-sentence segments, with application in summarization; it was trained on 60 thousand decisions to remove or preserve a segment, considering the whole context and the produced summary.
Complex question answering: minimizing the gaps and beyond
TLDR
This work proposed a supervised approach for automatically learning good decompositions of complex questions and presented an integer linear programming formulation in which sentence compression models were applied to the query-focused multi-document summarization task in order to investigate whether sentence compression improves the overall performance.
MultiLingMine 2016: Modeling, Learning and Mining for Cross/Multilinguality
TLDR
The 1st International Workshop on Modeling, Learning and Mining for Cross/Multilinguality (dubbed MultiLingMine 2016) provides a venue to discuss research advances in cross-/multilingual related topics, focusing on new multidisciplinary research questions that have not been deeply investigated so far.
Automatic Discourse Segmentation: an evaluation in French
TLDR
Three discursive segmentation models, based solely on resources simultaneously available in several languages (marker lists and statistical POS labeling), and a manually annotated reference against the Annodis corpus are developed.

References

SHOWING 1-10 OF 34 REFERENCES
Discourse Chunking and its Application to Sentence Compression
TLDR
This paper introduces discourse chunking (i.e., the identification of intra-sentential nucleus and satellite spans) as an alternative to full-scale discourse parsing, exploiting knowledge-lean features and small amounts of discourse annotations.
Learning Recursive Segments for Discourse Parsing
TLDR
This paper presents a simple approach to discourse segmentation that is able to produce nested EDUs and builds on standard multi-class classification techniques combined with a simple repairing heuristic that enforces global coherence.
Methods for Sentence Compression
TLDR
Three papers discussed here take different approaches to identifying important content, determining which sentences are grammatical, and jointly optimizing these objectives, and conclude with ideas for future work in this area.
Summarization beyond sentence extraction: A probabilistic approach to sentence compression
TLDR
This paper focuses on sentence compression, a simpler version of this larger challenge, and aims to achieve two goals simultaneously: the authors' compressions should be grammatical, and they should retain the most important pieces of information.
Sentence Compression for the LSA-based Summarizer
We present a simple sentence compression approach for our summarizer based on latent semantic analysis (LSA). The summarization method assesses each sentence by an LSA score. The compression…
Statistics-Based Summarization - Step One: Sentence Compression
TLDR
This paper focuses on sentence compression, a simpler version of this larger challenge, and aims to achieve two goals simultaneously: the compressions should be grammatical, and they should retain the most important pieces of information.
Discourse Segmentation for Spanish Based on Shallow Parsing
TLDR
DiSeg is presented, the first discourse segmenter for Spanish, which uses the framework of Rhetorical Structure Theory and is based on lexical and syntactic rules, obtaining promising results.
A Syntactic and Lexical-Based Discourse Segmenter
TLDR
This work compares SLSeg to a probabilistic segmenter, showing that a conservative approach increases precision at the expense of recall, while retaining a high F-score across both formal and informal texts.
Automatic Sentence Simplification for Subtitling in Dutch and English
TLDR
Two methods for monolingual sentence length reduction are compared: one based on learning sentence reduction from a parallel corpus and one based on hand-crafted deletion rules.
Modelling Compression with Discourse Constraints
TLDR
A discourse informed model which is capable of producing document compressions that are coherent and informative is presented, inspired by theories of local coherence and formulated within the framework of Integer Linear Programming.