Corpus ID: 53303569

Discourse in Multimedia: A Case Study in Information Extraction

Mrinmaya Sachan, Kumar Avinava Dubey, Eduard H. Hovy, Tom Michael Mitchell, Dan Roth, Eric P. Xing
To ensure readability, text is often written and presented with due formatting. These text formatting devices help the writer to effectively convey the narrative. At the same time, these help the readers pick up the structure of the discourse and comprehend the conveyed information. There have been a number of linguistic theories on discourse structure of text. However, these theories only consider unformatted text. Multimedia text contains rich formatting features which can be leveraged for…
Towards Literate Artificial Intelligence
Standardized tests are used to test students as they progress in the formal education system. These tests are readily available and have clear evaluation procedures. Hence, it has been proposed that…


Integrating Text Formatting and Text Generation
This paper presents a model for representing the architecture of documents for natural language generation from a specialized sublanguage (in the sense of Z. Harris) of natural language, and proposes a brief survey of a system of formatted text generation, based on this model.
Pattern Matching and Discourse Processing in Information Extraction from Japanese Text
A Japanese information extraction system that merges information using a pattern matcher and discourse processor that approaches human performance is reported on.
Discourse indicators for content selection in summarization
The results establish the usefulness of discourse features and find that lexical overlap provides a simple and cheap alternative to discourse for computing text structure with comparable performance for the task of content selection.
Automatic Generation of Formatted Text
This work describes how work on the automated planning of multisentence text and on the display of information in a multimedia system led to the insight that text formatting devices such as footnotes, italicized regions, enumerations, etc., can be planned automatically by a text structure planning process.
Rhetorical relations for information retrieval
A language model modification is presented that considers rhetorical relations when estimating the relevance of a document to a query and shows that certain rhetorical relations can benefit retrieval effectiveness notably.
Discourse segmentation in aid of document summarization
  • B. Boguraev, Mary S. Neff
  • Computer Science
  • Proceedings of the 33rd Annual Hawaii International Conference on System Sciences
  • 2000
Evaluated against the corpus used in the development of the baseline summarizer, summaries derived either by means of segmentation analysis alone, or by a mix of strategies for combining salience calculation and topic shift detection, are shown to be of comparable, and under certain conditions even better quality.
PDFMEF: A Multi-Entity Knowledge Extraction Framework for Scholarly Documents and Semantic Search
We introduce PDFMEF, a multi-entity knowledge extraction framework for scholarly documents in the PDF format. It is implemented with a framework that encapsulates open-source extraction tools.
Discourse processing for context question answering based on linguistic knowledge
Three models driven by Centering Theory for discourse processing are examined: a reference model that resolves pronoun references for each question, a forward model that makes use of the forward looking centers from previous questions, and a transition model that takes into account the transition state between adjacent questions.
Recalling and Summarizing Complex Discourse
In this paper we investigate the properties of complex semantic information processing involved in the comprehension, (re-)production and summarizing of longer narrative discourse. In the theoretical…
Towards Constructive Text, Diagram, and Layout Generation for Information Presentation
It is demonstrated that layout offers a rich resource for achieving presentational coherence, alongside more traditional resources such as text-formatting and the text-internal marking of discourse connections, and an integrated approach to layout, text, and diagram generation is introduced.