Automatic Analysis, Theme Generation, and Summarization of Machine-Readable Texts

@article{Salton1994AutomaticAT,
  title={Automatic Analysis, Theme Generation, and Summarization of Machine-Readable Texts},
  author={Gerard Salton and James Allan and Chris Buckley and Amit Singhal},
  journal={Science},
  year={1994},
  volume={264},
  pages={1421--1426}
}
Vast amounts of text material are now available in machine-readable form for automatic processing. Here, approaches are outlined for manipulating and accessing texts in arbitrary subject areas in accordance with user needs. In particular, methods are given for determining text themes, traversing texts selectively, and extracting summary statements that reflect text content. 
Automatic Text Decomposition and Structuring
A Framework for Text Processing and Supporting Access to Collections of Digitized Historical Newspapers
TLDR
A framework for processing OCR'd text to identify articles and extract their metadata is described, along with visualization and summarization techniques that can be used to present the extracted events.
Generating titles for paragraphs using statistically extracted keywords and phrases
  • D. Gokcay, E. Gokcay
  • Computer Science
    1995 IEEE International Conference on Systems, Man and Cybernetics. Intelligent Systems for the 21st Century
  • 1995
TLDR
A prototype statistical title generation system is developed and tested on social science abstracts and reading comprehension exercises, adopting the statistical keyword-extraction methods used in automatic indexing.
A Robust Practical Text Summarization
TLDR
The SummarizerTool is described, a Java-implemented prototype, and its applications in various document processing tasks, including news reports, government documents, and even court records.
Automatic text decomposition using text segments and text themes
TLDR
The interaction between text segments and text themes is used to characterize text structure, and to formulate specifications for information retrieval, text traversal, and text summarization.
Automated Text Summarization in SUMMARIST
TLDR
The system’s architecture is described and details of some of its modules, many of them trained on large corpora of text, are provided.
Experiences with and Reflections on Text Summarization Tools
  • Shuhua Liu
  • Computer Science
    Int. J. Comput. Intell. Syst.
  • 2009
TLDR
Experience with applying extractive summarization techniques to news articles, economic reports, and nursing narratives is reported, and an analysis of the effect of different summarization methods and parameters on the summarization results is presented.
Machine Learning of Generic and User-Focused Summarization
TLDR
Machine learning is applied to a training corpus of documents and their abstracts to discover salience functions that describe which combination of features is optimal for a given summarization task.
A Text-Extraction Based Summarizer
TLDR
An automated summarizer that can generate both short indicative abstracts, useful for quick scanning of a list of documents, as well as longer informative digests that can serve as surrogates for the full text.

References

Automatic Structuring of Text Files
TLDR
Methods are described in this study for the automatic structuring of heterogeneous text collections, and the construction of browsing tools and access procedures that facilitate collection use.
Global Text Matching for Information Retrieval
TLDR
An approach is outlined for the retrieval of natural language texts in response to available search requests and for the recognition of content similarities between text excerpts that appears to outperform other currently available methods.
Automatic text structuring and retrieval-experiments in automatic encyclopedia searching
TLDR
An alternative text manipulation system is outlined useful for the retrieval of large heterogeneous texts, and for the recognition of content similarities between text excerpts, based on flexible text matching procedures carried out in several contexts of different scope.
Automatic abstracting and indexing—survey and recommendations
TLDR
The relative-frequency approach to measuring the significance of words, word groups, and sentences is discussed in detail, as is its application to problems of automatic indexing and automatic abstracting.
The Automatic Creation of Literature Abstracts
TLDR
In the exploratory research described, the complete text of an article in machine-readable form is scanned by an IBM 704 data-processing machine and analyzed in accordance with a standard program.
Developments in Automatic Text Retrieval
TLDR
The text analysis problem is examined, and modern approaches leading to the identification and retrieval of selected text items in response to search requests are discussed.
Approaches to passage retrieval in full text information systems
TLDR
New approaches are described in this study for implementing selective passage retrieval systems, and identifying text passages responsive to particular user needs.
Russian Experience in Hypertext: Automatic Compiling of Coherent Texts
TLDR
The Russian hypertext systems HYPERLOG, HYPERNET, BAHYS, and SEMPRO are described, along with specific problems of logical and structural analysis that were first advanced by Russian researchers.
Subtopic structuring for full-length document access
TLDR
It is argued that the advent of large volumes of full-length text, as opposed to short texts like abstracts and newswire, should be accompanied by corresponding new approaches to information access and a partition of the text into coherent multi-paragraph units that represent the pattern of subtopics that comprise the text.