WordsEye: an automatic text-to-scene conversion system

@inproceedings{Coyne2001WordsEyeAA,
  title={WordsEye: an automatic text-to-scene conversion system},
  author={Robert Coyne and Richard Sproat},
  booktitle={Proceedings of the 28th annual conference on Computer graphics and interactive techniques},
  year={2001}
}
  • Robert Coyne, Richard Sproat
  • Published 1 August 2001
  • Computer Science
  • Proceedings of the 28th annual conference on Computer graphics and interactive techniques
Natural language is an easy and effective medium for describing visual ideas and mental images. Thus, we foresee the emergence of language-based 3D scene generation systems to let ordinary users quickly create 3D scenes without having to learn special software, acquire artistic skills, or even touch a desktop window-oriented interface. WordsEye is such a system for automatically converting text into representative 3D scenes. WordsEye relies on a large database of 3D models and poses to depict… 

Frame Semantics in Text-to-Scene Generation

TLDR
The WordsEye system has been used by several thousand users on the web to create over 10,000 scenes. This paper describes how the current version of the system incorporates the type of lexical and real-world knowledge needed to depict scenes from language.

AVDT - Automatic Visualization of Descriptive Texts

TLDR
This paper presents a framework that automatically converts an arbitrary descriptive text into a representative 3D scene. It parses a user-written input text, extracts information using techniques from Natural Language Processing (NLP), and identifies relevant units.

Spatial Relations in Text-to-Scene Conversion

TLDR
This work describes how the WordsEye system incorporates geometric and semantic knowledge about objects and their parts and the spatial relations that hold among these in order to depict spatial relations in 3D scenes.

StorVi (Story Visualization): A Text-to-Image Conversion

TLDR
StorVi (Story Visualization) is a text-to-image conversion system that visualizes stories as multi-frame pictures, focusing on fables for children aged 4 to 7.

From visual semantic parameterization to graphic visualization

TLDR
A prototype system called 3DSV (3D Story Visualiser) is presented that generates a virtual scene from simplified story-based descriptions, and the methodology used to parameterize visual and describable words into an XML-formatted data structure is described.

From text to images through meanings

TLDR
An active memory, based on object-oriented programming techniques, is proposed to construct the mental world, or active database. Each word in the active database acts as an agent, in contrast to the purely graphics-based annotation found in existing systems.

Annotation Tools and Knowledge Representation for a Text-To-Scene System

TLDR
The set of primitive graphical frames and the functional properties of 3D objects (affordances) the authors use in this decomposition are described and the methods and tools developed to populate VigNet with a large number of action and location vignettes are examined.

Real-time automatic 3D scene generation from natural language voice and text descriptions

TLDR
This paper presents a newly developed system that generates 3D scenes from voice and text natural language input that supports different quality polygon models such as those widely available on the Internet.

Evaluating a text-to-scene generation system as an aid to literacy

TLDR
Classroom experiments using WordsEye, a system for automatically generating 3D scenes from English textual descriptions, are discussed, in which students using the system had significantly greater improvement in their literary character and story descriptions in pre- and posttest essays compared with a control.

Semantic Parsing for Text to 3D Scene Generation

TLDR
This work presents a system that leverages user interaction with 3D scenes to generate training data for semantic parsing approaches, along with a prototype that incorporates simple spatial knowledge and parses natural text to a semantic representation.
...

References

Put: language-based interactive manipulation of objects

Our approach to scene generation capitalizes on the expressive power of natural language by separating its aptness in specifying spatial relations from the difficulties of understanding text. We are…

Understanding natural language

TLDR
A computer system for understanding English that contains a parser, a recognition grammar of English, programs for semantic analysis, and a general problem solving system based on the belief that in modeling language understanding, it must deal in an integrated way with all of the aspects of language—syntax, semantics, and inference.

Cognitive modeling: knowledge, reasoning and planning for intelligent characters

TLDR
Cognitive modeling applications in advanced character animation and automated cinematography are demonstrated, allowing behaviors to be specified more naturally and intuitively, more succinctly and at a much higher level of abstraction than would otherwise be possible.

WordNet: an electronic lexical database

TLDR
Topics presented include the lexical database (nouns in WordNet, a semantic network of English verbs) and applications of WordNet such as building semantic concordances.

NAtural Language driven Image Generation

TLDR
The experience gained in developing NAtural Language driven Image Generation is discussed, and a theory of equilibrium and support is outlined together with the problem of object positioning.

Head-Driven Statistical Models for Natural Language Parsing

  • M. Collins
  • Computer Science
  • Computational Linguistics
  • 2003
TLDR
Three statistical models for natural language parsing are described, leading to approaches in which a parse tree is represented as the sequence of decisions corresponding to a head-centered, top-down derivation of the tree.

English Verb Classes and Alternations: A Preliminary Investigation

TLDR
Beth Levin shows how identifying verbs with similar syntactic behavior provides an effective means of distinguishing semantically coherent verb classes, and isolates these classes by examining verb behavior with respect to a wide range of syntactic alternations that reflect verb meaning.

A Parameterized Action Representation for Virtual Human Agents

TLDR
A Parameterized Action Representation designed to bridge the gap between natural language instructions and the virtual agents who are to carry them out is described, along with a real-time execution architecture controlling 3D animated virtual human avatars.

A Stochastic Parts Program and Noun Phrase Parser for Unrestricted Text

TLDR
A program that tags each word in an input sentence with the most likely part of speech has been written and performance is encouraging; a 400-word sample is presented and is judged to be 99.5% correct.

Knowledge-Lean Coreference Resolution and its Relation to Textual Cohesion and Coherence

TLDR
COCKTAIL, a high-performance coreference resolution system, is described. It operates on a mixture of heuristics that combine semantic and discourse information, and it shows that referential cohesion can be integrated with lexical cohesion to produce pragmatic knowledge.