A spoken language generation system has been developed that learns to describe objects in computer-generated visual scenes. The system is trained by a 'show-and-tell' procedure in which visual scenes are paired with natural language descriptions. Learning algorithms acquire probabilistic structures which encode the visual semantics of phrase structure, …
centered on a large database, but in this case it is entirely of living organisms, the marine bivalves. Over 28,000 records of bivalve genera and subgenera from 322 locations around the world have now been compiled by these authors, giving a global record of some 854 genera and subgenera and 5132 species. No fossils are included in the database, but …
We report on an audio retrieval system which lets Internet users efficiently access a large audio database containing recordings of the proceedings of the United States House of Representatives. The audio has been temporally aligned to text transcripts of the proceedings (which are manually generated by the U.S. Government) using a novel method based on …
Social interaction will be key to enabling robots and machines in general to learn new tasks from ordinary people (not experts in robotics or machine learning). Everyday people who need to teach their machines new things will find it natural to rely on their interpersonal interaction skills. This thesis provides several contributions towards the …
The NewsComm system delivers personalized news and other program material as audio to mobile users through a hand-held playback device. This paper focuses on the iterative design and user testing of the hand-held interface. The interface was first designed and tested in a software-only environment and then ported to a custom hardware platform. The hand-held …
As a step toward simulating dynamic dialogue between agents and humans in virtual environments, we describe learning a model of social behavior composed of interleaved utterances and physical actions. In our model, utterances are abstracted as {speech act, propositional content, referent} triples. After training a classifier on 100 gameplay logs from The …
We consider wearable computing applications which rely on audio as a primary medium of the interface. This paper surveys a range of interaction techniques which may be applied to the design of wearable audio computers (WACs). Several speech and audio processing technologies which can be used in the interface of WACs are reviewed. We present …
Nomadic Radio provides an audio-only wearable interface to unify remote information services such as email, voice mail, hourly news broadcasts, and personal calendar events. These messages are automatically downloaded to a wearable device throughout the day, and users can browse them using speech recognition and tactile input. To provide an unobtrusive …