Jonathan Kilgour

Multimodal corpora that show humans interacting via language are now relatively easy to collect. Current tools allow one either to apply sets of time-stamped codes to the data and consider their timing and sequencing or to describe some specific linguistic structure that is present in the data, built over the top of some form of transcription. To further…
The NITE Object Model Library is an implemented set of routines for loading, accessing, manipulating, and serializing linguistic data. It is similar in spirit to the data handling provided by the Annotation Graph Toolkit, but is aimed at data that is heavily cross-annotated with structured information, and thus chooses higher expressivity at the cost of…
The NITE XML Toolkit (NXT) is open source software for working with language corpora, with particular strengths for multimodal and heavily cross-annotated data sets. In NXT, annotations are described by types and attribute value pairs, and can relate to signal via start and end times, to representations of the external environment, and to each other via…
The AMIDA Automatic Content Linking Device (ACLD) is a just-in-time document retrieval system for meeting environments. The ACLD listens to a meeting and displays information about the documents from the group’s history that are most relevant to what is being said. Participants can view an outline or the entire content of the documents, if they feel that…
The NITE XML Toolkit (NXT) provides library support for working with multimodal language corpora. We describe work in progress to explore its potential for the AMI project by applying it to the ICSI Meeting Corpus. We discuss converting existing data into the NXT data format; using NXT’s query facility to explore the corpus; hand-annotation and automatic…
This paper describes the Multi-Genre Broadcast (MGB) Challenge at ASRU 2015, an evaluation focused on speech recognition, speaker diarization, and “lightly supervised” alignment of BBC TV recordings. The challenge training data covered the whole range of seven weeks of BBC TV output across four channels, resulting in about 1,600 hours of broadcast audio. In…
It has recently become possible to record any small meeting using a laptop equipped with a plug-and-play USB microphone array. We show the potential for such recordings in a personal aid that allows project managers to record their meetings and, when reviewing them afterwards through a standard calendar interface, to find relevant documents on their…
In this work we describe a large-scale extrinsic evaluation of automatic speech summarization technologies for meeting speech. The particular task is a decision audit, wherein a user must satisfy a complex information need, navigating several meetings in order to gain an understanding of how and why a given decision was made. We compare the usefulness of…
BACKGROUND: Maaori are the Indigenous people of New Zealand and do not enjoy the same oral health status as the non-Indigenous majority. To overcome oral health disparities, the life course approach affords a valid foundation on which to develop a process that will contribute to the protection of the oral health of young infants. The key to this process is…