Joshua Crowgey

We propose to bring together two kinds of linguistic resources, interlinear glossed text (IGT) and a language-independent precision grammar resource, to automatically create precision grammars in the context of language documentation. This paper takes the first steps in that direction by extracting major-constituent word order and case system properties from ...
This paper presents Xigt, an extensible storage format for interlinear glossed text (IGT). We review design desiderata for such a format based on our own use cases as well as general best practices, and then explore existing representations of IGT through the lens of those desiderata. We give an overview of the data model and XML serialization of Xigt, and ...
We present a case study of the methodology of using information extracted from interlinear glossed text (IGT) to create actual working HPSG grammar fragments using the Grammar Matrix, focusing on one language: Chintang. Though the results are barely measurable in terms of coverage over running text, they nonetheless provide a proof of concept. Our ...
The majority of the world’s languages have little to no NLP resources or tools. This is due to a lack of training data (“resources”) over which tools, such as taggers or parsers, can be trained. In recent years, there have been increasing efforts to apply NLP methods to a much broader swath of the world’s languages. In many cases this involves bootstrapping ...
This paper hypothesizes that transfer-based machine translation systems can be improved by encoding information structure in both the source and target grammars, and preserving information structure in the transfer stage. We explore how information structure can be represented within the HPSG/MRS formalism (Pollard and Sag, 1994; Copestake et al., 2005) and ...
While there have been significant improvements in speech and language processing, it remains difficult to bring these new tools to bear on challenges in endangered language documentation. We describe an effort to bridge this gap through Shared Task Evaluation Campaigns (STECs) by designing tasks that are compelling to speech and natural language processing ...
In this paper I explore the logical range of sentential negation types predicted by the theory of HPSG. I find that typological surveys confirm that attested simple negation strategies neatly line up with the types of lexical material given by assuming Lexical Integrity and standard Phrase Structure Grammar dependencies. I then extend the methodology to ...