This paper describes the Buckeye corpus of spontaneous American English speech, a 307,000-word corpus containing the speech of 40 talkers from central Ohio, USA. The method used to elicit and record the speech is described, followed by a description of the protocol that was developed to phonemically label what talkers said. The results of a test of labeling ...
The causes of pronunciation reduction in 8458 occurrences of ten frequent English function words in a four-hour sample from conversations from the Switchboard corpus were examined. Using ordinary linear and logistic regression models, we examined the length of the words, the form of their vowel (basic, full, or reduced), and final obstruent deletion. For ...
We investigate how the probability of a word affects its pronunciation. We examined 5618 tokens of the 10 most frequent (function) words in Switchboard: I, and, the, that, a, you, to, of, it, and in, and 2042 tokens of content words whose lexical form ends in a t or d. Our observations were drawn from the phonetically hand-transcribed subset [1] of the ...
We present a preliminary analysis of transcriber consistency in labeling and segmentation of words and phones in the Buckeye corpus of spontaneous, informal speech. We find that pairwise inter-transcriber agreement on exact phone label match was 76%, and segmentation agreement within 20% of phone pair length was 75%, though longer phones are more ...
0 Introduction Word frequency and word predictability have both been proposed in the literature as explanations for word shortening or reduction. Traditionally, these two explanations have been modeled separately. Frequency models focus on the fact that words with high use frequency are shortened compared to low frequency words, whether in the lexicon (Zipf ...
Two experiments examined 3 variables affecting accuracy, response time, and reports of strategy use in a binary classification skill task. In Experiment 1, higher rule cue salience, allowing faster rule application, produced higher aggregate rule use than lower rule cue salience. After participants were pretrained on the relevant classification rule, rule ...
Using a corpus of Medieval Spanish text, we examine factors affecting the Modern Standard Spanish outcome of the initial /f/ in Latin FV-words. Regression analyses reveal that the frequency of a word's use in extralexical phonetic reducing environments and lexical stress patterns significantly predict the modern distribution of f-([f]) and h-(Ø) in the ...
Two experiments examined English speakers' choices of count or mass compatible frames for nouns varying in imageability (concrete, abstract) and noun class (count, mass). Pairing preferences with equative (much/many) and non-equative (less/fewer) constructions were compared for groups of teenagers, young adults, and older adults. Deviations from normative ...