Considering Performance in the Automated and Manual Coding of Sociolinguistic Variables: Lessons From Variable (ING)

@article{Kendall2021ConsideringPI,
  title={Considering Performance in the Automated and Manual Coding of Sociolinguistic Variables: Lessons From Variable (ING)},
  author={Tyler Kendall and Charlotte Vaughn and Charlie Farrington and Kaylynn Gunter and Jaidan McLean and Chloe Tacata and Shelby Arnson},
  journal={Frontiers in Artificial Intelligence},
  year={2021},
  volume={4}
}
Impressionistic coding of sociolinguistic variables like English (ING), the alternation between pronunciations like talkin' and talking, has been a central part of the analytic workflow in studies of language variation and change for over a half-century. Techniques for automating the measurement and coding for a wide range of sociolinguistic data have been on the rise over recent decades but procedures for coding some features, especially those without clearly defined acoustic correlates like…

References

SHOWING 1-10 OF 91 REFERENCES
Automatic Detection of Sociolinguistic Variation Using Forced Alignment
TLDR
This study expands the functionality of FAVE-align to fully automate the coding of three sociolinguistic variables in British English: (th)-fronting, (td)-deletion, and (h)-dropping.
From categories to gradience: Auto-coding sociophonetic variation with random forests
TLDR
A machine learning method is applied to automate coding of two English sociophonetic variables traditionally treated as categorical, non-prevocalic /r/ and word-medial intervocalic /t/, based on tokens’ acoustic signatures that represents a composite of the acoustic cues fed into the model.
Listener sensitivity to probabilistic conditioning of sociolinguistic variables: The case of (ING)
Abstract: This paper investigates the extent to which listeners are cued into the systematicity of variability in speech, particularly the grammatical conditioning constraints of the English…
Perceptual coding reliability of (L)-vocalization in casual speech data
Abstract: (L)-vocalization has been receiving increasing attention in sociophonetic research but is a challenging variable to measure consistently. Acoustic measures are not typically used because…
Automatic detection of “g-dropping” in American English using forced alignment
TLDR
This study investigated the use of forced alignment for automatic detection of “g-dropping” in American English (e.g., walkin'). Two acoustic models were trained, one for -in' and the other for -ing, and it was found that native Mandarin speakers performed poorly on classification of “g-dropping”.
Toward completely automated vowel extraction: Introducing DARLA
TLDR
A fully automated program called DARLA is introduced, which automatically generates transcriptions with ASR and extracts vowels using FAVE; it is tested on a dataset of the US Southern Shift, and the results are compared with those of semi-automated methods.
(ING): A VERNACULAR BASELINE FOR ENGLISH IN APPALACHIA
In order to provide a contemporary description of Appalachian English, this article investigates the (ING) variable in Appalachian speech, explaining both the linguistic and social constraints on…
Acoustic reduction in conversational Dutch: A quantitative analysis based on automatically generated segmental transcriptions
TLDR
Overall, it is found that reduction is more pervasive in spontaneous Dutch than previously documented.
Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Se
TLDR
Several parametric representations of the acoustic signal were compared with regard to word recognition performance in a syllable-oriented continuous speech recognition system. The superior performance of the mel-frequency cepstrum coefficients may be attributed to the fact that they better represent the perceptually relevant aspects of the short-term speech spectrum.
Phonetic Transcriptions of Large Speech Corpora
TLDR
It can be concluded that good-quality broad phonetic transcription for read speech can be obtained fully automatically by using relatively simple techniques, by omitting human-made transcriptions of read speech.