Fine-Grained Classroom Activity Detection from Audio with Neural Networks

Eric Slyman, Chris Daw, Morgan Skrabut, Ana Usenko, Brian Hutchinson

Instructors are increasingly incorporating student-centered learning techniques in their classrooms to improve learning outcomes. In addition to lecture, these class sessions involve forms of individual and group work, and greater rates of student-instructor interaction. Quantifying classroom activity is a key element of accelerating the evaluation and refinement of innovative teaching practices, but manual annotation does not scale. In this manuscript, we present advances to the young…

Deep Learning for Classroom Activity Detection from Audio
This work introduces a set of deep learning classifiers for automatic activity annotation, evaluates them on a collection of classroom recordings, and shows that their estimates of how much classroom time is spent per task are better correlated with actual time spent than those of existing systems.
Siamese Neural Networks for Class Activity Detection
This work proposes a Siamese neural framework to automatically identify teacher and student utterances from classroom recordings and demonstrates that the approach is superior on the prediction tasks for both online and offline classroom environments.
Multimodal Learning for Classroom Activity Detection
  • Hang Li, Yunxing Kang, +4 authors Zitao Liu
  • Computer Science, Engineering
  • ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2020
The experimental results demonstrate the benefits of the approach for learning attention-based neural networks from classroom data with different modalities, and show that the approach outperforms state-of-the-art baselines on various evaluation metrics.
Automatic Teacher Modeling from Live Classroom Audio
Automatic analysis of teachers' instructional strategies from audio recordings collected in live classrooms is investigated and key findings in the context of teacher modeling for formative assessment and professional development are discussed.
Neural Multi-task Learning for Teacher Question Detection in Online Classrooms
An end-to-end neural framework that automatically detects questions from teachers’ audio recordings and strengthens the understanding of semantic relations among different types of questions by incorporating multi-task learning techniques.
Words matter: automatic detection of teacher questions in live classroom discourse using linguistics, acoustics, and context
A comparison of the three feature sets indicates that a model using linguistic features outperforms those using acoustic-prosodic and context features for question detection, but the combination of features yields a 5% improvement in overall accuracy compared to linguistic features alone.
Argument Component Classification for Classroom Discussions
This paper shows that an existing method for argument component classification developed for another educationally-oriented domain performs poorly on the authors' dataset, and that feature sets from prior work on argument mining for student essays and online dialogues can be used to improve performance considerably.
Multi-Task Self-Supervised Learning for Robust Speech Recognition
PASE+ is proposed, an improved version of PASE that better learns short- and long-term speech dynamics with an efficient combination of recurrent and convolutional networks and learns transferable representations suitable for highly mismatched acoustic conditions.
Learning Problem-agnostic Speech Representations from Multiple Self-supervised Tasks
Experiments show that the proposed improved self-supervised method can learn transferable, robust, and problem-agnostic features that carry relevant information from the speech signal, such as speaker identity, phonemes, and even higher-level features such as emotional cues.
Learning Sound Event Classifiers from Web Audio with Noisy Labels
Experiments suggest that training with large amounts of noisy data can outperform training with smaller amounts of carefully labeled data, and show that noise-robust loss functions can be effective in improving performance in the presence of corrupted labels.