Weakly-supervised Fingerspelling Recognition in British Sign Language Videos
@inproceedings{KR2022WeaklysupervisedFR,
  title={Weakly-supervised Fingerspelling Recognition in British Sign Language Videos},
  author={Prajwal K R and Hannah Bull and Liliane Momeni and Samuel Albanie and G{\"u}l Varol and Andrew Zisserman},
  booktitle={British Machine Vision Conference},
  year={2022}
}
The goal of this work is to detect and recognize sequences of letters signed using fingerspelling in British Sign Language (BSL). Previous fingerspelling recognition methods have not focused on BSL, which has a very different signing alphabet (e.g., two-handed instead of one-handed) to American Sign Language (ASL). They also use manual annotations for training. In contrast to previous methods, our method only uses weak annotations from subtitles for training. We localize potential instances of…
References
American Sign Language Fingerspelling Recognition in the Wild
- Computer Science, 2018 IEEE Spoken Language Technology Workshop (SLT)
- 2018
This work introduces the largest data set available so far for the problem of fingerspelling recognition, and the first using naturally occurring video data, and presents the first attempt to recognize fingerspelling sequences in this challenging setting.
Lexicon-free fingerspelling recognition from video: Data, models, and signer adaptation
- Computer Science, Comput. Speech Lang.
- 2017
Searching for fingerspelled content in American Sign Language
- Computer Science, ACL
- 2022
This paper proposes an end-to-end model for this task, FSS-Net, that jointly detects fingerspelling and matches it to a text sequence and significantly outperforms baseline methods adapted from prior work on related tasks.
Fingerspelling Recognition in the Wild With Iterative Visual Attention
- Computer Science, 2019 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2019
This work proposes an end-to-end model based on an iterative attention mechanism, without explicit hand detection or segmentation, that outperforms prior work by a large margin on recognition of fingerspelling sequences in ASL videos collected in the wild.
Multitask training with unlabeled data for end-to-end sign language fingerspelling recognition
- Computer Science, 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)
- 2017
A model for fingerspelling recognition that consists of an auto-encoder-based feature extractor and an attention-based neural encoder-decoder, which are trained jointly, achieves 11.6% and 4.4% absolute letter accuracy improvement over previous approaches that required frame-level training labels.
BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues
- Computer Science, ECCV
- 2020
A new scalable approach to data collection for sign recognition in continuous videos is introduced, and it is shown that BSL-1K can be used to train strong sign recognition models for co-articulated signs in BSL and that these models additionally form excellent pretraining for other sign languages and benchmarks.
Sign Language Fingerspelling Recognition using Synthetic Data
- Computer Science, AICS
- 2021
This model is based on a pretrained convolutional network, fine-tuned using synthetic images, and tested using a corpus dataset of real recordings of native signers, achieving an accuracy of 71% recognition.
Learning sign language by watching TV (using weakly aligned subtitles)
- Computer Science, 2009 IEEE Conference on Computer Vision and Pattern Recognition
- 2009
This work proposes a distance function to match signing sequences that includes the trajectory of both hands and the hand shape and orientation, and that properly models the case of hands touching. It shows that, by optimizing a scoring function based on multiple instance learning, it is able to extract the sign of interest from hours of signing footage, despite the very weak and noisy supervision.
Automatic recognition of fingerspelled words in British Sign Language
- Computer Science, 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops
- 2009
This work investigates the problem of recognizing words from video, fingerspelled using the British Sign Language (BSL) fingerspelling alphabet, and achieves a word recognition accuracy of 98.9% on a dataset of 1,000 low-quality webcam videos of 100 words.
Aligning Subtitles in Sign Language Videos
- Computer Science, 2021 IEEE/CVF International Conference on Computer Vision (ICCV)
- 2021
This work proposes a Transformer architecture tailored for this task, which trains on manually annotated alignments covering over 15K subtitles that span 17.7 hours of video, and opens up possibilities for advancing machine translation of sign languages via providing continuously synchronized video-text data.