Sign Segmentation with Changepoint-Modulated Pseudo-Labelling

@article{Renz2021SignSW,
  title={Sign Segmentation with Changepoint-Modulated Pseudo-Labelling},
  author={Katrin Renz and N. Stache and Neil Fox and G{\"u}l Varol and Samuel Albanie},
  journal={2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)},
  year={2021},
  pages={3398--3407}
}
The objective of this work is to find temporal boundaries between signs in continuous sign language. Motivated by the paucity of annotation available for this task, we propose a simple yet effective algorithm to improve segmentation performance on unlabelled signing footage from a domain of interest. We make the following contributions: (1) We motivate and introduce the task of source-free domain adaptation for sign language segmentation, in which labelled source data is available for an…

References

BSL-1K: Scaling up co-articulated sign language recognition using mouthing cues
TLDR
A new scalable approach to data collection for sign recognition in continuous videos is introduced, and it is shown that BSL-1K can be used to train strong sign recognition models for co-articulated signs in BSL and that these models additionally form excellent pretraining for other sign languages and benchmarks.
Action Segmentation With Joint Self-Supervised Temporal Domain Adaptation
TLDR
Self-Supervised Temporal Domain Adaptation (SSTDA) contains two self-supervised auxiliary tasks (binary and sequential domain prediction) that jointly align cross-domain feature spaces embedded with local and global temporal dynamics, achieving better performance than other domain adaptation (DA) approaches.
Automatic Segmentation of Sign Language into Subtitle-Units
TLDR
A corpus of natural sign language video with accurately aligned subtitles is used to train a spatio-temporal graph convolutional network with a BiLSTM on 2D skeleton data. The network automatically detects the temporal boundaries of subtitles, segmenting sign language video into subtitle-units that can be translated into phrases in a written language.
Unsupervised Domain Adaptation for Semantic Segmentation via Class-Balanced Self-training
TLDR
This paper proposes a novel UDA framework based on an iterative self-training (ST) procedure, in which the problem is formulated as latent variable loss minimization and solved by alternately generating pseudo labels on target data and re-training the model with these labels.
Learning Motion Disfluencies for Automatic Sign Language Segmentation
  • Iva Farag, H. Brock
  • Computer Science
  • ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2019
TLDR
A novel technique for the automatic detection of word boundaries within continuous sentence expressions in Japanese Sign Language from three-dimensional body joint positions is introduced that can easily be adapted to distinguish between motion transitions and motion primitives for a coarse-action domain.
Automatic sign segmentation from continuous signing via multiple sequence alignment
TLDR
An unsupervised, multiple-alignment-based approach for sign segmentation is presented. Low-level shape descriptors are found to be suitable for the alignment task, and the highest accuracy is obtained by modeling the signs with HMMs using the intervals previously found by DTW.
MS-TCN: Multi-Stage Temporal Convolutional Network for Action Segmentation
TLDR
A multi-stage architecture for the temporal action segmentation task that achieves state-of-the-art results on three challenging datasets: 50Salads, Georgia Tech Egocentric Activities (GTEA), and the Breakfast dataset.
Re-Sign: Re-Aligned End-to-End Sequence Modelling with Deep Recurrent CNN-HMMs
TLDR
This work proposes an algorithm that treats the provided training labels as weak labels and refines the label-to-image alignment on-the-fly in a weakly supervised fashion. Embedded into an HMM, the resulting deep model continuously improves its performance over several re-alignments.
Transferring Cross-Domain Knowledge for Video Sign Language Recognition
TLDR
A novel method is proposed that learns domain-invariant visual concepts and fertilizes WSLR models by transferring knowledge of subtitled news sign to them, and outperforms previous state-of-the-art methods significantly.
Sign Language Detection “in the Wild” with Recurrent Neural Networks
  • M. Borg, K. Camilleri
  • Computer Science
  • ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2019
TLDR
A multi-layer RNN for sign language detection using features extracted automatically from a 2-stream convolutional neural network (CNN) that takes video image data and motion data as input achieves an improvement of around 18%, indicating that the network is able to leverage dynamic information of hand motion during detection.