Corpus ID: 243832676

BBC-Oxford British Sign Language Dataset

Samuel Albanie, Gül Varol, Liliane Momeni, Hannah Bull, Triantafyllos Afouras, Himel Chowdhury, Neil Fox, Bencie Woll, Robert J. Cooper, Andrew McParland, Andrew Zisserman
In this work, we introduce the BBC-Oxford British Sign Language (BOBSL) dataset, a large-scale video collection of British Sign Language (BSL). BOBSL is an extended and publicly released dataset based on the BSL-1K dataset [1] introduced in previous work. We describe the motivation for the dataset, together with statistics and available annotations. We conduct experiments to provide baselines for the tasks of sign recognition, sign language alignment, and sign language translation. Finally, we… 


BosphorusSign22k Sign Language Recognition Dataset
The primary objective of this dataset is to serve as a new benchmark in Turkish Sign Language Recognition for its vast lexicon, the high number of repetitions by native signers, high recording quality, and the unique syntactic properties of the signs it encompasses.
MS-ASL: A Large-Scale Data Set and Benchmark for Understanding American Sign Language
This work proposes the first real-life large-scale sign language dataset, comprising over 25,000 annotated videos, and thoroughly evaluates it with state-of-the-art methods from sign and related action recognition, outperforming the current state of the art by a large margin.
SMILE Swiss German Sign Language Dataset
A large-scale dataset containing videotaped repeated productions of the 100 items of the vocabulary test with associated transcriptions and annotations was created, consisting of data from 11 adult L1 signers and 19 adult L2 learners of DSGS, which is made available to the research community.
Neural Sign Language Translation
This work formalizes SLT in the framework of Neural Machine Translation (NMT) for both end-to-end and pretrained settings (using expert knowledge), allowing the spatial representations, the underlying language model, and the mapping between sign and spoken language to be learned jointly.
INCLUDE: A Large Scale Dataset for Indian Sign Language Recognition
This work presents the Indian Lexicon Sign Language Dataset - INCLUDE - an ISL dataset that contains 0.27 million frames across 4,287 videos over 263 word signs from 15 different word categories and evaluates several deep neural networks combining different methods for augmentation, feature extraction, encoding and decoding.
Building the British Sign Language Corpus
The first endeavor ever to create a machine-readable digital corpus of British Sign Language (BSL) collected from deaf signers across the United Kingdom is presented, which represents a unique combination of methodology from variationist sociolinguistics and corpus linguistics.
Read and Attend: Temporal Localisation in Sign Language Videos
A Transformer model is trained to ingest a continuous signing stream and output a sequence of written tokens on a large-scale collection of signing footage with weakly-aligned subtitles, and it is shown that through this training it acquires the ability to attend to a large vocabulary of sign instances in the input sequence, enabling their localisation.
Improving Sign Language Translation with Monolingual Data by Sign Back-Translation
The proposed sign back-translation (SignBT) approach incorporates massive spoken language texts into SLT training and obtains a substantial improvement over previous state-of-the-art SLT methods.
The American Sign Language Lexicon Video Dataset
The ASL lexicon video dataset is introduced, a large and expanding public dataset containing video sequences of thousands of distinct ASL signs, as well as annotations of those sequences, including start/end frames and class label of every sign.
Word-level Deep Sign Language Recognition from Video: A New Large-scale Dataset and Methods Comparison
This paper introduces a new large-scale Word-Level American Sign Language (WLASL) video dataset, containing more than 2,000 words performed by over 100 signers, and proposes a novel pose-based temporal graph convolution network (Pose-TGCN) that models spatial and temporal dependencies in human pose trajectories simultaneously, further boosting the performance of the pose-based method.