AP19-OLR Challenge: Three Tasks and Their Baselines

@article{Tang2019AP19OLRCT,
  title={AP19-OLR Challenge: Three Tasks and Their Baselines},
  author={Zhiyuan Tang and Dong Wang and Liming Song},
  journal={2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)},
  year={2019},
  pages={1917--1921}
}
  • Zhiyuan Tang, D. Wang, Liming Song
  • Published 2019
  • Computer Science, Engineering
  • 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
This paper introduces the fourth oriental language recognition (OLR) challenge, AP19-OLR, including the data profile, the tasks and the evaluation principles. The OLR challenge has been held successfully for three consecutive years, along with the APSIPA Annual Summit and Conference (APSIPA ASC). This year's challenge still focuses on practical and challenging tasks, namely (1) short-utterance LID, (2) cross-channel LID and (3) zero-resource LID. The event this year includes more languages and…
AP20-OLR Challenge: Three Tasks and Their Baselines
  • Z. Li, Miao Zhao, +5 authors Cheng Yang
  • Computer Science
  • 2020 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
  • 2020
TLDR
The fifth oriental language recognition (OLR) challenge AP20-OLR is introduced, which intends to improve the performance of language recognition systems, along with APSIPA Annual Summit and Conference (APSIPA ASC).
AP18-OLR Challenge: Three Tasks and Their Baselines
  • Zhiyuan Tang, D. Wang, Q. Chen
  • Computer Science, Engineering
  • 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
  • 2018
TLDR
The third oriental language recognition (OLR) challenge AP18-OLR is introduced in this paper, including the data profile, the tasks and the evaluation principles, and it is demonstrated that the three tasks are truly challenging.
OLR 2021 Challenge: Datasets, Rules and Baselines
  • Binling Wang, Wenxuan Hu, +7 authors Cheng Yang
  • Computer Science, Engineering
  • ArXiv
  • 2021
TLDR
The sixth Oriental Language Recognition 2021 Challenge, which intends to improve the performance of language recognition systems and speech recognition systems within multilingual scenarios, is introduced, along with the data profile, four tasks, two baselines, and the evaluation principles.
The XMUSPEECH System for the AP19-OLR Challenge
TLDR
This paper presents the XMUSPEECH system, a system for the oriental language recognition (OLR) challenge, AP19-OLR, which leveraged the system pipeline from three aspects, including front-end training, back-end processing, and fusion strategy.
OLR 2021 Challenge: Datasets, Rules and Baselines
This paper introduces the sixth Oriental Language Recognition (OLR) 2021 Challenge, which intends to improve the performance of language recognition systems and speech recognition systems within…
Oriental Language Recognition (OLR) 2020: Summary and Analysis
  • Jing Li, Binling Wang, +4 authors Dong Wang
  • Computer Science, Engineering
  • Interspeech 2021
  • 2021
TLDR
This paper describes the three tasks, the database profile, and the final results, and outlines the novel approaches that improve the performance of language recognition systems most significantly, such as the utilization of auxiliary information.
Unsupervised Neural Adaptation Model Based on Optimal Transport for Spoken Language Identification
  • Xugang Lu, Peng Shen, Yu Tsao, H. Kawai
  • Computer Science, Engineering
  • ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2021
TLDR
An unsupervised neural adaptation model is proposed to deal with the distribution mismatch problem for SLID and a Wasserstein distance metric is designed in the adaptation loss to reduce the distribution discrepancy on both feature and classifier for training and testing data sets.
Releasing a Toolkit and Comparing the Performance of Language Embeddings Across Various Spoken Language Identification Datasets
TLDR
A software toolkit for easier end-to-end training of deep learning based spoken language identification models across several speech datasets is proposed and it is discovered that increasing x-vector model robustness with random frequency channel dropout significantly reduces its end-to-end classification performance on the test set, while not affecting back-end classification performance of its embeddings.
Spoken Language Identification under Emotional Speech Variations
Identifying language information from speech utterance is referred to as spoken language identification. Language Identification (LID) is essential in multilingual speech systems. There are various…
Additive Phoneme-aware Margin Softmax Loss for Language Recognition
  • Zheng Li, Yan Liu, Lin Li, Q. Hong
  • Computer Science, Engineering
  • Interspeech 2021
  • 2021
TLDR
An APM-Softmax loss for language recognition with phonetic multi-task learning is proposed, in which the additive phoneme-aware margin is automatically tuned for different training samples, and the margin of language recognition is adjusted according to the results of phoneme recognition.

References

SHOWING 1-10 OF 23 REFERENCES
AP18-OLR Challenge: Three Tasks and Their Baselines
  • Zhiyuan Tang, D. Wang, Q. Chen
  • Computer Science, Engineering
  • 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
  • 2018
TLDR
The third oriental language recognition (OLR) challenge AP18-OLR is introduced in this paper, including the data profile, the tasks and the evaluation principles, and it is demonstrated that the three tasks are truly challenging.
AP17-OLR challenge: Data, plan, and baseline
TLDR
The baseline results are evaluated with various metrics defined by the AP17-OLR evaluation plan and it is demonstrated that the combined database is a reasonable data resource for multilingual research.
AP16-OL7: A multilingual database for oriental languages and a language recognition baseline
TLDR
It is demonstrated that AP16-OL7 is a reasonable data resource for multilingual research and a baseline system was constructed on the basis of the i-vector model to evaluate the baseline results.
M2ASR: Ambitions and first year progress
  • Dong Wang, T. Zheng, +9 authors Gulnigar Mahmut
  • Computer Science, Political Science
  • 2017 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment (O-COCOSDA)
  • 2017
TLDR
This project intends to publish all the achievements and make them free for the research community, including speech and text corpora, phone sets, lexicons, tools, recipes and prototype systems, and present the future plan.
State-of-the-Art Speaker Recognition for Telephone and Video Speech: The JHU-MIT Submission for NIST SRE18
TLDR
Very deep x-vector architectures (Extended and Factorized TDNN, and ResNets) clearly outperformed shallower x-vectors and i-vectors in NIST SRE18, and the Extended TDNN x-vector was the best single system.
Transfer learning for speech and language processing
  • Dong Wang, T. Zheng
  • Computer Science
  • 2015 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA)
  • 2015
TLDR
This review paper summarizes some recent prominent research towards transfer learning, particularly for speech and language processing, and highlights the potential of this very interesting research field.
X-Vectors: Robust DNN Embeddings for Speaker Recognition
TLDR
This paper uses data augmentation, consisting of added noise and reverberation, as an inexpensive method to multiply the amount of training data and improve robustness of deep neural network embeddings for speaker recognition.
Spoken Language Recognition using X-vectors
TLDR
This paper applies the x-vector framework to spoken language recognition. The framework consists of a deep neural network that maps sequences of speech features to fixed-dimensional embeddings, called x-vectors, and the best performing system uses multilingual bottleneck features, data augmentation, and a discriminative Gaussian classifier.
The MGB-5 Challenge: Recognition and Dialect Identification of Dialectal Arabic Speech
This paper describes the fifth edition of the Multi-Genre Broadcast Challenge (MGB-5), an evaluation focused on Arabic speech recognition and dialect identification. MGB-5 extends the previous MGB-3…
THCHS-30 : A Free Chinese Speech Corpus
TLDR
This paper releases a free Chinese speech database THCHS-30 that can be used to build a full-fledged Chinese speech recognition system, and reports the baseline system established with this database, including the performance under highly noisy conditions.