Music type classification by spectral contrast feature
- Dan-Ning Jiang, Lie Lu, HongJiang Zhang, J. Tao, Lianhong Cai
- Physics · Proceedings. IEEE International Conference on…
- 7 November 2002
The octave-based spectral contrast feature is proposed to represent the spectral characteristics of a music clip; it represents the relative spectral distribution instead of the average spectral envelope.
Prosody conversion from neutral speech to emotional speech
- J. Tao, Yongguo Kang, Ai-jun Li
- Computer Science · IEEE Transactions on Audio, Speech, and Language…
- 1 July 2006
The results support the use of text with neutral semantic content in databases for emotional speech synthesis, with emotions classified as "strong", "medium", and "weak".
CHEAVD: a Chinese natural emotional audio–visual database
- Ya Li, J. Tao, Linlin Chao, Wei Bao, Yazhu Liu
- Computer Science · J. Ambient Intell. Humaniz. Comput.
- 1 November 2017
This database is the first large-scale Chinese natural emotion corpus dealing with multimodal and natural emotion, and it is free for research use. Automatic emotion recognition with Long Short-Term Memory Recurrent Neural Networks (LSTM-RNN) is performed on this corpus.
Affective Computing: A Review
The paper emphasizes several issues involved implicitly in the whole interactive feedback loop, and discusses various methods for each issue in order to examine the state of the art.
Micro-Expression Recognition Using Color Spaces
- Sujing Wang, Wen-Jing Yan, J. Tao
- Computer Science · IEEE Transactions on Image Processing
- 30 October 2015
This paper proposes a novel color space model, tensor independent color space (TICS), to help recognize micro-expressions. It also defines a set of regions of interest (ROIs) based on the facial action coding system and calculates the dynamic texture histograms for each ROI.
Design of Speech Corpus for Mandarin Text to Speech
The CASIA Mandarin corpus, designed for Mandarin speech synthesis research, was carefully recorded by a professional female speaker under studio conditions and was delivered to Blizzard Challenge 2008 as the common corpus for Mandarin speech synthesis evaluation among all participants.
Long Short Term Memory Recurrent Neural Network based Multimodal Dimensional Emotion Recognition
- Linlin Chao, J. Tao, Minghao Yang, Ya Li, Zhengqi Wen
- Computer Science · AVEC@ACM Multimedia
- 26 October 2015
This paper presents work for the Audio/Visual+ Emotion Challenge (AV+EC2015), whose goal is to predict the continuous values of the emotion dimensions arousal and valence from audio, visual, and physiological modalities, and investigates two techniques for the dimensional emotion recognition problem.
Multimodal Transformer Fusion for Continuous Emotion Recognition
- Jian Huang, J. Tao, B. Liu, Zheng Lian, Mingyue Niu
- Computer Science · ICASSP - IEEE International Conference on…
- 1 May 2020
The Transformer model is used to fuse audio-visual modalities at the model level to improve the performance of emotion recognition, and the superiority of model-level fusion over other fusion strategies is shown.
Reconstruction of Partially Occluded Face by Fast Recursive PCA
This paper proposes a fast recursive PCA (principal component analysis) algorithm to remove face occlusions. In the training phase, all faces are normalized by two eye centers and two mouth corners, and…
End-to-end keywords spotting based on connectionist temporal classification for Mandarin
- Ye Bai, Jiangyan Yi, J. Tao
- Computer Science · 10th International Symposium on Chinese Spoken…
- 14 October 2016
An end-to-end, acoustic-model-based ASR system for keyword spotting in Mandarin, built with an LSTM-RNN and trained with the connectionist temporal classification (CTC) objective, achieves a significant improvement in ATWV.