Design and Implementation of Fast Spoken Foul Language Recognition with Different End-to-End Deep Neural Network Architectures

  • Abdulaziz Saleh Ba Wazir, Hezerul Abdul Karim, Mohd Haris Lye Abdullah, Nouar Aldahoul, Sarina Mansor, Mohammad Faizal Ahmad Fauzi, John See, Ahmad Syazwan Naim
  • Sensors (Basel, Switzerland)
Given the prevalence of foul language in audio and video files and its detrimental consequences for an individual's character and behaviour, content censorship is crucial to shield young viewers, who face high exposure to uncensored content, from profanities. Although manual detection and censorship have been practised, these methods proved tedious. Misidentifications of foul language are inevitable, owing to human weariness and the limited performance of the human visual system over long…
The Sustainable Development of Intangible Cultural Heritage with AI: Cantonese Opera Singing Genre Classification Based on CoGCNet Model in China
A classification method based on the Cantonese opera Genre Classification Networks (CoGCNet) model achieves high classification accuracy, with overall performance better than that of commonly used neural network models, and provides a new, feasible approach to the sustainable study of the singing characteristics of Cantonese opera genres.
Design and Implementation of Online Intelligent Mental Health Testing Platform
It is shown that a mental health intelligent evaluation system based on the decision tree algorithm can effectively solve this problem and improve the accuracy of intelligent mental health evaluation.


Spectrogram-Based Classification Of Spoken Foul Language Using Deep CNN
  • A. Wazir, H. A. Karim, John See
  • Computer Science
    2020 IEEE 22nd International Workshop on Multimedia Signal Processing (MMSP)
  • 2020
An intelligent model for foul language censorship through automated and robust detection by deep convolutional neural networks (CNNs) is proposed, and it outperforms other models in terms of accuracy, sensitivity, specificity, and F1-score.
End-to-end attention-based large vocabulary speech recognition
This work investigates an alternative method for sequence modelling based on an attention mechanism that allows a Recurrent Neural Network (RNN) to learn alignments between sequences of input frames and output labels.
Acoustic Pornography Recognition Using Recurrent Neural Network
The experimental results confirm the feasibility of the proposed acoustic-driven approach by demonstrating an accuracy of 86.50% and an F-score of 86% in the task of pornography recognition.
Spoken Arabic Digits Recognition Using Deep Learning
This research proposes an Arabic spoken-digit recognition model utilizing a Recurrent Neural Network (RNN); the highest accuracy, 80%, is achieved when recognizing the digit zero.
Deep Speech 2 : End-to-End Speech Recognition in English and Mandarin
It is shown that an end-to-end deep learning approach can be used to recognize either English or Mandarin Chinese speech, two vastly different languages, and is competitive with the transcription of human workers when benchmarked on standard datasets.
Deep-Net: A Lightweight CNN-Based Speech Emotion Recognition System Using Deep Frequency Features
A new lightweight and effective speech emotion recognition model with low computational complexity and high recognition accuracy is proposed, achieving better recognition performance than state-of-the-art SER systems.
Robust sound event recognition using convolutional neural networks
This work proposes novel features derived from spectrogram energy triggering, allied with the powerful classification capabilities of a convolutional neural network (CNN), which demonstrates excellent performance under noise-corrupted conditions when compared against state-of-the-art approaches on standard evaluation tasks.
Very deep convolutional networks for end-to-end speech recognition
This work successively trains very deep convolutional networks to add more expressive power and better generalization to end-to-end ASR models, applying network-in-network principles, batch normalization, residual connections, and convolutional LSTMs to build very deep recurrent and convolutional structures.
State-of-the-Art Speech Recognition with Sequence-to-Sequence Models
  • C. Chiu, T. Sainath, M. Bacchiani
  • Computer Science
    2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
  • 2018
A variety of structural and optimization improvements to the Listen, Attend, and Spell model are explored, which significantly improve performance, and a multi-head attention architecture is introduced, which offers improvements over the commonly used single-head attention.