
Non-linguistic Vocalization Recognition Based on Convolutional, Long Short-term Memory, Deep Neural Networks
Title:
Non-linguistic Vocalization Recognition Based on Convolutional, Long Short-term Memory, Deep Neural Networks
Author:
Qiu, Liang, author.
ISBN:
9780438068957
Physical Description:
1 electronic resource (46 pages)
General Note:
Source: Masters Abstracts International, Volume: 57-06M(E).
Advisors: Lei He. Committee members: Abeer A H Alwan; Song-Chun Zhu.
Abstract:
Non-linguistic vocalization recognition refers to the detection and classification of non-speech vocal sounds such as laughter, sneezing, coughing, crying, and screaming. It can be seen as a subtask of Acoustic Event Detection (AED). Previous research has made great progress in increasing the accuracy of AED. On the front end, many kinds of features have been explored, including Mel-Frequency Cepstral Coefficients (MFCCs), Gammatone Cepstral Coefficients (GTCCs), and other hand-crafted features. On the back end, models and methods such as Gaussian Mixture Models (GMMs), Hidden Markov Models (HMMs), Bags-of-Audio-Words (BoAW), Support Vector Machines (SVMs), and various types of neural networks have been tried.
Recent research on Automatic Speech Recognition (ASR) and Acoustic Scene Classification (ASC) shows the advantage of using Convolutional, Long Short-Term Memory, Deep Neural Networks (CLDNNs) for audio processing tasks. In this thesis, I build a non-linguistic vocalization recognition system using CLDNNs. Log Mel-filterbank coefficients are adopted as input features, and data augmentation methods such as random shifting and noise mixing are discussed. The system is evaluated on a custom dataset collected from several sources and tested for real-time application. The performance of CLDNNs for non-linguistic vocalization recognition is also compared with that of hybrid GMM-SVMs, Convolutional Neural Networks, Long Short-Term Memory networks, and a fully connected Deep Neural Network trained on VGGish embeddings.
The results indicate that CLDNNs outperform the other models in classification precision and recall. Visualizations of the CLDNNs are presented to help understand the framework. The model proves accurate and fast enough for real-time applications.
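The two augmentation methods named in the abstract, random shifting and noise mixing, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `random_shift` and `mix_noise`, the circular (wrap-around) form of the shift, and the signal-to-noise-ratio parameter `snr_db` are all assumptions made for illustration, and the record does not say how the author implemented either step.

```python
import math
import random


def random_shift(samples, max_shift, rng):
    """Circularly shift a waveform by a random offset in [-max_shift, max_shift].

    Assumption: wrap-around shifting; zero-padding the gap is an equally
    plausible reading of "random shifting".
    """
    if not samples:
        return []
    offset = rng.randint(-max_shift, max_shift) % len(samples)
    if offset == 0:
        return list(samples)
    # Rotate right by `offset`; samples pushed off the end wrap to the front.
    return samples[-offset:] + samples[:-offset]


def mix_noise(samples, noise, snr_db):
    """Add `noise` to `samples`, scaled to a target signal-to-noise ratio in dB."""
    if not samples or len(samples) != len(noise):
        raise ValueError("samples and noise must be non-empty and equal in length")
    signal_power = sum(x * x for x in samples) / len(samples)
    noise_power = sum(n * n for n in noise) / len(noise)
    if noise_power == 0:
        return list(samples)
    # Solve SNR_dB = 10 * log10(signal_power / (scale**2 * noise_power)) for scale.
    scale = math.sqrt(signal_power / (noise_power * 10 ** (snr_db / 10)))
    return [x + scale * n for x, n in zip(samples, noise)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical one-second 440 Hz tone at 16 kHz standing in for a recording.
    clip = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]
    augmented = mix_noise(random_shift(clip, 1600, rng), noise, snr_db=10.0)
    print(len(augmented))
```

Both functions return new lists and leave their inputs unchanged, so one source clip can be augmented repeatedly with different offsets and noise levels.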
Local Note:
School code: 0031
Available:*
| Shelf Number | Item Barcode | Shelf Location | Status |
|---|---|---|---|
| XX(694939.1) | 694939-1001 | Proquest E-Thesis Collection | Searching... |


