International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 183 - Issue 25
Published: Sep 2021
Authors: Atharva Bankar, Aryan Gandhi, Dipali Baviskar
Atharva Bankar, Aryan Gandhi, Dipali Baviskar. Image and Signal Processing of Mel-Spectrograms in Isolated Speech Recognition. International Journal of Computer Applications. 183, 25 (Sep 2021), 11-17. DOI=10.5120/ijca2021921625
@article{10.5120/ijca2021921625,
  author = {Atharva Bankar and Aryan Gandhi and Dipali Baviskar},
  title = {Image and Signal Processing of Mel-Spectrograms in Isolated Speech Recognition},
  journal = {International Journal of Computer Applications},
  year = {2021},
  volume = {183},
  number = {25},
  pages = {11-17},
  doi = {10.5120/ijca2021921625},
  publisher = {Foundation of Computer Science (FCS), NY, USA}
}
%0 Journal Article
%D 2021
%A Atharva Bankar
%A Aryan Gandhi
%A Dipali Baviskar
%T Image and Signal Processing of Mel-Spectrograms in Isolated Speech Recognition
%J International Journal of Computer Applications
%V 183
%N 25
%P 11-17
%R 10.5120/ijca2021921625
%I Foundation of Computer Science (FCS), NY, USA
Speech is one of the fundamental modes of communication. The past decade has seen many advances in speech recognition systems, whose basic idea is to convert acoustic waveforms into human-readable text. This paper models an automatic speech recognition (speech-to-text) system that recognizes isolated words, one at a time. Word predictions are made using two approaches: image processing and signal processing. The paper presents the design of such a system, gives an overview of the techniques used at each stage of speech recognition, and compares the two approaches on accuracy and computation time. The techniques showcased in this study are used for feature extraction, and the extracted features are then used to identify 30 spoken commands with convolutional neural networks (CNNs).
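
To make the feature-extraction step concrete, the following is a minimal sketch of how a log-mel spectrogram could be computed from a waveform before it is passed to a CNN. It is not the authors' implementation: the sample rate (16 kHz), FFT size (512), hop length (160), number of mel bands (40), and all function names are assumptions chosen for illustration, since the abstract does not state them.

```python
import numpy as np

def hz_to_mel(f):
    # Common (HTK-style) Hz-to-mel conversion formula.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mels, n_fft, sr):
    # Triangular filters whose centres are equally spaced on the mel scale.
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, ctr, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (fft_freqs - lo) / (ctr - lo)
        falling = (hi - fft_freqs) / (hi - ctr)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr=16000, n_fft=512, hop=160, n_mels=40):
    # Parameter defaults are illustrative assumptions, not values from the paper.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[t * hop : t * hop + n_fft] * window
                       for t in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # (n_frames, n_fft//2 + 1)
    mel = power @ mel_filterbank(n_mels, n_fft, sr).T     # (n_frames, n_mels)
    return 10.0 * np.log10(mel + 1e-10).T                 # (n_mels, n_frames), in dB
```

The resulting two-dimensional array (mel bands by time frames) can be treated either as a signal-domain feature matrix or rendered as an image, which is one plausible reading of the two processing routes the abstract compares.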