Open Access

| Issue | ITM Web Conf., Volume 54 (2023): 2nd International Conference on Advances in Computing, Communication and Security (I3CS-2023) |
|---|---|
| Article Number | 01016 |
| Number of page(s) | 8 |
| Section | Computing |
| DOI | https://doi.org/10.1051/itmconf/20235401016 |
| Published online | 04 July 2023 |
- Baker, James. “The DRAGON system--An overview.” IEEE Transactions on Acoustics, Speech, and Signal Processing 23.1 (1975): 24–29.
- Deshmukh, Akshay Madhav. “Comparison of hidden Markov model and recurrent neural network in automatic speech recognition.” European Journal of Engineering and Technology Research 5.8 (2020): 958–965.
- Forsberg, Markus. “Why is speech recognition difficult.” Chalmers University of Technology (2003).
- Dua, Mohit. “Gujarati Language Automatic Speech Recognition Using Integrated Feature Extraction and Hybrid Acoustic Model.” Proceedings of the Fourth International Conference on Communication, Computing and Electronics Systems: ICCCES 2022. Singapore: Springer Nature Singapore, 2023.
- Chakravarty, Nidhi, and Mohit Dua. “Spoof Detection using Sequentially Integrated Image and Audio Features.” International Journal of Computing and Digital Systems 13.1 (2023): 1–1.
- Amodei, Dario, et al. “Deep Speech 2: End-to-end speech recognition in English and Mandarin.” International Conference on Machine Learning. PMLR, 2016.
- Graves, Alex, et al. “Connectionist temporal classification: Labelling unsegmented sequence data with recurrent neural networks.” Proceedings of the 23rd International Conference on Machine Learning. 2006.
- Bourlard, Herve A., and Nelson Morgan. Connectionist Speech Recognition: A Hybrid Approach. Vol. 247. Springer Science & Business Media, 1994.
- Raval, Deepang, et al. “Improving Deep Learning based Automatic Speech Recognition for Gujarati.” Transactions on Asian and Low-Resource Language Information Processing 21.3 (2021): 1–18.
- Zhang, Shaohua, et al. “Spelling error correction with soft-masked BERT.” arXiv preprint arXiv:2005.07421 (2020).
- Toshniwal, Shubham, et al. “Multilingual speech recognition with a single end-to-end model.” 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2018.
- Billa, Jayadev. “ISI ASR System for the Low Resource Speech Recognition Challenge for Indian Languages.” INTERSPEECH. 2018.
- Sak, Hasim, Andrew Senior, and Françoise Beaufays. “Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition.” arXiv preprint arXiv:1402.1128 (2014).
- Schuster, M., and K.K. Paliwal. “Bidirectional recurrent neural networks.” IEEE Transactions on Signal Processing 45 (1997): 2673–2681.
- Graves, Alex, and Navdeep Jaitly. “Towards end-to-end speech recognition with recurrent neural networks.” International Conference on Machine Learning. PMLR, 2014.
- Hannun, Awni, et al. “Deep Speech: Scaling up end-to-end speech recognition.” arXiv preprint arXiv:1412.5567 (2014).
- Maas, Andrew, et al. “Lexicon-free conversational speech recognition with neural networks.” Proceedings of the 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2015.