Issue: ITM Web Conf., Volume 76, 2025 — Harnessing Innovation for Sustainability in Computing and Engineering Solutions (ICSICE-2025)
Article Number: 01003
Number of pages: 9
Section: Artificial Intelligence & Machine Learning
DOI: https://doi.org/10.1051/itmconf/20257601003
Published online: 25 March 2025
Leveraging Artificial Neural Networks for Real-Time Speech Recognition in Voice-Activated Systems
1 Professor, Department of ECE, SIMATS Engineering, Saveetha Institute of Medical and Technical Sciences, Chennai, Tamil Nadu, India
2 Department of Computer Science and Engineering, MLR Institute of Technology, Hyderabad, Telangana, India
3 Assistant Professor, Department of Artificial Intelligence & Data Science, J.J.College of Engineering and Technology, Tiruchirappalli, Tamil Nadu, India
4 Department of Electrical and Electronics Engineering, RGM College of Engineering and Technology (Autonomous), Nandyal, Andhra Pradesh, India
5 Assistant Professor, Department of Electronics & Communication Engineering, University BDT College of Engineering, Davanagere, Karnataka, India
6 Assistant Professor, Department of ECE, New Prince Shri Bhavani College of Engineering and Technology Chennai, Tamil Nadu, India
sureshvekumar@gmail.com
bravindra64@gmail.com
sureshkumarcse2022@gmail.com
surianisetty@gmail.com
arunrajsr5@gmail.com
sheeba@newprinceshribhavani.com
Advances in artificial neural networks have further shaped the domain of real-time voice-activated speech recognition systems. Progress has historically been hindered by language bias, computational cost, and background noise. This paper offers a fresh perspective on these challenges, broadening the accessibility and real-world applicability of state-of-the-art models. We advocate a multi-dimensional methodology, combining model contextualization, tailored neural architectures, and personalized learning strategies, to optimize recognition quality on resource-constrained hardware. A major limitation of existing speech recognition systems is their coverage of only a few languages, dialects, and accents. This study introduces a multilingual and multicultural model to address this limitation and to extend the benefits of unbiased technology to all parts of society; through regional adaptation, the model handles a wider range of speech variants with greater precision. The efficient architecture also improves computational efficiency, enabling real-time processing on low-power devices and meeting the rising demand for speech recognition in mobile and edge computing settings. Contextual understanding remains a challenge, as mispronunciations or dialectal deviations tend to cause recognition errors; the present study therefore employs semantic analysis and natural language processing (NLP) techniques to support understanding across languages. Such capabilities matter in applications like medical and legal transcription and customer support, where accurate transcription is critical. Additionally, the architecture enhances real-time processing by reducing latency and increasing responsiveness, which is essential in emergency response systems and autonomous vehicles, where timely decision-making is crucial.
By enhancing the efficiency and accuracy of ANN-based speech recognition, this research drives advancements toward more accessible, effective, and reliable voice-activated technologies.
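To make the frame-by-frame processing described above concrete, the following is a minimal, illustrative sketch of a streaming pipeline: short audio frames are converted to log spectral features and passed through a small feed-forward network. All sizes, helper names, and the random (untrained) weights are hypothetical illustrations, not the authors' actual model or parameters.

```python
import numpy as np

SAMPLE_RATE = 16000
FRAME_LEN = 400     # 25 ms frames at 16 kHz
HOP = 160           # 10 ms hop between frames
N_FEATS = 40        # number of pooled log-spectral features per frame
N_CLASSES = 10      # e.g. ten hypothetical voice commands

rng = np.random.default_rng(0)

def frame_features(frame):
    """Hamming-windowed log power spectrum, pooled into N_FEATS bands."""
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    bands = np.array_split(power, N_FEATS)
    return np.log(np.array([b.mean() for b in bands]) + 1e-10)

# A tiny two-layer network with random weights, standing in for a trained ANN.
W1 = rng.normal(0.0, 0.1, (N_FEATS, 64))
W2 = rng.normal(0.0, 0.1, (64, N_CLASSES))

def classify(feats):
    hidden = np.tanh(feats @ W1)   # hidden layer with tanh activation
    logits = hidden @ W2           # output scores, one per class
    return int(np.argmax(logits))

# Simulate real-time streaming: slide over one second of synthetic audio.
signal = rng.normal(0.0, 1.0, SAMPLE_RATE)
predictions = [
    classify(frame_features(signal[start:start + FRAME_LEN]))
    for start in range(0, len(signal) - FRAME_LEN + 1, HOP)
]
```

In a deployed system, each frame would be classified as it arrives from the microphone, which is what keeps per-frame latency low enough for edge devices; a trained network and learned filter banks would replace the random weights and the crude band pooling used here.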
Key words: Artificial Neural Networks / Speech Recognition / Voice-Activated Systems / Real-Time Processing / Multilingual Support / Computational Efficiency
© The Authors, published by EDP Sciences, 2025
This is an Open Access article distributed under the terms of the Creative Commons Attribution License 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.