Open Access
ITM Web Conf.
Volume 82, 2026
International Conference on NextGen Engineering Technologies and Applications for Sustainable Development (ICNEXTS’25)
Article Number 02002
Number of page(s) 8
Section Communication and Networking
DOI https://doi.org/10.1051/itmconf/20268202002
Published online 04 February 2026
  1. R. F. Gramaccioni, C. Marinoni, C. Chen, A. Uncini and D. Comminiello, “L3DAS23: Learning 3D Audio Sources for Audio-Visual Extended Reality,” in IEEE Open Journal of Signal Processing, vol. 5, pp. 632-640, 2024, doi: 10.1109/OJSP.2024.3376297.
  2. C. Rinaldi et al., “Immersive Acoustics via Next-Generation Networks: Achieving High-Fidelity Audio in the Metaverse,” in IEEE Open Journal of the Communications Society, doi: 10.1109/OJCOMS.2025.3609467.
  3. Y. Hadadi, H. Beit-On, V. Tourbabin, Z. Ben-Hur, D. L. Alon and B. Rafaely, “Blind Localization of Early Room Reflections Based on Microphone Arrays and Reverberant Speech,” in IEEE Transactions on Audio, Speech and Language Processing, vol. 33, pp. 3401-3413, 2025, doi: 10.1109/TASLPRO.2025.3594297.
  4. A. Pawlak, H. Lee, A. Mäkivirta and T. Lund, “Spatial Analysis and Synthesis Methods: Subjective and Objective Evaluations Using Various Microphone Arrays in the Auralization of a Critical Listening Room,” in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 3986-4001, 2024, doi: 10.1109/TASLP.2024.3449037.
  5. C. H. Heng, M. Toyoura, C. S. Leow and H. Nishizaki, “Analysis of Classroom Processes Based on Deep Learning With Video and Audio Features,” in IEEE Access, vol. 12, pp. 110705-110712, 2024, doi: 10.1109/ACCESS.2024.3434742.
  6. Y. Hou and S. Denno, “WLAN Channel Status Duration Prediction for Audio and Video Services Using Probabilistic Neural Networks,” in IEEE Access, vol. 12, pp. 28201-28211, 2024, doi: 10.1109/ACCESS.2024.3365188.
  7. V. Välimäki, K. Prawda and S. J. Schlecht, “Two-Stage Attenuation Filter for Artificial Reverberation,” in IEEE Signal Processing Letters, vol. 31, pp. 391-395, 2024, doi: 10.1109/LSP.2024.3352510.
  8. M. Lugasi, J. Donley, A. Menon, V. Tourbabin and B. Rafaely, “Multi-Channel to Multi-Channel Noise Reduction and Reverberant Speech Preservation in Time-Varying Acoustic Scenes for Binaural Reproduction,” in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 3283-3295, 2024, doi: 10.1109/TASLP.2024.3416668.
  9. M. Neri, A. Politis, D. A. Krause, M. Carli and T. Virtanen, “Speaker Distance Estimation in Enclosures From Single-Channel Audio,” in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 2242-2254, 2024, doi: 10.1109/TASLP.2024.3382504.
  10. M. Jälmby, F. Elvander and T. van Waterschoot, “Multi-Channel Low-Rank Convolution of Jointly Compressed Room Impulse Responses,” in IEEE Open Journal of Signal Processing, vol. 5, pp. 850-857, 2024, doi: 10.1109/OJSP.2024.3410089.
  11. M. Lemke, A. Hölter and S. Weinzierl, “Physics-Informed Identification and Interpolation of the Directivity of Sound Sources,” in IEEE Transactions on Audio, Speech and Language Processing, vol. 33, pp. 372-385, 2025, doi: 10.1109/TASLP.2024.3519872.
  12. E. d'Olne, A. H. Moore, P. A. Naylor, J. Donley, V. Tourbabin and T. Lunner, “Group Conversations in Noisy Environments (GiN) – Multimedia Recordings for Location-Aware Speech Enhancement,” in IEEE Open Journal of Signal Processing, vol. 5, pp. 374-382, 2024, doi: 10.1109/OJSP.2023.3344379.
  13. D. A. Krause, G. García-Barrios, A. Politis and A. Mesaros, “Binaural Sound Source Distance Estimation and Localization for a Moving Listener,” in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 996-1011, 2024, doi: 10.1109/TASLP.2023.3346297.
  14. H. Raditya Pratama Roosadi, R. Rivai Ginanjar and D. Puji Lestari, “Indonesian Voice Cloning Text-to-Speech System With Vall-E-Based Model and Speech Enhancement,” in IEEE Access, vol. 12, pp. 193131-193140, 2024, doi: 10.1109/ACCESS.2024.3519870.
  15. O. B. Zaken, A. Kumar, V. Tourbabin and B. Rafaely, “Neural-Network-Based Direction-of-Arrival Estimation for Reverberant Speech - The Importance of Energetic, Temporal, and Spatial Information,” in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 1298-1309, 2024, doi: 10.1109/TASLP.2024.3357037.
  16. E. Seidel, P. Mowlaee and T. Fingscheidt, “Convergence and Performance Analysis of Classical, Hybrid, and Deep Acoustic Echo Control,” in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 2857-2870, 2024, doi: 10.1109/TASLP.2024.3402552.
  17. N. L. Westhausen, H. Kayser, T. Jansen and B. T. Meyer, “Real-Time Multichannel Deep Speech Enhancement in Hearing Aids: Comparing Monaural and Binaural Processing in Complex Acoustic Scenarios,” in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 4596-4606, 2024, doi: 10.1109/TASLP.2024.3473315.
  18. H. Yamashita et al., “Fast Neural Speech Waveform Generative Models With Fully-Connected Layer-Based Upsampling,” in IEEE Access, vol. 12, pp. 31409-31421, 2024, doi: 10.1109/ACCESS.2024.3366707.
  19. C. Geishauser et al., “Learning With an Open Horizon in Ever-Changing Dialogue Circumstances,” in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 32, pp. 2352-2366, 2024, doi: 10.1109/TASLP.2024.3385289.
  20. D. Lim, H. Kang, B. Choi, W. Hong and J. Lee, “An Interpersonal Dynamics Analysis Procedure With Accurate Voice Activity Detection Using Low-Cost Recording Sensors,” in IEEE Access, vol. 12, pp. 68427-68440, 2024, doi: 10.1109/ACCESS.2024.3387279.
