Abstract: Audio feature selection and neural network architecture play crucial roles in speech recognition performance. This paper presents a comparative analysis of Artificial Neural Networks (ANNs) ...
By Atharva Agrawal. Growing up in the Tiger Capital of India, Nagpur, a city surrounded by some of the country’s most eminent wildlife sanctuaries, including Pen ...
This repo contains the code for our proposed method for DCASE 2025 Task 3: Stereo sound event localization and detection based on PSELDnet pretraining and BiMamba sequence modeling [1]. For more information, ...
A complete video subtitle translation pipeline with a modern web interface that uses OpenAI Whisper for speech-to-text transcription and Google Translate for multi-language subtitle generation.
Abstract: In recent years, environmental sound classification has become an essential component in intelligent urban monitoring systems, smart infrastructure, and public noise analysis. However, this ...