Speech Recognition Module in Python

Multimodal Speech Recognition Assisted by Slide Information in Classroom Scenes

Abstract: Multimodal automatic speech recognition (ASR) technology has attracted much attention because it improves the accuracy of speech recognition by adding other modal information. However, most ...

GitHub

DePasqualeOrg/mlx-audio-plus

The best audio processing library built on Apple's MLX framework, providing fast and efficient text-to-speech (TTS), speech-to-text (STT), and speech-to-speech (STS) on Apple Silicon. Kokoro Fast, ...

IEEE

Leo: Personal Desktop Assistant Using Python

Abstract: Personal assistants or the desktop assistant have proven to be very useful in daily life as they made our work easier. If the user wants to perform some action without using their hands, ...

VentureBeat

Mistral drops Voxtral Transcribe 2, an open-source speech model that runs on-device for pennies

Mistral AI, the Paris-based startup positioning itself as Europe's answer to OpenAI, released a pair of speech-to-text models on Wednesday that the company says can transcribe audio faster, more ...

Microsoft

Paza: Introducing automatic speech recognition benchmarks and models for low resource languages

According to the 2025 Microsoft AI Diffusion Report approximately one in six people globally had used a generative AI product. Yet for billions of people, the promise of voice interaction still falls ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results