Speech Recognition Code Python

DePasqualeOrg/mlx-audio-plus

The best audio processing library built on Apple's MLX framework, providing fast and efficient text-to-speech (TTS), speech-to-text (STT), and speech-to-speech (STS) on Apple Silicon. Kokoro Fast, ...

IEEE

A Knowledge Distillation-Based Approach to Speech Emotion Recognition

Abstract: Due to rapid advancements in deep learning, Transformer-based architectures have proven effective in speech emotion recognition (SER), largely due to their ability to model long-term ...

IEEE

Universal Robust Speech Adaptation for Cross-Domain Speech Recognition and Enhancement

Abstract: Pre-trained models for automatic speech recognition (ASR) and speech enhancement (SE) have exhibited remarkable capabilities under matched noise and channel conditions. However, these models ...

GitHub

Fine-Grained Emotion Recognition via In-Context Learning

This repository contains the code for the paper "For more detailed results and analysis, please refer to: Fine-Grained Emotion Recognition via In-Context Learning" published at CIKM 2025. Extract ...

VentureBeat

Mistral drops Voxtral Transcribe 2, an open-source speech model that runs on-device for pennies

Mistral AI, the Paris-based startup positioning itself as Europe's answer to OpenAI, released a pair of speech-to-text models on Wednesday that the company says can transcribe audio faster, more ...

Microsoft

Paza: Introducing automatic speech recognition benchmarks and models for low resource languages

According to the 2025 Microsoft AI Diffusion Report approximately one in six people globally had used a generative AI product. Yet for billions of people, the promise of voice interaction still falls ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results