Abstract: The goal of this paper is to generate realistic audio with a lightweight and fast diffusion-based vocoder, named FreGrad. Our framework consists of the following three key components: (1) We ...
Abstract: Emotional voice conversion (EVC) transforms the emotional state of speech while preserving linguistic content and speaker identity. Although sequence-to-sequence models have achieved ...