GitHub - modelscope/ClearerVoice-Studio: An AI-Powered Speech Processing Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Enhancement, Separation, and Target Speaker Extraction, etc.

ClearerVoice-Studio is an open-source, AI-powered speech processing toolkit designed for researchers, developers, and end-users. It supports speech enhancement, speech separation, target speaker extraction, and more. The toolkit includes state-of-the-art pre-trained models, along with training and inference scripts, all accessible from this repository.

👉🏻ClearVoice Demo👈🏻 | 👉🏻SpeechScore Demo👈🏻

Please support our community project 💖 by starring it on GitHub ⭐ 🙏

News 🔥

[2024.12] Uploaded pre-trained models to ModelScope. Users can now download the models from either ModelScope or Hugging Face
[2024.11] FRCRN speech denoiser has been used over 2.8 million times on ModelScope
[2024.11] MossFormer speech separator has been used over 2.5 million times on ModelScope
[2024.11] Release of this repository
Upcoming: More tasks will be added to ClearVoice.

🌟 Why Choose ClearerVoice-Studio?

Pre-Trained Models: Includes cutting-edge pre-trained models, fine-tuned on extensive, high-quality datasets. No need to start from scratch!
Ease of Use: Designed for seamless integration with your projects, offering a simple yet flexible interface for inference and training.
Comprehensive Features: Combines advanced algorithms for multiple speech processing tasks in one platform.
Community-Driven: Built for researchers, developers, and enthusiasts to collaborate and innovate together.

Contents of this repository

This repository is organized into three main components: ClearVoice, Train, and SpeechScore.

1. ClearVoice

ClearVoice offers a user-friendly solution for speech processing tasks such as speech denoising, separation, audio-visual target speaker extraction, and more. It is designed as a unified inference platform that leverages pre-trained models (e.g., FRCRN, MossFormer), all trained on extensive datasets. If you're looking for a tool to improve speech quality, ClearVoice is a natural choice. Simply click on ClearVoice and follow our detailed instructions to get started.

2. Train

For advanced researchers and developers, we provide fine-tuning and training scripts for all the tasks offered in ClearVoice and more:

Task 1: Speech enhancement (16kHz & 48kHz)
Task 2: Speech separation (8kHz & 16kHz)
Task 3: Target speaker extraction
- Sub-Task 1: Audio-only Speaker Extraction Conditioned on a Reference Speech (8kHz)
- Sub-Task 2: Audio-visual Speaker Extraction Conditioned on Face (Lip) Recording (16kHz)
- Sub-Task 3: Audio-visual Speaker Extraction Conditioned on Body Gestures (16kHz)
- Sub-Task 4: Neuro-steered Speaker Extraction Conditioned on EEG Signals (16kHz)
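
Because each task above is tied to specific sample rates, it can help to check an audio file's rate before picking a training or fine-tuning recipe. The following is a minimal sketch using only the Python standard library. It is not part of the repository's scripts: the task keys and both function names are hypothetical labels chosen for illustration, and only the rates themselves are taken from the list above.

```python
import wave

# Sample rates (Hz) per task, copied from the task list above.
# The dictionary keys are hypothetical labels, not identifiers used by the toolkit.
SUPPORTED_RATES = {
    "speech_enhancement": {16000, 48000},
    "speech_separation": {8000, 16000},
    "tse_audio_only_reference": {8000},
    "tse_audio_visual_lip": {16000},
    "tse_audio_visual_gesture": {16000},
    "tse_neuro_steered_eeg": {16000},
}


def is_rate_supported(task, sample_rate):
    """Return True if sample_rate (Hz) is listed for the given task label."""
    if task not in SUPPORTED_RATES:
        raise KeyError(f"unknown task label: {task!r}")
    return sample_rate in SUPPORTED_RATES[task]


def wav_sample_rate(path):
    """Read the sample rate (Hz) from the header of a PCM WAV file."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()
```

For example, `is_rate_supported("speech_separation", wav_sample_rate("mixture.wav"))` would indicate whether a (hypothetical) `mixture.wav` needs resampling before it can be used for the speech separation task.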

Contributors are welcome to add more model architectures and tasks!

3. SpeechScore

SpeechScore is a speech quality assessment toolkit. We include it here to evaluate and compare the performance of different models. SpeechScore includes many popular speech metrics:

Signal-to-Noise Ratio (SNR)
Perceptual Evaluation of Speech Quality (PESQ)
Short-Time Objective Intelligibility (STOI)
Deep Noise Suppression Mean Opinion Score (DNSMOS)
Scale-Invariant Signal-to-Distortion Ratio (SI-SDR)
and many more quality benchmarks
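
To make two of the listed metrics concrete, here is a minimal sketch of SNR and SI-SDR computed from a clean reference signal and a processed estimate. It uses the standard textbook definitions in plain Python for readability. It is not SpeechScore's implementation or API, and the function names are assumptions made for this example.

```python
import math

EPS = 1e-12  # guards against division by zero and log10(0)


def _energy(signal):
    """Sum of squared samples."""
    return sum(v * v for v in signal)


def snr_db(reference, estimate):
    """SNR in dB: reference energy over the energy of (reference - estimate)."""
    noise = [r - e for r, e in zip(reference, estimate)]
    return 10.0 * math.log10((_energy(reference) + EPS) / (_energy(noise) + EPS))


def si_sdr_db(reference, estimate):
    """Scale-invariant SDR in dB for two equal-length signals."""
    # Remove the mean so a constant offset does not affect the score.
    ref_mean = sum(reference) / len(reference)
    est_mean = sum(estimate) / len(estimate)
    ref = [r - ref_mean for r in reference]
    est = [e - est_mean for e in estimate]
    # Project the estimate onto the reference to obtain the scaled target.
    alpha = sum(e * r for e, r in zip(est, ref)) / (_energy(ref) + EPS)
    target = [alpha * r for r in ref]
    residual = [e - t for e, t in zip(est, target)]
    return 10.0 * math.log10((_energy(target) + EPS) / (_energy(residual) + EPS))


# Synthetic example: a 220 Hz tone at 16 kHz plus a weaker 3 kHz interferer.
clean = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(1600)]
noisy = [c + 0.1 * math.sin(2 * math.pi * 3000 * t / 16000) for t, c in enumerate(clean)]
print(f"SNR:    {snr_db(clean, noisy):.2f} dB")
print(f"SI-SDR: {si_sdr_db(clean, noisy):.2f} dB")
```

The difference between the two shows up when the estimate is rescaled: multiplying `noisy` by a constant gain changes the SNR but leaves the SI-SDR essentially unchanged, which is why SI-SDR is often preferred when a model's output level is arbitrary.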

Contact

If you have any comments or questions about ClearerVoice-Studio, feel free to raise an issue in this repository or contact us directly at:

email: {shengkui.zhao, zexu.pan}@alibaba-inc.com

Alternatively, you are welcome to join our DingTalk and WeChat groups to share and discuss algorithms, technology, and user experience feedback. Scan the following QR codes to join our official chat groups.

Friend Links

Check out some awesome GitHub repositories from the Speech Lab of the Institute for Intelligent Computing, Alibaba Group.

Acknowledgements

ClearerVoice-Studio contains third-party components and code modified from several open-source repositories, including:
SpeechBrain, ESPnet, TalkNet-ASD
