Real-Time Voice Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time.

Owner: Corentin Jemine
Platform: Linux, Mac, Windows
License: Other
Category: Python, Deep learning
Topic: python, deep-learning, tensorflow, pytorch, tts, voice-cloning

Real-Time Voice Cloning

This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real-time. Feel free to check my thesis if you're curious or looking for information I haven't documented yet (don't hesitate to open an issue for that too). I mostly recommend taking a quick look at the figures beyond the introduction.

SV2TTS is a three-stage deep learning framework that creates a numerical representation of a voice from a few seconds of audio and uses it to condition a text-to-speech model trained to generalize to new voices.
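
The data flow described above can be sketched as three functions chained together. This is a toy illustration only, not the repository's code: the names `embed_voice`, `synthesize_frames`, `vocode` and `clone_voice` are hypothetical, and each stage is replaced by simple arithmetic where the real framework would use a trained neural network. It assumes the three stages are a speaker encoder, a synthesizer and a vocoder.

```python
import math


def embed_voice(samples, dim=8):
    """Stage 1 (toy speaker encoder): map audio of any length to a
    fixed-size, unit-length vector. A real encoder is a trained network."""
    chunk = max(1, len(samples) // dim)
    feats = []
    for i in range(dim):
        part = samples[i * chunk:(i + 1) * chunk]
        # Mean absolute amplitude of each chunk stands in for learned features.
        feats.append(sum(abs(s) for s in part) / len(part) if part else 0.0)
    norm = math.sqrt(sum(f * f for f in feats)) or 1.0
    return [f / norm for f in feats]


def synthesize_frames(text, embedding):
    """Stage 2 (toy synthesizer): one acoustic frame per character,
    conditioned on the voice embedding."""
    return [[(ord(ch) % 32) / 32.0 * e for e in embedding] for ch in text]


def vocode(frames):
    """Stage 3 (toy vocoder): flatten the frames into waveform samples."""
    return [value for frame in frames for value in frame]


def clone_voice(reference_samples, text):
    """Chain the three stages: audio -> embedding -> frames -> waveform."""
    embedding = embed_voice(reference_samples)
    frames = synthesize_frames(text, embedding)
    return vocode(frames)


# One second of a 220 Hz tone at 16 kHz stands in for the reference recording.
reference = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
waveform = clone_voice(reference, "hello")
print(len(waveform))
```

The point of the sketch is the interface between stages: the embedding is computed once from the reference audio and is the only voice-specific input the synthesizer receives, so new text can be rendered without re-processing the recording.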

Main metrics

Overview

Name With Owner	CorentinJ/Real-Time-Voice-Cloning
Primary Language	Python
Programming Language	Python (Language Count: 1)
Platform	Linux, Mac, Windows
License	Other

Owner Activity

Created At	2019-05-26 16:56:15
Pushed At	2025-09-23 15:21:53
Last Commit At	2025-09-23 15:21:53
Release Count	0

User Engagement

Stargazers Count	58.7k
Watchers Count	0.9k
Fork Count	9.4k
Commits Count	299
Has Issues Enabled
Issues Count	1106
Issue Open Count	162
Pull Requests Count	49
Pull Requests Open Count	9
Pull Requests Close Count	88

Project Settings

Has Wiki Enabled
Is Archived
Is Fork
Is Locked
Is Mirror
Is Private