Real-Time Voice Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time.

This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real-time. Feel free to check my thesis if you're curious or if you're looking for info I haven't documented yet (don't hesitate to open an issue for that too). I would mostly recommend taking a quick look at the figures beyond the introduction.

SV2TTS is a three-stage deep learning framework that creates a numerical representation of a voice from a few seconds of audio and uses it to condition a text-to-speech model trained to generalize to new voices.
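To make the data flow between the three stages concrete, here is a minimal sketch using toy stand-ins. Everything in it is an assumption made for illustration: the function names (`encode_speaker`, `synthesize_spectrogram`, `vocode`, `clone_voice`), the embedding size, and the arithmetic inside each stage are hypothetical and are not this repository's API or models. It only shows the shape of the pipeline: variable-length audio becomes a fixed-size vector, that vector conditions the text-to-spectrogram step, and the spectrogram is turned into samples.

```python
import math
from typing import List

EMBED_DIM = 8  # assumption: toy embedding size, chosen only for this sketch


def encode_speaker(wav: List[float], dim: int = EMBED_DIM) -> List[float]:
    """Stage 1 stand-in: map variable-length audio to a fixed-size, unit-norm vector."""
    if not wav:
        raise ValueError("reference audio is empty")
    chunk = max(1, len(wav) // dim)
    feats = []
    for i in range(dim):
        seg = wav[i * chunk:(i + 1) * chunk] or [0.0]
        feats.append(sum(abs(s) for s in seg) / len(seg))  # mean magnitude per chunk
    norm = math.sqrt(sum(f * f for f in feats)) or 1.0
    return [f / norm for f in feats]


def synthesize_spectrogram(text: str, embed: List[float], n_mels: int = 4) -> List[List[float]]:
    """Stage 2 stand-in: one frame per character, offset by the speaker embedding."""
    frames = []
    for ch in text:
        base = (ord(ch) % 32) / 32.0
        frames.append([base + embed[m % len(embed)] for m in range(n_mels)])
    return frames


def vocode(spec: List[List[float]], hop: int = 16) -> List[float]:
    """Stage 3 stand-in: expand each frame into `hop` waveform samples."""
    wav = []
    for frame in spec:
        level = sum(frame) / len(frame)
        for n in range(hop):
            wav.append(level * math.sin(2 * math.pi * n / hop))
    return wav


def clone_voice(reference_wav: List[float], text: str) -> List[float]:
    """Chain the three stages: embed the voice, condition synthesis on it, vocode."""
    embed = encode_speaker(reference_wav)
    spec = synthesize_spectrogram(text, embed)
    return vocode(spec)


if __name__ == "__main__":
    reference = [math.sin(0.01 * i) for i in range(800)]  # fake "recording"
    samples = clone_voice(reference, "hello")
    print(len(samples), "samples generated")
```

The point of the split is that the embedding is computed once per voice and can be reused for any number of sentences; only the second and third stages run per utterance.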

Video demonstration: Toolbox demo

Papers implemented

Main metrics

Overview
Name With Owner: CorentinJ/Real-Time-Voice-Cloning
Primary Language: Python
Programming Languages: Python (Language Count: 1)
Platform: Linux, Mac, Windows
License: Other
Owner Activity
Created At: 2019-05-26 08:56:15
Pushed At: 2024-08-14 19:54:03
Last Commit At: 2024-05-29 14:04:38
Release Count: 0
User Engagement
Stargazers Count: 54.1k
Watchers Count: 0.9k
Fork Count: 8.9k
Commits Count: 297
Has Issues Enabled
Issues Count: 1092
Issue Open Count: 200
Pull Requests Count: 48
Pull Requests Open Count: 22
Pull Requests Close Count: 74
Project Settings
Has Wiki Enabled
Is Archived
Is Fork
Is Locked
Is Mirror
Is Private