Real-Time Voice Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time.

This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real-time. Feel free to check my thesis if you're curious or if you're looking for info I haven't documented yet (don't hesitate to open an issue for that too). I mostly recommend a quick look at the figures beyond the introduction.

SV2TTS is a three-stage deep learning framework that creates a numerical representation of a voice from a few seconds of audio and uses it to condition a text-to-speech model trained to generalize to new voices.
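To make the three-stage data flow concrete, here is a minimal sketch. It assumes the stages are a speaker encoder, a synthesizer and a vocoder. Every function name, constant and formula below is a hypothetical stand-in chosen for illustration; none of it is this repository's API and no neural network is involved. It only shows which value is passed from one stage to the next.

```python
import math
import random
from typing import List

# All constants are hypothetical and far smaller than in a real system.
EMBED_DIM = 8   # size of the fixed-length voice representation
N_MELS = 4      # channels per spectrogram frame
HOP = 16        # waveform samples produced per spectrogram frame


def encode_speaker(reference_audio: List[float]) -> List[float]:
    """Stage 1 stand-in: variable-length audio -> fixed-size, unit-norm vector."""
    chunk = max(1, len(reference_audio) // EMBED_DIM)
    raw = [
        sum(abs(s) for s in reference_audio[i * chunk:(i + 1) * chunk])
        for i in range(EMBED_DIM)
    ]
    norm = math.sqrt(sum(v * v for v in raw)) or 1.0  # avoid dividing by zero
    return [v / norm for v in raw]


def synthesize_spectrogram(text: str, embedding: List[float]) -> List[List[float]]:
    """Stage 2 stand-in: one frame per character, shifted by the voice embedding."""
    bias = sum(embedding) / len(embedding)
    return [
        [((ord(ch) % 32) / 32.0 + bias) * (m + 1) / N_MELS for m in range(N_MELS)]
        for ch in text
    ]


def vocode(mel: List[List[float]]) -> List[float]:
    """Stage 3 stand-in: expand each spectrogram frame into HOP waveform samples."""
    wav: List[float] = []
    for frame in mel:
        energy = sum(frame) / len(frame)
        wav.extend(energy * math.sin(2 * math.pi * n / HOP) for n in range(HOP))
    return wav


def clone_voice(reference_audio: List[float], text: str) -> List[float]:
    """Chain the three stages: audio -> embedding -> spectrogram -> waveform."""
    embedding = encode_speaker(reference_audio)
    mel = synthesize_spectrogram(text, embedding)
    return vocode(mel)


rng = random.Random(0)
# Five seconds of noise at an assumed 16 kHz sample rate stands in for a recording.
reference = [rng.uniform(-1.0, 1.0) for _ in range(16000 * 5)]
waveform = clone_voice(reference, "hello")
print(len(waveform))
```

The point of the structure is that the embedding is computed once from the reference audio and then reused as a conditioning input for any text, which is why a few seconds of audio are enough to drive arbitrary speech.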

Video demonstration: Toolbox demo

Papers implemented

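Key metrics

Overview
Name and owner: CorentinJ/Real-Time-Voice-Cloning
Primary language: Python
Languages: Python (language count: 1)
Platforms: Linux, Mac, Windows
License: Other

Owner activity
Created: 2019-05-26 08:56:15
Pushed: 2025-05-30 11:41:05
Last commit: 2025-05-30 13:41:05
Releases: 0

User engagement
Stars: 54.5k
Watchers: 0.9k
Forks: 9k
Commits: 298
Issues enabled?
Issues: 1096
Open issues: 202
Pull requests: 48
Open pull requests: 21
Closed pull requests: 74

Project settings
Wiki enabled?
Archived?
Is a fork?
Locked?
Is a mirror?
Is private?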