Real-Time Voice Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time.

This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real-time. Feel free to check my thesis if you're curious or if you're looking for information I haven't documented yet (don't hesitate to open an issue for that too). I mostly recommend taking a quick look at the figures beyond the introduction.

SV2TTS is a three-stage deep learning framework that creates a numerical representation of a voice from a few seconds of audio and uses it to condition a text-to-speech model trained to generalize to new voices.
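To make the data flow between the three stages easier to picture, here is a toy sketch in plain Python. It is not the repository's API: every function name (`encode_speaker`, `synthesize`, `vocode`, `clone_voice`) and the constant `EMBED_DIM` are hypothetical, and simple arithmetic stands in for the trained neural networks. It only illustrates the shape of the pipeline: variable-length reference audio becomes a fixed-size embedding, the embedding conditions the text-to-spectrogram step, and a vocoder turns the frames into waveform samples.

```python
import math
from typing import List

EMBED_DIM = 8  # hypothetical size, for illustration only


def encode_speaker(reference_audio: List[float], dim: int = EMBED_DIM) -> List[float]:
    """Stage 1 stand-in: map audio of any length to a fixed-size, unit-norm vector.

    A real speaker encoder is a trained network; here samples are simply
    summed into `dim` buckets and the result is L2-normalized.
    """
    buckets = [0.0] * dim
    for i, sample in enumerate(reference_audio):
        buckets[i % dim] += sample
    norm = math.sqrt(sum(b * b for b in buckets)) or 1.0  # avoid dividing by zero
    return [b / norm for b in buckets]


def synthesize(text: str, embedding: List[float]) -> List[List[float]]:
    """Stage 2 stand-in: produce one fake 'spectrogram frame' per character.

    Each frame depends on both the text and the speaker embedding, which is
    the sense in which the synthesizer is conditioned on the voice.
    """
    return [[(ord(ch) % 32) / 32.0 + e for e in embedding] for ch in text]


def vocode(frames: List[List[float]], samples_per_frame: int = 4) -> List[float]:
    """Stage 3 stand-in: expand each frame into a few waveform samples."""
    waveform: List[float] = []
    for frame in frames:
        level = sum(frame) / len(frame)
        waveform.extend([level] * samples_per_frame)
    return waveform


def clone_voice(reference_audio: List[float], text: str) -> List[float]:
    """Chain the three stages: encoder, then synthesizer, then vocoder."""
    embedding = encode_speaker(reference_audio)
    frames = synthesize(text, embedding)
    return vocode(frames)
```

In this sketch the embedding is computed once per speaker and could be reused for any number of sentences, which is the property that lets a few seconds of reference audio drive arbitrary text.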

Video demonstration (click the picture):

Toolbox demo

Papers implemented

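Key metrics

Overview
Name and owner: CorentinJ/Real-Time-Voice-Cloning
Primary language: Python
Languages: Python (language count: 1)
Platforms: Linux, Mac, Windows
License: Other

Owner activity
Created: 2019-05-26 08:56:15
Last pushed: 2024-08-14 19:54:03
Last commit: 2024-05-29 14:04:38
Releases: 0

User engagement
Stars: 54.1k
Watchers: 0.9k
Forks: 8.9k
Commits: 297
Issues enabled?
Issues: 1092
Open issues: 200
Pull requests: 48
Open pull requests: 22
Closed pull requests: 74

Project settings
Wiki enabled?
Archived?
Is a fork?
Locked?
Is a mirror?
Is private?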