Real-Time Voice Cloning

Clone a voice in 5 seconds to generate arbitrary speech in real-time.



This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real time. Feel free to check my thesis if you're curious or if you're looking for information I haven't documented yet (don't hesitate to open an issue for that too). I mostly recommend taking a quick look at the figures beyond the introduction.

SV2TTS is a three-stage deep learning framework that creates a numerical representation of a voice from a few seconds of audio and uses it to condition a text-to-speech model trained to generalize to new voices.
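The data flow between the three stages can be illustrated with a toy sketch. It is not the repository's code and contains no neural networks: every function name, the `EMBED_DIM` value, and the arithmetic inside each stage are hypothetical stand-ins chosen only to show how a fixed-size voice representation from stage one conditions stage two, whose output feeds stage three.

```python
import math
from typing import List

# Hypothetical embedding size, kept tiny for illustration only.
EMBED_DIM = 8


def encode_speaker(reference_audio: List[float]) -> List[float]:
    """Stage 1 stand-in: map audio of any length to a fixed-size, unit-norm vector."""
    sums = [0.0] * EMBED_DIM
    for i, sample in enumerate(reference_audio):
        sums[i % EMBED_DIM] += sample
    # Normalize to unit length; fall back to 1.0 to avoid dividing by zero.
    norm = math.sqrt(sum(v * v for v in sums)) or 1.0
    return [v / norm for v in sums]


def synthesize_spectrogram(text: str, embedding: List[float]) -> List[List[float]]:
    """Stage 2 stand-in: emit one frame per character, conditioned on the embedding."""
    frames = []
    for ch in text:
        base = ord(ch) / 255.0
        # Conditioning here is a plain addition; a real synthesizer is a trained model.
        frames.append([base + e for e in embedding])
    return frames


def vocode(spectrogram: List[List[float]], samples_per_frame: int = 4) -> List[float]:
    """Stage 3 stand-in: expand each spectrogram frame into waveform samples."""
    waveform: List[float] = []
    for frame in spectrogram:
        level = sum(frame) / len(frame)
        for k in range(samples_per_frame):
            waveform.append(level * math.sin(2 * math.pi * k / samples_per_frame))
    return waveform


def clone_voice(reference_audio: List[float], text: str) -> List[float]:
    """Chain the three stages: voice embedding, then spectrogram, then waveform."""
    embedding = encode_speaker(reference_audio)
    spectrogram = synthesize_spectrogram(text, embedding)
    return vocode(spectrogram)


if __name__ == "__main__":
    # A synthetic signal stands in for a few seconds of recorded speech.
    reference = [math.sin(0.1 * n) for n in range(1000)]
    audio = clone_voice(reference, "hello")
    print(len(audio), "samples generated")
```

The point of the sketch is the interface between stages: the embedding has the same size regardless of how long the reference audio is, and the same text produces different output for different embeddings.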

Video demonstration (click the picture):

Toolbox demo

Papers implemented

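Key metrics

Overview
Name and owner: CorentinJ/Real-Time-Voice-Cloning
Primary language: Python
Languages: Python (language count: 1)
Platforms: Linux, Mac, Windows
License: Other
Owner activity
Created: 2019-05-26 08:56:15
Last pushed: 2024-08-14 19:54:03
Last commit: 2024-05-29 14:04:38
Releases: 0
User engagement
Stars: 54.1k
Watchers: 0.9k
Forks: 8.9k
Commits: 297
Issues enabled?
Issues: 1092
Open issues: 200
Pull requests: 48
Open pull requests: 22
Closed pull requests: 74
Project settings
Wiki enabled?
Archived?
Is fork?
Locked?
Is mirror?
Private?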