Clone a voice in 5 seconds to generate arbitrary speech in real-time
-
Updated
Nov 11, 2023 - Python
Clone a voice in 5 seconds to generate arbitrary speech in real-time
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
A Python/Pytorch app for easily synthesising human voices
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
An implementation of Tacotron 2 that supports multilingual experiments with parameter-sharing, code-switching, and voice cloning.
A webui for different audio related Neural Networks
PAddle PARAllel text-to-speech toolKIT (supporting Tacotron2, Transformer TTS, FastSpeech2/FastPitch, SpeedySpeech, WaveFlow and Parallel WaveGAN)
singing voice change based on whisper, and lora for singing voice clone
The code for the bark-voicecloning model. Training and inference.
Voice Conversion by CycleGAN (语音克隆/语音转换): CycleGAN-VC2
Wunjo AI: Synthesize & clone voices in English, Russian & Chinese, real-time speech recognition, deepfake face & lips animation, face swap with one photo, change video by text prompts, segmentation, and retouching. Open-source, local & free.
This repository has implementation for "Neural Voice Cloning With Few Samples"
Phoneme multilingual(Russian-English) voice cloning based on
Tacotron 2 - PyTorch implementation with faster-than-realtime inference modified to enable cross lingual voice cloning.
Implementation of Neural Voice Cloning with Few Samples Research Paper by Baidu
[WIP] VoiceSmith makes training text to speech models easy.
A guide to clone anyone's voice and use it as a text-to-speech with android
This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real-time. SV2TTS is a three-stage deep learning framework that allows to create a numerical representation of a voice from a few seconds of audio, and to use it to condition a text-to…
Add a description, image, and links to the voice-cloning topic page so that developers can more easily learn about it.
To associate your repository with the voice-cloning topic, visit your repo's landing page and select "manage topics."