babysor/Realtime-Voice-Clone-Chinese: AI voice cloning: clone your voice and generate arbitrary...

Source: https://github.com/babysor/Realtime-Voice-Clone-Chinese

This repository is forked from Real-Time-Voice-Cloning, which only supports English.

English | 中文

Features

🌍 Chinese: supports Mandarin, tested with multiple datasets: aidatatang_200zh, SLR68

🤩 PyTorch: tested with PyTorch 1.9.0 (the latest as of August 2021), on Tesla T4 and GTX 2060 GPUs

🌍 Windows + Linux: tested on both Windows and Linux (after fixing a few nits)

🤩 Easy & effective: only the synthesizer needs to be newly trained; the pretrained encoder and vocoder are reused

DEMO VIDEO

Quick Start

1. Install Requirements

Follow the original repo to check that your environment is ready. **Python 3.7 or higher** is needed to run the toolbox.

  • Install PyTorch.
  • Install ffmpeg.
  • Run pip install -r requirements.txt to install the remaining necessary packages.
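A quick way to confirm the basics are in place is a sketch like the one below; it is not part of the repo, just a minimal sanity check assuming PyTorch and ffmpeg were installed as above:

```python
# Minimal environment sanity check (a sketch, not part of the repo).
import shutil
import sys

import torch

assert sys.version_info >= (3, 7), "Python 3.7 or higher is required"
print("PyTorch version:", torch.__version__)           # repo tested with 1.9.0
print("CUDA available:", torch.cuda.is_available())    # True if a usable GPU is found
print("ffmpeg on PATH:", shutil.which("ffmpeg") is not None)
```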

2. Reuse the pretrained encoder/vocoder

Note that we need to specify the newly trained synthesizer model, since the original model is incompatible with the Chinese symbols. This means demo_cli is not working at the moment.
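A minimal sketch of what that looks like in code, assuming the module layout of the upstream Real-Time-Voice-Cloning repo; the checkpoint paths below are placeholders, not the repo's official file names:

```python
# Reuse the pretrained encoder/vocoder; only the synthesizer checkpoint is new.
# Module layout follows upstream Real-Time-Voice-Cloning; paths are placeholders.
from pathlib import Path

from encoder import inference as encoder
from synthesizer.inference import Synthesizer
from vocoder import inference as vocoder

encoder.load_model(Path("encoder/saved_models/pretrained.pt"))             # reused as-is
vocoder.load_model(Path("vocoder/saved_models/pretrained/pretrained.pt"))  # reused as-is
synthesizer = Synthesizer(Path("synthesizer/saved_models/mandarin/mandarin.pt"))  # newly trained
```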

3. Train synthesizer with aidatatang_200zh

  • Download the aidatatang_200zh dataset and unzip it: make sure you can access all the .wav files in the train folder (a quick check is sketched at the end of this step)

  • Preprocess the audios and mel spectrograms: python synthesizer_preprocess_audio.py <datasets_root>. The --dataset {dataset} parameter selects between aidatatang_200zh and SLR68

  • Preprocess the embeddings: python synthesizer_preprocess_embeds.py <datasets_root>/SV2TTS/synthesizer

  • Train the synthesizer: python synthesizer_train.py mandarin <datasets_root>/SV2TTS/synthesizer

  • Go to the next step when you see the attention line appear and the loss meets your needs; check progress in the training folder synthesizer/saved_models/

FYI, my attention line appeared after 18k steps and the loss dropped below 0.4 after 50k steps.
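For the dataset check mentioned in the first bullet above, something like the following works; the corpus/train layout and the datasets root are assumptions about how aidatatang_200zh unpacks, so adjust the paths to your setup:

```python
# Verify the unzipped dataset is readable before preprocessing.
# Directory layout and root path are assumptions; adjust to your setup.
from pathlib import Path

datasets_root = Path("~/datasets").expanduser()   # stands in for <datasets_root>
train_dir = datasets_root / "aidatatang_200zh" / "corpus" / "train"
wavs = sorted(train_dir.rglob("*.wav"))
print(f"Found {len(wavs)} wav files under {train_dir}")
```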

4. Launch the Toolbox

You can then try the toolbox:

python demo_toolbox.py -d <datasets_root>
or
python demo_toolbox.py
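For reference, the toolbox roughly wraps the pipeline below: embed a reference voice, synthesize a mel spectrogram, then vocode it. This is a sketch against the upstream Real-Time-Voice-Cloning inference API; the checkpoint paths, reference.wav, and the example sentence are all placeholders:

```python
# Sketch of the clone-and-synthesize pipeline behind the toolbox.
# Checkpoint paths, input file, and example text are placeholders.
from pathlib import Path

import numpy as np
import soundfile as sf

from encoder import inference as encoder
from synthesizer.inference import Synthesizer
from vocoder import inference as vocoder

encoder.load_model(Path("encoder/saved_models/pretrained.pt"))
vocoder.load_model(Path("vocoder/saved_models/pretrained/pretrained.pt"))
synthesizer = Synthesizer(Path("synthesizer/saved_models/mandarin/mandarin.pt"))

wav = encoder.preprocess_wav("reference.wav")       # a few seconds of the target voice
embed = encoder.embed_utterance(wav)                # fixed-size speaker embedding
spec = synthesizer.synthesize_spectrograms(["欢迎使用语音克隆工具"], [embed])[0]
audio = vocoder.infer_waveform(spec)                # mel spectrogram -> waveform
sf.write("cloned.wav", audio.astype(np.float32), synthesizer.sample_rate)
```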

TODO

  • Add demo video
  • Add support for more datasets
  • Upload pretrained model
  • 🙏 Welcome to add more
