source link: https://github.com/NATSpeech/NATSpeech
NATSpeech: A Non-Autoregressive Text-to-Speech Framework
This repo contains the official PyTorch implementations of PortaSpeech and DiffSpeech (see Citation).
Key Features
We implement the following features in this framework:
- Data processing for non-autoregressive Text-to-Speech using Montreal Forced Aligner.
- Convenient and scalable framework for training and inference.
- Simple but efficient random-access dataset implementation.
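To illustrate what a random-access dataset can look like, here is a minimal sketch. It is not taken from this repo: the class names `IndexedWriter` and `IndexedReader`, the `.data`/`.idx` file suffixes, and the `item_name`/`mel_len` fields are all assumptions made for the example. The idea is to append serialized items to one binary file, record the byte offset where each item starts, and seek directly to an item at read time instead of loading the whole file.

```python
import os
import pickle
import tempfile


class IndexedWriter:
    """Hypothetical writer: appends pickled items to one file and tracks offsets."""

    def __init__(self, path):
        self.path = path
        self.data_file = open(path + ".data", "wb")
        self.offsets = [0]  # offsets[i] is the byte position where item i begins

    def add(self, item):
        payload = pickle.dumps(item)
        self.data_file.write(payload)
        self.offsets.append(self.offsets[-1] + len(payload))

    def close(self):
        self.data_file.close()
        # Store the offset table separately so a reader can load it cheaply.
        with open(self.path + ".idx", "wb") as f:
            pickle.dump(self.offsets, f)


class IndexedReader:
    """Hypothetical reader: fetches item i with one seek and one read."""

    def __init__(self, path):
        with open(path + ".idx", "rb") as f:
            self.offsets = pickle.load(f)
        self.data_file = open(path + ".data", "rb")

    def __len__(self):
        return len(self.offsets) - 1

    def __getitem__(self, i):
        if not 0 <= i < len(self):
            raise IndexError(i)
        start, end = self.offsets[i], self.offsets[i + 1]
        self.data_file.seek(start)
        return pickle.loads(self.data_file.read(end - start))

    def close(self):
        self.data_file.close()


def demo():
    """Write three toy items, then read two of them back out of order."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "train")
        writer = IndexedWriter(path)
        for i in range(3):
            writer.add({"item_name": f"utt_{i}", "mel_len": 100 + i})
        writer.close()

        reader = IndexedReader(path)
        picked = (len(reader), reader[2]["item_name"], reader[0]["mel_len"])
        reader.close()
        return picked
```

Because the reader exposes `__len__` and `__getitem__`, an object like this could be wrapped by a map-style PyTorch dataset, with each lookup costing one seek regardless of the dataset size.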
Install Dependencies
    ## We tested on Linux/Ubuntu 18.04.
    ## Install Python 3.6+ first (Anaconda recommended).
    export PYTHONPATH=.
    # build a virtual env (recommended).
    python -m venv venv
    source venv/bin/activate
    # install requirements.
    pip install -U pip
    pip install Cython numpy==1.19.1
    pip install torch==1.9.0 # torch >= 1.9.0 recommended
    pip install -r requirements.txt
    sudo apt install -y sox libsox-fmt-mp3
    bash mfa_usr/install_mfa.sh # install forced alignment tool
Documents
Citation
If you find this useful for your research, please cite the following papers:
- PortaSpeech
    @article{ren2021portaspeech,
      title={PortaSpeech: Portable and High-Quality Generative Text-to-Speech},
      author={Ren, Yi and Liu, Jinglin and Zhao, Zhou},
      journal={Advances in Neural Information Processing Systems},
      volume={34},
      year={2021}
    }
- DiffSpeech
    @article{liu2021diffsinger,
      title={Diffsinger: Singing voice synthesis via shallow diffusion mechanism},
      author={Liu, Jinglin and Li, Chengxi and Ren, Yi and Chen, Feiyang and Liu, Peng and Zhao, Zhou},
      journal={arXiv preprint arXiv:2105.02446},
      volume={2},
      year={2021}
    }
Acknowledgments
Our code is influenced by the following repos: