# PP-TTS-HiFiGAN

## Model description

HiFiGAN is a vocoder widely used in academia and industry in recent years; it converts the spectrograms produced by an acoustic model into high-quality audio. It is built on a generative adversarial network.

## Step 1: Installation

```shell
# Install the requirements
pip3 install -r requirements.txt

# Clone the repo
git clone https://github.com/PaddlePaddle/PaddleSpeech.git
cd PaddleSpeech/examples/csmsc/voc5
```

## Step 2: Preparing datasets

### Download and Extract

Download CSMSC (BZNSYP) from this website and extract it to `./datasets`, so that the dataset ends up in the directory `./datasets/BZNSYP`.
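As a sketch of this step (the `./datasets` path comes from this README; the archive name and format of the CSMSC download are not specified here, so extraction is left as a comment):

```shell
# Create the dataset root the recipe expects (a sketch, not an official script).
mkdir -p ./datasets
# Extract the downloaded CSMSC/BZNSYP archive into ./datasets so that
# ./datasets/BZNSYP/{PhoneLabeling,ProsodyLabeling,Wave} exist afterwards.
```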

### Get MFA Result and Extract

We use MFA results to get phoneme durations for fastspeech2. You can download baker_alignment_tone.tar.gz from here.

Arrange the data so the directory structure looks like this:

```
voc5
├── baker_alignment_tone
├── conf
├── datasets
│   └── BZNSYP
│       ├── PhoneLabeling
│       ├── ProsodyLabeling
│       └── Wave
├── local
└── ...
```

Change the `rootdir` of the dataset in `./local/preprocess.sh` to the dataset path, like this: `--rootdir=./datasets/BZNSYP`
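If you prefer to script this edit, a `sed` substitution works; the snippet below demonstrates it on a stand-in file, since the exact contents of `./local/preprocess.sh` may differ (the original `--rootdir` value shown here is hypothetical):

```shell
# Stand-in copy of the relevant line from local/preprocess.sh
# (the original --rootdir value is hypothetical).
printf '%s\n' 'python3 ${BIN_DIR}/../preprocess.py --rootdir=~/datasets/BZNSYP' > preprocess_demo.sh

# Rewrite whatever follows --rootdir= with the local dataset path.
sed -i 's|--rootdir=[^ ]*|--rootdir=./datasets/BZNSYP|' preprocess_demo.sh

cat preprocess_demo.sh
```

On the real file, replace `preprocess_demo.sh` with `./local/preprocess.sh`.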

### Data preprocessing

```shell
./run.sh --stage 0 --stop-stage 0
```

When it finishes, a `dump` folder is created in the current directory, with the structure listed below.

```
dump
├── dev
│   ├── norm
│   └── raw
├── test
│   ├── norm
│   └── raw
└── train
    ├── norm
    ├── raw
    └── feats_stats.npy
```

## Step 3: Training

### Model Training

You can choose how many GPUs to use for training by changing the `gpus` parameter in the `run.sh` file and the `ngpu` parameter in the `./local/train.sh` file.

Modify the `./local/train.sh` file to run with python3:

```shell
sed -i 's/python /python3 /g' ./local/train.sh
```

A full training run can take a long time; you can reduce the `train_max_steps` parameter in the `./conf/default.yaml` file to shorten it. But in order to get a weight file, you must keep `train_max_steps` larger than the `save_interval_steps` parameter.
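The constraint above can be checked mechanically; the snippet below is a sketch that operates on a stand-in config fragment (the step values are hypothetical, not the recipe's real defaults):

```shell
# Stand-in fragment of ./conf/default.yaml (numbers are hypothetical).
printf 'train_max_steps: 10000\nsave_interval_steps: 5000\n' > default_demo.yaml

# A checkpoint is only written if train_max_steps exceeds save_interval_steps.
max_steps=$(awk '/^train_max_steps:/ {print $2}' default_demo.yaml)
save_steps=$(awk '/^save_interval_steps:/ {print $2}' default_demo.yaml)
[ "$max_steps" -gt "$save_steps" ] && echo "OK: at least one checkpoint will be saved"
```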

```shell
./run.sh --stage 1 --stop-stage 1
```

### Synthesizing

Set the `ckpt_name` parameter in the `run.sh` file to the name of the weight file produced by training.
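For example, this change can also be scripted; the sketch below uses a stand-in copy of the file, and both checkpoint filenames are assumptions (use whatever name your training run actually produced):

```shell
# Stand-in fragment of run.sh with a hypothetical old checkpoint name.
printf 'ckpt_name=snapshot_iter_5000.pdz\n' > run_demo.sh

# Point ckpt_name at the checkpoint produced by your training run
# (the new name here is also hypothetical).
sed -i 's/^ckpt_name=.*/ckpt_name=snapshot_iter_10000.pdz/' run_demo.sh

cat run_demo.sh
```

On the real file, replace `run_demo.sh` with `run.sh`.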

```shell
./run.sh --stage 2 --stop-stage 2
```

## Results

Main results after training for 1000 steps:

| GPUs | avg_ips | adversarial loss | feature matching loss | mel loss | generator loss | real loss | fake loss | discriminator loss |
|------|---------|------------------|-----------------------|----------|----------------|-----------|-----------|--------------------|
| BI V100 × 1 | 15.42 sequences/sec | 6.276 | 0.845 | 0.531 | 31.858 | 0.513 | 0.6289 | 1.142 |

## Reference

HiFiGAN
