Some test samples for Text-To-Speech solutions

source link: https://donghao.org/2022/06/09/some-test-samples-for-text-to-speech-solutions/

I have recently been researching TTS (Text-To-Speech) and found three near state-of-the-art, out-of-the-box solutions: LightSpeech (from Microsoft), FastSpeech2 (partly from Microsoft), and NeMo (from Nvidia).

The test text is a single paragraph:

The Home Depot, Inc. is the world’s largest home improvement retailer based on net sales for fiscal 2021. We offer our customers a wide assortment of building materials, home improvement products, lawn and garden products, décor products, and facilities maintenance, repair and operations products and provide a number of services, including home improvement installation services and tool and equipment rental. As of the end of fiscal 2021, we operated 2,317 stores located throughout the U.S. (including the Commonwealth of Puerto Rico and the territories of the U.S. Virgin Islands and Guam), Canada, and Mexico. The Home Depot stores average approximately 104,000 square feet of enclosed space, with approximately 24,000 additional square feet of outside garden area. We also maintain a network of distribution and fulfillment centers, as well as a number of e-commerce websites in the U.S., Canada and Mexico. When we refer to “The Home Depot,” the “Company,” “we,” “us” or “our” in this report, we are referring to The Home Depot, Inc. and its consolidated subsidiaries.

The output of FastSpeech2:

It has a lot of noise and sounds somewhat metallic.

The output of LightSpeech:

It sounds a little better, more like a human than a robot.

The output of NeMo:

This is the best result of the three solutions.

This test is only a summary of my research and does not show which algorithm is better than the others, since the training process heavily affects the final result. But at least NeMo comes closest to what a product scenario needs.

June 9, 2022 - 7:38 RobinDong machine learning
TTS