Some test samples for Text-To-Speech solutions
source link: https://donghao.org/2022/06/09/some-test-samples-for-text-to-speech-solutions/
I have recently been researching TTS (Text-To-Speech) and noticed three out-of-the-box solutions that are close to state of the art: LightSpeech (from Microsoft), FastSpeech2 (partly from Microsoft), and NeMo (from NVIDIA).
I used the following paragraph as the test text:
The Home Depot, Inc. is the world’s largest home improvement retailer based on net sales for fiscal 2021. We offer our customers a wide assortment of building materials, home improvement products, lawn and garden products, décor products, and facilities maintenance, repair and operations products and provide a number of services, including home improvement installation services and tool and equipment rental. As of the end of fiscal 2021, we operated 2,317 stores located throughout the U.S. (including the Commonwealth of Puerto Rico and the territories of the U.S. Virgin Islands and Guam), Canada, and Mexico. The Home Depot stores average approximately 104,000 square feet of enclosed space, with approximately 24,000 additional square feet of outside garden area. We also maintain a network of distribution and fulfillment centers, as well as a number of e-commerce websites in the U.S., Canada and Mexico. When we refer to “The Home Depot,” the “Company,” “we,” “us” or “our” in this report, we are referring to The Home Depot, Inc. and its consolidated subsidiaries.
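The paragraph above is a demanding input because it mixes abbreviations ("Inc.", "U.S."), numbers and nested punctuation. TTS models are often fed one sentence at a time, so here is a minimal Python sketch of how such a paragraph could be split before synthesis. The post does not say how the text was passed to each model, so the splitting step, the function name `split_sentences` and the abbreviation list are all illustrative assumptions, not part of LightSpeech, FastSpeech2 or NeMo.

```python
import re

# Abbreviations from the test paragraph whose trailing period does not end a sentence.
# Illustrative assumption: this list is hand-written, not taken from any TTS toolkit.
ABBREVIATIONS = ("Inc.", "U.S.")


def split_sentences(text: str) -> list[str]:
    """Split text at periods followed by whitespace, skipping known abbreviations."""
    sentences = []
    start = 0
    for match in re.finditer(r"\.\s+", text):
        # Candidate sentence runs from the last cut up to and including this period.
        candidate = text[start:match.start() + 1]
        if candidate.endswith(ABBREVIATIONS):
            continue  # the period belongs to an abbreviation, keep scanning
        sentences.append(candidate.strip())
        start = match.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences


sample = (
    "The Home Depot, Inc. is the world’s largest home improvement retailer "
    "based on net sales for fiscal 2021. We also maintain a network of "
    "distribution and fulfillment centers, as well as a number of e-commerce "
    "websites in the U.S., Canada and Mexico."
)

for sentence in split_sentences(sample):
    print(sentence)
```

This keeps "Inc." inside the first sentence instead of cutting after it. A known limitation of the sketch is that a sentence that really ends with one of the listed abbreviations will be merged with the next one.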
The output of FastSpeech2:
It has a lot of noise and sounds metallic.
The output of LightSpeech:
It sounds a little better, more like a human than a robot.
The output of NeMo:
This is the best result of the three solutions.
This test is only a summary of my research and does not show which algorithm is better than the others, since the training process heavily affects the final result. But at least NeMo is the closest to being usable in production.
June 9, 2022 - 7:38
RobinDong
machine learning
TTS