Evaluation of Corpus Based TTS Built on Common Databases
Indian Language TTS Development Challenge, 2020
( hema@cse.iitm.ac.in, pranaw@cdac.in )
As a part of the National Translation Mission funded by MeitY, Govt of India, IIT Madras and CDAC Mumbai are jointly organising a challenge for the "Development of Text to Speech Synthesis (TTS) for Hindi and Tamil". It aims to help and encourage the advancement of TTS in Indian languages. The basic challenge is to take the released speech data, build TTS voices, and share the voice in web API form for evaluation. The output from each synthesizer is then evaluated through extensive listening tests.
Speech synthesis, also known as text-to-speech (TTS), has attracted increasing attention in recent times. Recent advances in speech synthesis have shown that TTS systems can produce very natural-sounding speech from text. All around the world, TTS systems are built using various approaches. In India, the speech research community has grown significantly in recent years, and one can witness the current speech revolution. It is necessary to understand and compare the various research techniques used to build Indian language TTS systems. The primary objective of this challenge is to understand and compare the various approaches to building TTS systems, and at the same time to identify efficient speech groups in the country.
Good-quality, studio-recorded speech data with very accurate transcripts is required to build a high-quality TTS system. However, when it comes to Indian languages, not everyone, especially academic institutions and startups, has access to these resources. As a part of this challenge, we will be releasing good-quality training data for Hindi and Tamil. Everyone who participates in this challenge will then be free to use this data for research purposes.
About 5 hours of speech data for each of Hindi Male, Hindi Female, Tamil Male and Tamil Female, recorded by native professional speakers in high-quality studio environments, will be provided along with the corresponding text in UTF-8 format. No other information, such as segment labels, will be provided. Participants may build one voice for one or more subtasks and submit it for evaluation in web API form (as mentioned below). The subtasks are numbered as follows:
After building a system, participants will have to host it on a server and provide us with the URL of a GUI where text can be synthesised. The organisers will use this URL to prepare test data for the listening test. The URL may be public or private (accessible only to the organisers).
It is not permissible for a single participant to submit multiple entries to any subtask (mentioned above), because the listening test may otherwise become unmanageable. This rule may be relaxed in the event of a small number of participants.
Along with the built system, each participant will have to submit a write-up (one or two pages) about the entry, describing the approach, technology, data, challenges faced, features, observations, etc.
The organisers will conduct a DMOS (“degradation” or “differential” MOS) listening test to evaluate the submitted systems.
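As a rough illustration of how a DMOS figure could be obtained from listener ratings, here is a minimal sketch. It assumes a 5-point degradation scale (5 = degradation inaudible, 1 = very annoying) and a normal-approximation 95% confidence interval. The announcement does not specify the scale, the aggregation, or the statistics the organisers will use, and the system names and ratings below are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def dmos(ratings):
    """Return (mean score, 95% CI half-width) for one system.

    Assumes ratings on a 1-5 degradation scale; the interval uses a
    normal approximation (1.96 * standard error), which is an
    assumption and not the organisers' stated procedure.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    score = mean(ratings)
    if len(ratings) > 1:
        half_width = 1.96 * stdev(ratings) / sqrt(len(ratings))
    else:
        half_width = 0.0
    return score, half_width


# Hypothetical ratings from five listeners for two hypothetical systems.
ratings_by_system = {
    "system_A": [4, 5, 4, 3, 4],
    "system_B": [3, 3, 4, 2, 3],
}

scores = {name: dmos(r) for name, r in ratings_by_system.items()}

for name, (score, half_width) in scores.items():
    print(f"{name}: DMOS = {score:.2f} +/- {half_width:.2f}")
```

With so few ratings the interval is wide; a real listening test would pool many listeners and sentences per system before comparing scores.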
Those who perform well in this challenge may get the following opportunity:
Interested parties should register as soon as possible using the link below:
REGISTER NOW
You need to provide the following information in the form available at the above link:
There is no registration fee.
Participating members are expected to have some experience of building TTS systems in Indian languages. After registration, please send sample output (4 sentences) from a synthesizer already built by your group, along with the corresponding text, to pranaw@cdac.in with cc to hema@cse.iitm.ac.in. Please also mention the name of the language.
Your registration will be confirmed based on an evaluation of the submitted synthesizer output.
Date / Month | Event |
---|---|
27th July 2020 | Announcement of challenge |
15th August 2020 | Last date of registration |
As soon as registration is confirmed | Database release |
30th September 2020 | Submission of system for evaluation (by midnight PDT) |
October 2020 | Evaluation of system |
November 2020 | Release of results |
The license for the released data will be shared with the participants. Data will be released to each participant once the appropriate license has been agreed to.
Development tools, useful scripts, and other resources helpful in developing Indian language TTS systems are available at the website below:
https://www.iitm.ac.in/donlab/tts/
These resources are provided for reference only. Participants are free to use any tools or technologies to build their voices.
This is a challenge designed to answer scientific questions, not a competition. Therefore, we rely on your honesty in preparing your entry.
For further information please contact pranaw@cdac.in with cc to hema@cse.iitm.ac.in.
Pranaw Kumar (Mob) : +91-7303226768