Leadership Positions Based on Organized NLP Challenges

Under the 'National Language Translation Mission' project funded by MeitY, various institutes are organizing a series of challenges on Automatic Speech Recognition, Text to Speech, Machine Translation, etc. The challenges and hackathons are listed under the Challenges section. The results of each challenge are listed here as the leaders of that technical challenge.


As part of the National Language Translation Mission (funded by MeitY, Govt. of India), IIT Madras and CDAC Mumbai jointly organised the "Lip-Syncing in Speech-to-Speech Translation" challenge. The aim of the challenge was to help and encourage the advancement of speech-to-speech translation of videos in Indian languages. The basic task was to take an input video (in English), create an output video in Hindi or Tamil, and lip-sync the output video. For more details, please visit Lip-Sync Challenge, 2021.

Lip-Sync Challenge, 2021

Tamil Task 1

Team Name | Lip-Sync Quality | Fluency Consistency | Semantic Consistency | Overall User Experience | Final Score
TeamCSRL (CS RESEARCH LABS) | 3.85 | 4.03 | 4.14 | 4.16 | 4.039
Baseline_system (IIT Madras) | 3.09 | 2.91 | 2.90 | 3.16 | 3.01

Hindi Task 1

Team Name | Lip-Sync Quality | Fluency Consistency | Semantic Consistency | Overall User Experience | Final Score
CNLP-NITS (NIT Silchar) | 3.86 | 3.63 | 3.94 | 3.94 | 3.84
Baseline_system (IIT Madras) | 3.49 | 3.52 | 3.87 | 3.83 | 3.68
TeamCSRL (CS RESEARCH LABS) | 3.08 | 3.32 | 3.92 | 3.51 | 3.46

Hindi Task 2

Team Name | Lip-Sync Quality | Fluency Consistency | Semantic Consistency | Overall User Experience | Final Score
CNLP-NITS (NIT Silchar) | 3.37 | 3.29 | 3.4 | 3.38 | 3.36
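
In the Lip-Sync tables above, each Final Score is close to the unweighted mean of the four criterion scores. The organisers' exact formula is not stated on this page, so the following Python sketch is only an arithmetic check on the published figures, not a description of the official scoring. The dictionary name `ROWS` and the helper `gap` are introduced here purely for illustration.

```python
from statistics import mean

# (four criterion scores, reported Final Score), copied from the tables above.
# Order of criteria: lip-sync quality, fluency, semantic consistency, user experience.
ROWS = {
    "Tamil Task 1 / TeamCSRL": ([3.85, 4.03, 4.14, 4.16], 4.039),
    "Tamil Task 1 / Baseline_system": ([3.09, 2.91, 2.90, 3.16], 3.01),
    "Hindi Task 1 / CNLP-NITS": ([3.86, 3.63, 3.94, 3.94], 3.84),
    "Hindi Task 1 / Baseline_system": ([3.49, 3.52, 3.87, 3.83], 3.68),
    "Hindi Task 1 / TeamCSRL": ([3.08, 3.32, 3.92, 3.51], 3.46),
    "Hindi Task 2 / CNLP-NITS": ([3.37, 3.29, 3.4, 3.38], 3.36),
}


def gap(scores, reported):
    """Absolute difference between the plain mean and the reported Final Score."""
    return abs(mean(scores) - reported)


for name, (scores, reported) in ROWS.items():
    print(f"{name}: mean={mean(scores):.3f} reported={reported} gap={gap(scores, reported):.3f}")
```

For every row the gap is below 0.01. The small differences are presumably due to rounding of the per-criterion scores before publication, but that is an assumption.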

Speech Lab, IIT Madras conducted three ASR challenges, namely 1) the Hindi ASR Challenge, 2) the English ASR Challenge and 3) the Indian Language (English, Hindi, Tamil) ASR Challenge, at different times between July 2020 and July 2021. These challenges received an overwhelming response from industry and academia. Further details about these challenges can be found here. After an extensive evaluation of all submissions received, the final leaderboards of the ASR challenges are given below.
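
The ASR leaderboards below rank systems by word error rate (WER). As a reminder of what that figure measures, here is a minimal Python sketch of the standard word-level edit-distance definition: substitutions, deletions and insertions divided by the number of reference words. It is not the organisers' scoring script. The function name `word_error_rate` is an assumption made for illustration, and any text normalisation the challenges applied (casing, punctuation, numerals) is not described on this page and is not modelled.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a fraction: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words (rolling-row Levenshtein).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substitution and one deletion against a four-word reference.
wer = word_error_rate("a b c d", "a x c")
print(f"WER = {wer:.2f} ({100 * wer:.1f} %)")
```

The function returns a fraction; multiply by 100 to compare with the "WER %" columns in the ASR tables. The WER columns in the TTS intelligibility tables appear to be reported as fractions.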

English, Hindi and Tamil ASR Leaderboard

English Evaluation Set

Open Task

Position | Team name | Best Score (WER %) | # Submissions | Best Approach
1 | Sayint_Techm | 4.78 | 4 | Wav2Vec2 kenlm
2 | Armsoftech.air_b | 4.91 | 3 | Wav2vec2 (XLSR-53) with 6gram LM
3 | TCOE_IITMRP | 5.02 | 6 | KALDI + DNN + noise + 6 gram LM
4 | SIPLAB_IITH | 5.28 | 3 | wav2vec2 xlsr with lm
5 | Ekstep | 6.02 | 4 | Wav2vec2 with LM
6 | Armsoftech.air_a | 7.22 | 2 | kaldi chain 5 gram
7 | Dheeyantra | 13.72 | 1 | Kaldi Chain

Closed Task

Position | Team name | Best Score (WER %) | # Submissions | Best Approach
1 | Sayint_Techm | 4.51 | 4 | Kaldi Chain Model
2 | ClusterDev_Desh_Keyboards | 4.64 | 4 | Kaldi Chain with RNNLM
3 | CDAC_Mumbai | 4.78 | 10 | kaldi chain TDNN with 5 gram lm
4 | yellow.ai | 4.78 | 1 | kaldi tdnn flatstart, 4gram kenlm
5 | Speech_Lab_IIIT_Hyderabad | 4.79 | 1 | Basic_Chain_V0_Kaldi
6 | JHU | 4.83 | 4 | CNN+TDNNF LF-MMI + 7gram LM + Full Train text + RNNLM rescoring
7 | SIPLAB_IITH | 4.96 | 5 | Kaldi Chain Model + 5gram lm
8 | Armsoftech.air_a | 4.97 | 1 | kaldi chain RNNLM
9 | TCOE_IITMRP | 5.61 | 9 | KALDI + DNN 5 gram LM
10 | Kaizen_Secure_Voiz | 5.69 | 10 | Kaldi Chain + RNNLM(Tuned)
11 | Uniphore | 5.74 | 1 | Kaldi model-Used MFCC for features, IRSTLM, Mono, Tri1, tri2, tri3, Chain Model
12 | IIIT_Dharwad | 5.82 | 1 | kaldi chain
13 | SMT_Lab_IIT_Madras | 5.89 | 8 | TDNN - Kaldi Model
14 | Ekstep | 7.05 | 3 | wav2vec2 without LM

Hindi Evaluation Set

Open Task

Position | Team name | Best Score (WER %) | # Submissions | Best Approach
1 | TCOE_IITMRP | 3.11 | 6 | Kaldi TDNN 20gm LM
2 | CDAC_Mumbai | 3.42 | 10 | kaldi chain TDNN with 8 gram lm
3 | Sayint_Techm | 3.44 | 5 | Wav2Vec2 kenlm
4 | SIPLAB_IITH | 3.57 | 4 | Pretrained wav2vec2-xlsr model with 5gram lm
5 | Ekstep | 3.78 | 5 | Wav2vec2 with LM
6 | Dheeyantra | 4.03 | 5 | Kaldi Chain + RNNLM
7 | Armsoftech.air_b | 4.55 | 3 | Wav2vec2 (XLSR-53) with 5gram LM
8 | SMT_Lab_IIT_Madras | 7.05 | 1 | E2E system

Closed Task

Position | Team name | Best Score (WER %) | # Submissions | Best Approach
1 | TCOE_IITMRP | 3.11 | 8 | Kaldi TDNN 20gm LM
2 | JHU | 3.15 | 3 | CNN+TDNNF LF-MMI + 7gram LM + Full Train text + RNNLM rescoring
3 | CDAC_Mumbai | 3.4 | 10 | kaldi chain TDNN with 8 gram lm
4 | ClusterDev_Desh_Keyboards | 3.43 | 7 | Kaldi Chain + 7gram LM + RNNLM
5 | Sayint_Techm | 3.47 | 5 | Kaldi Chain Model
6 | SIPLAB_IITH | 3.68 | 3 | Kaldi Chain Model
7 | SMT_Lab_IIT_Madras | 3.69 | 10 | E2E system
8 | Armsoftech.air_a | 3.73 | 1 | Kaldi Chain 5 Gram SRILM
9 | IIIT_Dharwad | 3.97 | 1 | kaldi chain
10 | Speech_Lab_IIIT_Hyderabad | 4.39 | 2 | Basic_Chain_V0_Kaldi
11 | Uniphore | 4.95 | 1 | Kaldi model-Used MFCC for features, IRSTLM, Mono, Tri1, tri2, tri3, Chain Model
12 | Kaizen_Secure_Voiz | 5.85 | 9 | Kaldi Chain + RNNLM(Tuned)
13 | Ekstep | 6.16 | 2 | Wav2vec2 with 5 gram LM
14 | CDAC_Pune_a | 6.46 | 2 | nnet3

Tamil Evaluation Set

Open Task

Position | Team name | Best Score (WER %) | # Submissions | Best Approach
1 | Sayint_Techm | 4.93 | 5 | Kaldi Chain Model + 4 gram LM
2 | ClusterDev_Desh_Keyboards | 5.3 | 3 | Kaldi Chain with RNNLM (ext data)
3 | SIPLAB_IITH | 5.8 | 1 | Pretrained wav2vec2-xlsr model with 5gram lm
4 | Armsoftech.air_b | 5.84 | 4 | Wav2vec2 (XLSR-53) with 4gram kenLM
5 | Ekstep | 6.05 | 5 | Wav2vec2 with LM
6 | TCOE_IITMRP | 6.16 | 5 | Kaldi + DNN 7 gram LM with noise
7 | Dheeyantra | 6.69 | 11 | Kaldi Chain + RNNLM
8 | SMT_Lab_IIT_Madras | 12.54 | 1 | E2E system

Closed Task

Position | Team name | Best Score (WER %) | # Submissions | Best Approach
1 | Sayint_Techm | 5.13 | 4 | Kaldi Chain Model
2 | JHU | 5.2 | 2 | CNN+TDNNF LF-MMI + 3 gram LM + Full Train text
3 | Armsoftech.air_a | 5.21 | 2 | Kaldi chain_5gramLM
4 | IIIT_Dharwad | 5.21 | 1 | kaldi chain
5 | ClusterDev_Desh_Keyboards | 5.41 | 3 | Kaldi Chain + 7gram LM + RNNLM
6 | Speech_Lab_IIIT_Hyderabad | 5.64 | 1 | Basic_Chain_V0_Kaldi
7 | SMT_Lab_IIT_Madras | 5.69 | 6 | TDNN - Kaldi Model tri3
8 | TCOE_IITMRP | 5.73 | 9 | Kaldi + DNN 4 gram SRILM
9 | CDAC_Mumbai | 5.86 | 10 | kaldi chain TDNN with 9 gram lm
10 | SIPLAB_IITH | 6.13 | 3 | Kaldi Chain Model
11 | Kaizen_Secure_Voiz | 6.43 | 6 | Kaldi Chain + LM(Tuned)
12 | Ekstep | 6.45 | 3 | Wav2vec2
13 | CDAC_Pune_a | 7.13 | 1 | nnet3
14 | Uniphore | 8.31 | 1 | Kaldi 3 gram model

English ASR Leaderboard

IITM Evaluation Set

Open Task

Position | Team name | Best Score (WER %) | # Submissions | Best Approach
1 | BUT | 4.98 | 4 | Mult_v5_graph.my_v0
2 | Scribetech | 5.25 | 6 | Kaldi Chain + Transfer Learning + No Punctuations + RNNLM
3 | IITB_a | 5.27 | 1 | chain model fine tuned on dev with RNNLM
4 | Reliiance_Jio_AICOE | 5.57 | 2 | TDNN chain + 4gram LM
5 | CDOT | 5.64 | 5 | extra data for language model 4 gram, lmwt adjusted
6 | IIT_Hyderabad | 5.71 | 2 | Kaldi_Chain with CPC_features extracted from pretrained models
7 | NITAP_Cognizyr | 6.15 | 4 | transfer learning from TDNN indian languages model with 62 phonemes applied on language model
8 | Ekstep_Thoughtworks | 6.79 | 7 | Wav2vec2 Pretrained on English Training Data + Extra data + Finetuned on the training data + 5 gram LM from training and extra data

Closed Task

Position | Team name | Best Score (WER %) | # Submissions | Best Approach
1 | CDAC_Pune_b | 5.27 | 1 | Kaldi Chain
2 | IITB_a | 5.33 | 1 | chain model with RNNLM
3 | Samsung_R_and_D_Bangalore | 5.39 | 8 | chain model + ivector + MoE Layer + n-gram LM
4 | BUT | 5.41 | 2 | Mono.graph.my_v1
5 | IIT_Hyderabad | 5.57 | 1 | Kaldi_Chain + RNNLM
6 | CDOT | 5.58 | 3 | check_baseline
7 | Reliiance_Jio_AICOE | 5.64 | 3 | TDNN chain + RNNLM
8 | Sayint_Zen3_Info_Solutions | 5.69 | 6 | Kaldi Chain Subword Model
9 | IITB_b | 6.1 | 3 | Kaldi Chain Model
10 | Armsoftech_a | 6.13 | 6 | Kaldi Chain model with 4 gram SRILM
11 | IIIT_Dharwad | 6.18 | 6 | RNNLM
12 | Ekstep_Thoughtworks | 6.73 | 8 | Wav2vec2 Pretrained on English Training Data + Finetuned on the training data + 5 gram LM on training data
13 | IIT_GUWAHATI | 8.31 | 1 | DNN with 4gram
14 | Armsoftech_b | 18.46 | 1 | Deepspeech with KenLM and Deepspeech default data augmentation

NPTEL Evaluation Set

Open Task

Position | Team name | Best Score (WER %) | # Submissions | Best Approach
1 | Ekstep_Thoughtworks | 5.84 | 3 | Wav2vec2 Pretrained on English Training Data + Extra data + Finetuned on the training data + 5 gram LM from training and extra data
2 | BUT | 9.78 | 7 | TDNN_V2
3 | Scribetech | 10.12 | 6 | Kaldi Chain + Transfer Learning + RNNLM (Train+Dev)
4 | IIT_Hyderabad | 10.85 | 2 | Kaldi_Chain with CPC_features extracted from pretrained models + RNNLM
5 | CDOT | 11.6 | 5 | bigger lm, baseline am, 4 gram model
6 | NITAP_Cognizyr | 13.33 | 10 | Transfer Learning from 62 phones with extended vocabulary related to domain
7 | Reliiance_Jio_AICOE | 13.4 | 3 | TDNN chain + 4gram LM
8 | CDAC_Pune_a | 15.76 | 3 | nnet3

Closed Task

Position | Team name | Best Score (WER %) | # Submissions | Best Approach
1 | Ekstep_Thoughtworks | 5.79 | 5 | Wav2vec2 Pretrained on English Training Data + Finetuned on the training data + no LM
2 | Samsung_R_and_D_Bangalore | 8.39 | 5 | Espnet transformer with LSTM LM, Data augmentation using speed and volume perturbations
3 | Sayint_Zen3_Info_Solutions | 8.97 | 4 | Kaldi Chain Subword Model
4 | IIT_Hyderabad | 10.32 | 1 | Kaldi_Chain + RNNLM
5 | BUT | 10.33 | 1 | Mono.graph.my_v1.rnnlm
6 | Armsoftech_a | 11.27 | 5 | Kaldi Chain model with RNNLM
7 | IIIT_Dharwad | 11.27 | 7 | RNNLM
8 | CDAC_Pune_b | 11.46 | 5 | Kaldi chain
9 | CDOT | 11.48 | 2 | baseline am, baseline nptel lm
10 | IITB_b | 11.89 | 4 | Kaldi Chain model with RNNLM rescoring
11 | Reliiance_Jio_AICOE | 13.54 | 1 | TDNN chain + 4gram LM
12 | Armsoftech_b | 17.61 | 2 | Deepspeech with KenLM and Deepspeech default data augmentation
13 | IIT_GUWAHATI | 58.37 | 2 | DNN with 4gram

Hindi ASR Leaderboard

Open Task

Position | Team name | Best Score (WER %) | # Submissions | Best Approach
1 | IITH | 7.04 | 6 | Kaldi Chain Model + Lattice combination of 4-gram lattices
2 | IITB-a | 7.12 | 1 | Kaldi Chain Model + LM(ext text)
3 | SRIB | 7.14 | 2 | Kaldi Chain Model + 4-gram LM
4 | Vernacular.ai | 7.15 | 8 | Kaldi Chain Model + RNNLM (ext data)
5 | Scribetech | 7.76 | 6 | Kaldi Chain - Pretrained AM + LM(Ext data)
6 | CDAC-Pune | 8.35 | 2 | Kaldi Chain Model - Fine tuned AM/LM with train/dev
7 | IITB-b | 9.48 | 3 | Model fine tuned on dev
8 | IRL-Bangalore | 10.27 | 2 | E2E Model

Closed Task

Position | Team name | Best Score (WER %) | # Submissions | Best Approach
1 | ZAPR Media Labs | 6.42 | 7 | Kaldi Chain Model, 5-gram SRILM
2 | SRIB | 7.08 | 7 | Kaldi Chain TDNN Fusion Model + 4gram LM
3 | IITB-a | 7.12 | 2 | Kaldi Chain Model + RNNLM
4 | IITH | 7.33 | 3 | Kaldi Chain Model + RNNLM
5 | IITB-b | 7.47 | 3 | Kaldi Chain Model + RNNLM
6 | Vernacular.ai | 7.55 | 1 | Kaldi Chain Model + 4gram LM
7 | Armsoftech.air | 7.71 | 5 | Kaldi Chain Model 4 gram SRILM
8 | CDAC-Pune | 8.4 | 1 | Kaldi Chain Model
9 | IITG | 10.26 | 1 | MFCC + Pitch, Kaldi DNN-HMM AM + 4gram LM
10 | Zen3 Info Solutions | 11.95 | 1 | Kaldi TDNN E2E + SentencePiece Lexicon
11 | IRL-Bangalore | 13.84 | 1 | E2E system
12 | NIT-Kurukshetra | 37.81 | 4 | ESPnet Hybrid CTC/attention E2E (BLSTM) ASR
13 | IISc | 42.22 | 2 | E2E: wav2letter

IIT Madras and C-DAC Mumbai jointly organized a challenge for the "Development of Text to Speech Synthesis (TTS) for Hindi and Tamil" during Aug 2020 - Nov 2020. The challenge aimed to help and encourage the advancement of TTS in Indian languages. The basic task was to take the released speech data, build TTS voices, and share each voice as a web API for evaluation. The challenge received an overwhelming response from industry and academia. Further details about this challenge can be found here. After an extensive evaluation of all submissions received, the leaderboards for Hindi TTS and Tamil TTS are given below.

TTS Leaderboard

Naturalness (DMOS) results for Hindi Male Voices

S.N. | Company Name | Voice Code | DMOS
1 | ZAPR Media Labs | B_Hindi_Male | 4.215236136
2 | Deterministic Algorithms Lab | F_Hindi_Male | 4.091949113
3 | Indian TTS | C_Hindi_Male | 4.048888889
4 | Reverie Language Technologies | A_Hindi_Male | 3.948072598
5 | Sayint (Zen3 Tech) | D_Hindi_Male | 3.15380117

Intelligibility (WER) results for Hindi Male Voices

S.N. | Company Name | Voice Code | WER
1 | ZAPR Media Labs | B_Hindi_Male | 0.04741883
2 | Sayint (Zen3 Tech) | D_Hindi_Male | 0.04741883
3 | Reverie Language Technologies | A_Hindi_Male | 0.088952349
4 | Indian TTS | C_Hindi_Male | 0.121408819
5 | Deterministic Algorithms Lab | F_Hindi_Male | 0.20003522

Naturalness (DMOS) results for Hindi Female Voices

S.N. | Company Name | Voice Code | DMOS
1 | Reverie Language Technologies | A_Hindi_Female | 4.4733652889
2 | Deterministic Algorithms Lab | F_Hindi_Female | 3.8318062118
3 | Indian TTS | C_Hindi_Female | 3.4735184618
4 | Sayint (Zen3 Tech) | D_Hindi_Female | 3.2522941747

Intelligibility (WER) results for Hindi Female Voices

S.N. | Company Name | Voice Code | WER
1 | Reverie Language Technologies | A_Hindi_Female | 0.0614024416
2 | Indian TTS | C_Hindi_Female | 0.096319142
3 | Deterministic Algorithms Lab | F_Hindi_Female | 0.1830890904
4 | Sayint (Zen3 Tech) | D_Hindi_Female | 0.1861408978

Naturalness (DMOS) results for Tamil Female Voices

S.N. | Company Name | Voice Code | DMOS
1 | Reverie Language Technologies | A_Tamil_Female | 4.8941383517
2 | IIT Hyderabad | E_Tamil_Female | 3.8626112448
3 | Sayint (Zen3 Tech) | D_Tamil_Female | 3.6411514453
4 | Indian TTS | C_Tamil_Female | 3.2599848655
5 | Deterministic Algorithms Lab | F_Tamil_Female | 2.8766188515

Intelligibility (WER) results for Tamil Female Voices

S.N. | Company Name | Voice Code | WER
1 | Sayint (Zen3 Tech) | D_Tamil_Female | 0.0814777328
2 | Reverie Language Technologies | A_Tamil_Female | 0.1251312666
3 | IIT Hyderabad | E_Tamil_Female | 0.217042004
4 | Indian TTS | C_Tamil_Female | 0.2271396761
5 | Deterministic Algorithms Lab | F_Tamil_Female | 0.5663039811

Naturalness (DMOS) results for Tamil Male Voices

S.N. | Company Name | Voice Code | DMOS
1 | Karunya Institute of Technology and Sciences | G_Tamil_Male | 4.330030958
2 | Reverie Language Technologies | A_Tamil_Male | 3.812270705
3 | Deterministic Algorithms Lab | F_Tamil_Male | 3.27754386

Intelligibility (WER) results for Tamil Male Voices

S.N. | Company Name | Voice Code | WER
1 | Karunya Institute of Technology and Sciences | G_Tamil_Male | 0.080984968
2 | Deterministic Algorithms Lab | F_Tamil_Male | 0.091819613
3 | Reverie Language Technologies | A_Tamil_Male | 0.180935054

CFILT, IIT Bombay organized the "English-Marathi Parallel Corpus Creation for Machine Translation (MPaCT)" challenge. The challenge aimed to help and encourage the advancement of machine translation technology in Indian languages. Many startups showed interest and participated in the challenge. After an extensive evaluation of all submissions received, the leaderboard of the MPaCT challenge is given below.

MPaCT Leaderboard

Position | Vendor ID | Name | Organization
1 | V6 | Saurabh Bhunje | Techliebe
2 | V2 | Gajanan Rane | Shri Samarth Krupa Language Solutions
3 | V7 | Sowmya A | DesiCrew Solutions Pvt. Ltd.
4 | V4 | Manoj Gupta | Zibanka Media Services Pvt. Ltd.
5 | V1 | Bhuvana Krishnamoorthy | Bow and Baan
6 | V3 | Himanshu Sharma | Devnagri
7 | V5 | Shankar Prasad | Megdap Innovation Labs Pvt. Ltd
8 | V8 | Vinod Rathi | Crescendo Transcription Pvt. Ltd