Resources

Here, the term "Resources" refers to speech or language data and descriptions in machine-readable form, used to build, improve, or evaluate natural language and speech algorithms or systems.


Kashmiri Raw Speech Corpus

Dataset Description 28:10:07 Hours | 18 GB speech data | 150 Speakers | 16,380 Audio segments | 48 kHz | 16 bit wa..

Sample Download | size: 1.6MB | type: zip

Added on : 26 Aug 2021

Gujarati Raw Speech Corpus (Mono Recordings)

Dataset Description 64:44:02 Hours | 7.1 GB | 233 Speakers | 26,223 Audio Segments | 16 kHz | 16 bit wav. Gujarati is one of ..

Sample Download | size: 380.7KB | type: zip

Added on : 26 Aug 2021

Gujarati Raw Speech Corpus

Dataset Description 57:17:08 Hours | 37 GB | 204 Speakers | 25,712 Audio Segments | 48 kHz | 16 bit wav. Gujarati is one of the ma..

Sample Download | size: 2.3MB | type: zip

Added on : 26 Aug 2021

Dogri Raw Speech Corpus

Dataset Description 17:10:26 Hours | 11 GB speech data | 61 Speakers | 12,036 Audio segments | 48 kHz | 16..

Sample Download | size: 2MB | type: zip

Added on : 26 Aug 2021

Assamese Raw Speech Corpus

Dataset Description 54:21:12 Hours | 32.5 GB | 304 Speakers | 37,570 Audio Segments | 48 kHz | 16 bit wav..

Sample Download | size: 1.3MB | type: zip

Added on : 26 Aug 2021

Tamil ASR Challenge Data (ASR Speech Data released under 3rd Challenge) - NLTMP

The data set comprises Tamil read and conversational speech data along with the corresponding transcriptions. This speech data was collected by Spe..

Available Under License:
Research  

Added on : 26 Jul 2021

Indian English ASR Challenge Data (ASR Speech Data) - NLTM Pilot

The data set comprises Indian English read speech and lecture speech data along with the corresponding transcriptions. The read speech covers genre..

Available Under License:
Research  

Sample Download | size: 23.7MB | type: tar

Added on : 10 Jun 2021

English-Urdu Tourism Set - II Parallel Text corpus-EILMT

English-Urdu Parallel Tourism Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) consortium. The core vo..

Available Under License:
Commercial   Research  

Sample Download | size: 11.7KB | type: zip

Added on : 20 Aug 2020

English-Urdu Tourism Set - I Parallel Text corpus-EILMT

English-Urdu Parallel Tourism Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) consortium. The core vo..

Available Under License:
Commercial   Research  

Sample Download | size: 23.2KB | type: zip

Added on : 20 Aug 2020

English-Urdu Health Parallel Text corpus-EILMT

English-Urdu Parallel Health Text corpus is developed in Unicode, under English to Indian Language Machine Translation (EILMT) Consortium. This corpus..

Available Under License:
Commercial   Research  

Sample Download | size: 28.4KB | type: zip

Added on : 20 Aug 2020

English-Urdu Agriculture Parallel Text corpus-EILMT

English-Urdu Parallel Agriculture Text corpus is developed in Unicode, under English to Indian Language Machine Translation (EILMT) Consortium. This c..

Available Under License:
Commercial   Research  

Sample Download | size: 514.4KB | type: zip

Added on : 20 Aug 2020

English-Tamil Tourism Set - II Parallel Text corpus-EILMT

English-Tamil Parallel Tourism Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) consortium. The core v..

Available Under License:
Commercial   Research  

Sample Download | size: 24.1KB | type: zip

Added on : 17 Aug 2020

English-Tamil Health Parallel Text corpus-EILMT

English-Tamil Parallel Health Text corpus is developed in Unicode under English to Indian Language Machine Translation (EILMT) Consortium. This corpus..

Available Under License:
Commercial   Research  

Sample Download | size: 23.3KB | type: zip

Added on : 17 Aug 2020

English-Tamil Agriculture Parallel Text corpus-EILMT

English-Tamil Parallel Agriculture Text corpus is developed in Unicode, under English to Indian Language Machine Translation (EILMT) Consortium. This ..

Available Under License:
Commercial   Research  

Sample Download | size: 32.7KB | type: zip

Added on : 17 Aug 2020

English-Odia Tourism Set - II Parallel Text corpus-EILMT

English-Odia Parallel Tourism Text corpus is developed in Unicode, under English to Indian Language Machine Translation (EILMT) consortium. The core v..

Available Under License:
Commercial   Research  

Sample Download | size: 25.9KB | type: zip

Added on : 04 Aug 2020

Disclaimer: The information provided on this page has been gathered from various sources. Please write to us at nplt_support[at]cdac[dot]in if you would like to suggest an update.