NPLT Other Repositories

CIIL Mysore Repository

List of linguistic resources developed by Linguistic Data Consortium for Indian Languages (LDC-IL), CIIL Mysore.

Repository Last Crawled Date: 26/08/2021

Manipuri Raw Speech Corpus

Sample Download | size: 0B | type: zip

Added on : 29 Jul 2019

Tags: Manipuri Raw Speech Corpus

Malayalam Raw Speech Corpus

164 hours; 43670 segments; 458 speakers. Malayalam is the official language of Kerala and the Laccadive Islands. It belongs to the Dravidian language ..

Sample Download | size: 0B | type: zip

Added on : 29 Jul 2019

Tags: Malayalam Raw Speech Corpus

Maithili Raw Speech Corpus

LDC-IL Maithili raw speech data of 72:02:12 (hh:mm:ss). The LDC-IL Maithili Speech data set consists of different typ..

Sample Download | size: 0B | type: zip

Added on : 29 Jul 2019

Tags: Maithili Raw Speech Corpus

Konkani Raw Speech Corpus

156:37:51 hours of 100 Gigabytes speech data | 503 Speakers | 72,938 Audio segments | 48 kHz | 16 bit wav. Konkani belonging to the Indo-European family..

Sample Download | size: 0B | type: zip

Added on : 29 Jul 2019

Tags: Konkani Raw Speech Corpus

Kannada Raw Speech Corpus

179:32:52 hours of 115 Gigabytes speech data | 656 Speakers | 99109 Audio segments | 48 kHz | 16 bit wav. Kannada is one of the ancient Indian languages..

Sample Download | size: 0B | type: zip

Added on : 29 Jul 2019

Tags: Kannada Raw Speech Corpus

Hindi Raw Speech Corpus

Hindi is a major Indo-Aryan language, a descendant of Sanskrit, which is spoken in central and northern India. LDC-IL Hindi speech data of 11..

Sample Download | size: 0B | type: zip

Added on : 29 Jul 2019

Tags: Hindi Raw Speech Corpus

Bodo Raw Speech Corpus

176:53:28 hours of 113 Gigabytes speech data | 456 Speakers | 77443 Audio segments | 48 kHz | 16 bit wav. Bodo, one of the scheduled languages of In..

Sample Download | size: 0B | type: zip

Added on : 29 Jul 2019

Tags: Boro Bodo Raw Speech Corpus

Bengali Raw Speech Corpus

Bengali is the official language of West Bengal and Tripura. It belongs to the Indo-Aryan language family. LDC-IL Bengali Speech Data set consists of d..

Sample Download | size: 0B | type: zip

Added on : 29 Jul 2019

Tags: Bengali Raw Speech Corpus

A Gold Standard Urdu Raw Text Corpus

Unicode Standard Urdu text corpus of 5161927 Words | 739 Titles | Data and Metadata in XML format | 5 Text domains. Urdu is one am..

Sample Download | size: 0B | type: zip

Added on : 26 Jul 2019

Tags: Urdu Raw Text Corpus

A Gold Standard Telugu Raw Text Corpus

Standard Telugu Text Corpus of 30,10,993 words | 859 Titles | Data and Metadata in XML format | 6 Text Domains | Telugu Text Corpus encoded in a machine re..

Sample Download | size: 0B | type: zip

Added on : 26 Jul 2019

Tags: Telugu Raw Text Corpus

A Gold Standard Tamil Raw Text Corpus

Tamil is one of the longest-surviving classical languages in the world. It belongs to the Dravidian language family. Tamil Text Corpus encoded in a machine reada..

Sample Download | size: 0B | type: zip

Added on : 26 Jul 2019

Tags: Tamil Raw Text Corpus

A Gold Standard Punjabi Raw Text Corpus

Punjabi Text Corpus encoded in a machine readable form and stored in a standard format. The major encoding being used is Unicode and stored in XM..

Sample Download | size: 0B | type: zip

Added on : 26 Jul 2019

Tags: Punjabi Raw Text Corpus

A Gold Standard Odia Raw Text Corpus

LDC-IL Odia Raw Text Corpus developed according to various factors such as quality of the text, representativeness, retrievable format, size of corpus..

Sample Download | size: 0B | type: zip

Added on : 26 Jul 2019

Tags: Odia Raw Text Corpus

A Gold Standard Nepali Raw Text Corpus

Nepali is one of the 22 scheduled languages of India. It is a descendant of Sanskrit. Nepali Text Corpus encoded in a machine readable form and stored in ..

Sample Download | size: 0B | type: zip

Added on : 26 Jul 2019

Tags: Nepali Raw Text Corpus

A Gold Standard Manipuri Raw Text Corpus

Manipuri Text Corpus is encoded in a machine readable form and stored in a standard format. The major encoding being used is Unicode and stored in XML..

Sample Download | size: 0B | type: zip

Added on : 26 Jul 2019

Tags: Manipuri Raw Text Corpus

Showing 16 to 30 of 41 (3 Pages)