Konkani Raw Speech Corpus
  • Contributor: CIIL Mysore
  • Product Code: CIIL-KOK-RAW-Speech-123
Sample Download | size: 2.9MB | type: zip
Added on: 29 Jul 2019

156:37:51 hours of speech data (100 GB) | 503 Speakers | 72,938 Audio Segments | 48 kHz | 16-bit WAV

Konkani belongs to the Indo-European family of languages and is the official language of Goa. It is spoken widely across four states: Maharashtra, Goa, Karnataka and Kerala. Konkani is the only Indian language written in five different scripts: Devanagari, Roman, Kannada, Malayalam and Persian-Arabic. The LDC-IL speech data was collected in North Goa, South Goa, Karwar (Karnataka) and Sindhudurg (Maharashtra) from speakers of both genders and different age groups.

The LDC-IL Konkani speech data set consists of several types of recordings: word lists, sentences, running texts and date formats.


The available Speech Corpus details for Konkani are as follows.

Total of 504 speakers (267 Female and 237 Male)

    • Contemporary Text (News) - 477 Audio Segments - 49:52:09 Hours
    • Created Text - 480 Audio Segments - 22:09:05 Hours
    • Sentence - 12050 Audio Segments - 15:51:11 Hours
    • Date Format - 953 Audio Segments - 01:50:39 Hours
    • Command and Control Words - 14944 Audio Segments - 16:11:02 Hours
    • Person Name - 9588 Audio Segments - 15:55:43 Hours
    • Place Name - 4812 Audio Segments - 05:31:03 Hours
    • Most Frequent Word-Part - 9104 Audio Segments - 07:22:57 Hours
    • Most Frequent Word-Full Set - 10987 Audio Segments - 09:53:28 Hours
    • Phonetically Balanced - 2975 Audio Segments - 02:49:36 Hours
    • Form and Function Word - 4285 Audio Segments - 04:29:03 Hours

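
The per-category figures above are given as segment counts and HH:MM:SS durations. The following is a minimal sketch, not part of the corpus distribution, of how they could be tallied for comparison with the headline totals. The helper names `parse_hms` and `format_hms` and the `CATEGORIES` table are illustrative assumptions; the numbers are copied from the list above.

```python
def parse_hms(text):
    """Convert an 'H:MM:SS' duration string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def format_hms(total_seconds):
    """Convert a number of seconds back to 'H:MM:SS'."""
    hours, rest = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


# (audio segments, duration) per category, as listed on this page.
CATEGORIES = {
    "Contemporary Text (News)": (477, "49:52:09"),
    "Created Text": (480, "22:09:05"),
    "Sentence": (12050, "15:51:11"),
    "Date Format": (953, "01:50:39"),
    "Command and Control Words": (14944, "16:11:02"),
    "Person Name": (9588, "15:55:43"),
    "Place Name": (4812, "05:31:03"),
    "Most Frequent Word-Part": (9104, "07:22:57"),
    "Most Frequent Word-Full Set": (10987, "09:53:28"),
    "Phonetically Balanced": (2975, "02:49:36"),
    "Form and Function Word": (4285, "04:29:03"),
}

total_segments = sum(count for count, _ in CATEGORIES.values())
total_seconds = sum(parse_hms(duration) for _, duration in CATEGORIES.values())

print("Segments across listed categories:", total_segments)
print("Duration across listed categories:", format_hms(total_seconds))
```

The category list may not be exhaustive, so the tally need not match the headline totals exactly.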
Speech Data Attributes
Annotation: Raw Speech Corpus
Language: Konkani
Duration: 156:37:51
Speaker Type: Native
File Size: 100 GB
No. of Audio Segments: 72,938
Speaker Gender: Male and Female
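
The listing gives the audio format as 48 kHz, 16-bit WAV. A minimal sketch of how a downloaded segment could be checked against that format, assuming the files are uncompressed PCM WAV readable by Python's standard `wave` module. The function names and the example path are hypothetical and do not come from the corpus documentation.

```python
import wave


def describe_wav(path):
    """Return (sample_rate_hz, bits_per_sample, channels, duration_seconds)."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        bits = wav.getsampwidth() * 8   # sample width is reported in bytes
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, bits, channels, duration


def matches_listing(path, expected_rate=48000, expected_bits=16):
    """Check a file against the sample rate and bit depth stated on this page."""
    rate, bits, _, _ = describe_wav(path)
    return rate == expected_rate and bits == expected_bits


# Example (hypothetical path inside the extracted sample archive):
# print(describe_wav("sample/segment_0001.wav"))
```

The channel count is reported but not checked, because the listing does not say whether the recordings are mono or stereo.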

Tags: Konkani, Raw Speech Corpus

Disclaimer: The information provided on this page has been procured through different sources. Please write back to us at nplt_support[at]cdac[dot]in in case you would like to suggest an update.