
Search Results | Total results found: 1220

  Catalogue
Research in Automatic Speech Recognition (ASR) has improved steeply over the past decade, especially for English, where the variety and amount of available training data is huge. In this work, we develop an ASR and Keyword Search (KWS) system for Manipuri, a low-resource Indian language. Manipuri (also known as Meitei) is a Tibeto-Burman language spoken predominantly in Manipur, a northeastern state of India. We collect and transcribe 90+ hours of telephonic read speech data from 300+ speakers for the ASR task. Both state-of-the-art Gaussian Mixture Model-Hidden Markov Model (GMM-HMM) and Deep Neural Network-Hidden Markov Model (DNN-HMM) based architectures are developed as baselines. Using the collected data, the DNN-HMM systems perform better, achieving 13.57% word error rate (WER) for ASR and 7.64% equal error rate (EER) for KWS.

Added on January 23, 2020

  More Details
  • Contributed by : Individual
  • Product Type : Research Paper
  • License Type : Freeware
  • System Requirement : Not Applicable
  • Author : Tanvina Patel, Krishna DN, Noor Fathima, Nisar Shah, Mahima C, Deepak Kumar, Anuroop Iyengar

Under the Indian Languages Corpora Initiative (ILCI) project initiated by MeitY, Govt. of India, Jawaharlal Nehru University, New Delhi collected a corpus with Hindi as the source language and translated it into Tamil as the target language. The corpus contains 70,000 sentences covering domains including Health, Tourism, Agriculture and Entertainment. Each sentence has a unique sentence ID, and the corpus is provided as UTF-8 encoded text files. The translated sentences have been POS tagged and chunked. The chunking guideline used in creating this corpus is provided in the supporting document.

Added on December 27, 2019

  More Details
  • Contributed by : ILCI Consortium, JNU
  • Product Type : Text Corpora
  • License Type : Research
  • System Requirement : Not Applicable

Under the Indian Languages Corpora Initiative (ILCI) project initiated by MeitY, Govt. of India, Jawaharlal Nehru University, New Delhi collected a corpus with Hindi as the source language and translated it into Punjabi as the target language. The corpus contains 70,000 sentences covering domains including Health, Tourism, Agriculture and Entertainment. Each sentence has a unique sentence ID, and the corpus is provided as UTF-8 encoded text files. The translated sentences have been POS tagged and chunked. The chunking guideline used in creating this corpus is provided in the supporting document.

Added on December 27, 2019

  More Details
  • Contributed by : ILCI Consortium, JNU
  • Product Type : Text Corpora
  • License Type : Research
  • System Requirement : Not Applicable

Under the Indian Languages Corpora Initiative (ILCI) project initiated by MeitY, Govt. of India, Jawaharlal Nehru University, New Delhi collected a corpus with Hindi as the source language and translated it into Nepali as the target language. The corpus contains 70,000 sentences covering domains including Health, Tourism, Agriculture and Entertainment. Each sentence has a unique sentence ID, and the corpus is provided as UTF-8 encoded text files. The translated sentences have been POS tagged and chunked. The chunking guideline used in creating this corpus is provided in the supporting document.

Added on December 27, 2019

  More Details
  • Contributed by : ILCI Consortium, JNU
  • Product Type : Text Corpora
  • License Type : Research
  • System Requirement : Not Applicable

Under the Indian Languages Corpora Initiative (ILCI) project initiated by MeitY, Govt. of India, Jawaharlal Nehru University, New Delhi collected a corpus with Hindi as the source language and translated it into Marathi as the target language. The corpus contains 70,000 sentences covering domains including Health, Tourism, Agriculture and Entertainment. Each sentence has a unique sentence ID, and the corpus is provided as UTF-8 encoded text files. The translated sentences have been POS tagged and chunked. The chunking guideline used in creating this corpus is provided in the supporting document.

Added on December 27, 2019

  More Details
  • Contributed by : ILCI Consortium, JNU
  • Product Type : Text Corpora
  • License Type : Research
  • System Requirement : Not Applicable