National Platform for Language Technology

Products meeting the search criteria

NLTM Pilot TTS Data for Indian Languages — Hindi, Punjabi, Tamil, and Indian English.

TTS data for Indian languages: Hindi, Punjabi, Tamil, and Indian English. Text and corresponding speech data recorded in a studio environment...

Contributor: TTS Consortia
Tags: TTS Data, Speech Data, Hindi TTS Data, Punjabi TTS Data, Tamil TTS Data, Indian English TTS Data, IITM

Hindi ASR Challenge Data (ASR Speech Data released under 3rd Challenge) - NLTMP

The data set comprises Hindi read and conversational speech data along with the corresponding transcriptions. This speech data was collected by Speech Lab IITM and several startups. The text data w...

Contributor:  ASR Consortia
Tags:  Hindi, ASR Challenge Data, ASR Speech Data, NLTM Pilot, Speech Corpus, Speech, Corpus

English-Hindi, Tamil-Telugu Parallel Data Developed Under PSA Pilot

English-Hindi, Tamil-Telugu parallel data developed under the PSA Pilot on SSMT, led by IIIT-Hyderabad...

Contributor: NLTM IIIT-Hyderabad
Tags: English-Hindi, Tamil-Telugu, Parallel Data, IIIT-Hyderabad, NLTM Pilot

Hindi-Telugu Domain Dictionary by IIIT-H

Hindi and Telugu Domain Dictionary developed under the ILMT Hindi-Telugu Pilot by IIIT-Hyderabad (Part 1). The dictionary covers the domains of Chemistry and Law. ...

Contributor: NLTM IIIT-Hyderabad
Tags: Hindi, Telugu, Dictionary, Hindi and Telugu Domain Dictionary

Hindi ASR Challenge Data (ASR Speech Data released under 1st Challenge) - NLTMP

The data set comprises Hindi read speech data along with the corresponding transcriptions. The text data was crawled from newspapers, and volunteers were then asked to read it aloud. It covers genres l...

Contributor:  ASR Consortia
Tags:  Hindi, ASR Challenge Data, ASR, Speech Data, NLTM Pilot

Hindi–Telugu Parallel Text Corpus IIIT-Hyd

Hindi–Telugu parallel text corpus developed under the NLTM Pilot by IIIT-Hyderabad. The corpus covers the domains of Chemistry, Law, News & General, Health-Care, Education, Open Education...

Contributor: NLTM IIIT-Hyderabad
Tags: NLTM Pilot, Hindi, Telugu, Hindi–Telugu, Parallel, Text Corpus

Hindi Annotated Text Corpus - IIIT Hyderabad

Hindi annotated corpus developed under the NLTM Pilot by IIIT-Hyderabad (Part 1). Domains of the corpus are Chemistry, Law, News & General, HealthCare, Education Others, and open education books...

Contributor: NLTM IIIT-Hyderabad
Tags: NLTM Pilot, Hindi, Telugu, Hindi–Telugu, Annotated, Text Corpus, IIIT-Hyderabad

e-Aksharayan – Hindi OCR

e-Aksharayan is desktop software for converting scanned printed Indian-language documents into fully editable text in Unicode encoding. It works on Windows 7, 8, and 10. Input and output speci...

Contributor:  OCR Consortia
Tags:  e-Aksharayan, Hindi OCR, Hindi, OCR

HINDI Speech Data – ASR

This corpus contains more than 194714 audio files in the HINDI language from approx. 1000 native speakers. It also contains each word with its corresponding phonetic representation and transcrip...

Contributor:  ASR Consortia
Tags:  ASR, HINDI, Speech Data

Hindi-Magahi parallel data set

This Hindi-Magahi parallel data set, with a total of 1000 sentences (500 dev, 500 test), has been released under the CC BY-NC-SA 4.0 license by Panlingua Language Processing LLP, New Delhi, India...

Contributor:  Panlingua Language Processing LLP
Tags:  Hindi, Magahi, Parallel Text Corpus

Hindi-Bhojpuri parallel data set

This Hindi-Bhojpuri parallel data set, with a total of 1000 sentences (500 dev, 500 test), has been released under the CC BY-NC-SA 4.0 license by Panlingua Language Processing LLP, New Delhi, India...

Contributor:  Panlingua Language Processing LLP
Tags:  Hindi, Bhojpuri, Parallel, Text Corpus

Hindi Monolingual Data Set

This Hindi monolingual data set, with 473605 sentences and a total word count of 7092870, has been released under the CC BY-NC-SA 4.0 license by Panlingua Language Processing LLP, New Delhi, India...

Contributor:  Panlingua Language Processing LLP
Tags:  Hindi, Monolingual, Text Corpus

HINDI (JHARKHAND) Speech Data – ASR

This corpus contains more than 36694 audio files in the HINDI (JHARKHAND) language from approx. 1000 native speakers. It also contains each word with its corresponding phonetic representation a...

Contributor:  ASR Consortia
Tags:  ASR, HINDI (JHARKHAND), Speech Data

Hindi Monolingual Chunked Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, New Delhi, collected a monolingual corpus in Hind...

Contributor:  ILCI Consortia
Tags:  Hindi, Monolingual, Chunked Tagged, Text Corpus

Hindi Monolingual PoS Tagged Text Corpus ILCI

Under the Indian Languages Corpora Initiative Phase-II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, New Delhi, collected a monolingual corpus in Hind...

Contributor:  ILCI Consortia
Tags:  Hindi, Monolingual, PoS Tagged, Text Corpus

Copyright @ All Rights Reserved
National Platform for Language Technology © 2025