Indian Language Technology Proliferation & Deployment Centre
भारतीय भाषा प्रौद्योगिकी प्रसरण एवं विस्तारण केंद्र

Portal Search

Search Keyword: Text Corpora (Total: 91 results found)
  Corpus Tool
The Corpus Tool is a utility tool that can be used to view, update and analyze corpus data. It has three tools: Sinnalemba, the Corpus Manager, manages the text data of the parallel corpora. Kupyengba, ...
Tool   License Type: Research
 
  Sanskrit Monolingual Text Corpora
The Sanskrit-Hindi Machine Translation consortium has developed corpora annotated at various levels, such as POS, Sandhi split, and Samaasa tagging. Tagging guidelines are also provided.
Linguistic Resources   License Type: Research
 
  Hindi-Odia Health Text Corpus-ILCI
Under the Indian Languages Corpora Initiative (ILCI) project initiated by DeitY, Govt. of India, Jawaharlal Nehru University, New Delhi, collected a corpus in Hindi as the source language and translated ...
Linguistic Resources   License Type: Research
 
  Hindi-English Agriculture & Entertainment Text Corpus ILCI-II
... have been POS tagged according to the BIS (Bureau of Indian Standards) tagset. This corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Hindi-Assamese Agriculture & Entertainment Text Corpus ILCI-II
... have been POS tagged according to the BIS (Bureau of Indian Standards) tagset. This corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Hindi-Bangla Agriculture & Entertainment Text Corpus ILCI-II
... have been POS tagged according to the BIS (Bureau of Indian Standards) tagset. This corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Hindi-Bodo Agriculture & Entertainment Text Corpus ILCI-II
... have been POS tagged according to the BIS (Bureau of Indian Standards) tagset. This corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Hindi-Gujarati Agriculture & Entertainment Text Corpus ILCI-II
... have been POS tagged according to the BIS (Bureau of Indian Standards) tagset. This corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Hindi-Kannada Agriculture & Entertainment Text Corpus ILCI-II
... have been POS tagged according to the BIS (Bureau of Indian Standards) tagset. This corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Hindi-Malayalam Agriculture & Entertainment Text Corpus-ILCI II
... sentences have been POS tagged according to the BIS (Bureau of Indian Standards) tagset. This corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Hindi-Marathi Agriculture & Entertainment Text Corpus ILCI-II
... have been POS tagged according to the BIS (Bureau of Indian Standards) tagset. This corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Hindi-Konkani Agriculture & Entertainment Text Corpus ILCI-II
... have been POS tagged according to the BIS (Bureau of Indian Standards) tagset. This corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Hindi-Nepali Agriculture & Entertainment Text Corpus ILCI-II
... have been POS tagged according to the BIS (Bureau of Indian Standards) tagset. This corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Hindi-Odia Agriculture & Entertainment Text Corpus ILCI-II
... have been POS tagged according to the BIS (Bureau of Indian Standards) tagset. This corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Hindi-Punjabi Agriculture & Entertainment Text Corpus ILCI-II
... have been POS tagged according to the BIS (Bureau of Indian Standards) tagset. This corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Hindi-Tamil Agriculture & Entertainment Text Corpus-ILCI II
... sentences have been POS tagged according to the BIS (Bureau of Indian Standards) tagset. This corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Hindi-Urdu Agriculture & Entertainment Text Corpus ILCI-II
... have been POS tagged according to the BIS (Bureau of Indian Standards) tagset. This corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Hindi-Telugu Agriculture & Entertainment Text Corpus ILCI-II
... have been POS tagged according to the BIS (Bureau of Indian Standards) tagset. This corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Hindi-Manipuri Agriculture & Entertainment Text Corpus ILCI-II
... have been POS tagged according to the BIS (Bureau of Indian Standards) tagset and are in the Meetei Mayek script. This corpus has the following features: unique ID, UTF-8 encoding, and text file format. Download link ...
Text Corpora   License Type: Research
 
  Assamese Monolingual Text Corpus ILCI-II
... corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
 email: tdildc[at]cdac[dot]in


© 2022 Ministry of Electronics & Information Technology, MeitY, Govt. Of India

Site designed, developed & maintained by CDAC GIST, Pune