Indian Language Technology Proliferation & Deployment Centre
भारतीय भाषा प्रौद्योगिकी प्रसरण एवं विस्तारण केंद्र

Portal Search

Search keyword: Text Corpora. Total: 92 results found.
  Bangla Monolingual Text Corpus ILCI-II
... corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Bodo Monolingual Text Corpus ILCI-II
... corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  English Monolingual Text Corpus ILCI-II
... corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Gujarati Monolingual Text Corpus ILCI-II
... corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Hindi-Odia Tourism Text Corpus-ILCI
Under the Indian Languages Corpora Initiative (ILCI) project, initiated by DeitY, Govt. of India, Jawaharlal Nehru University, New Delhi, collected a corpus in Hindi as the source language and translated ...
Text Corpora   License Type: Research
 
  Kannada Monolingual Text Corpus ILCI-II
... corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Konkani Monolingual Text Corpus ILCI-II
... corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Malayalam Monolingual Text Corpus ILCI-II
... corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Marathi Monolingual Text Corpus ILCI-II
... corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  Nepali Monolingual Text Corpus ILCI-II
... corpus has the following features: unique ID, UTF-8 encoding, and text file format. ...
Text Corpora   License Type: Research
 
  HMM Based Chunker for Hindi
This paper presents an HMM-based chunk tagger for Hindi. Various tagging schemes for marking chunk boundaries are discussed along with their results. Contextual information is incorporated into the chunk ...
Research Paper   Paper Type: Conference Papers  License Type: Freeware  Author(s): Akshay Singh, S M Bendre, Rajeev Sangal
 
  Development of Speech Corpora in Gujarati and Marathi for Phonetic Transcription.
There has been growing interest in using speech technology for rural areas. In this context, this paper describes the development of speech corpora in Indian languages (viz., Gujarati and Marathi from remote ...
Research Paper   Paper Type: Conference Papers  License Type: Freeware  Author(s): Kewal D. Malde, Bhavik B. Vachhani, Maulik C. Madhavi, Nirav H. Chhayani, Hemant A. Patil
 
  Do not do processing, when you can look up: Towards a Discrimination Net for WSD
Abstract: The task of Word Sense Disambiguation (WSD) incorporates in its definition the role of ‘context’. We present our work on the development of a tool which allows for automatic acquisition and ...
Research Paper   Paper Type: Conference Papers  License Type: Freeware  Author(s): Pushpak Bhattacharyya, Diptesh Kanojia, Raj Dabre, Siddhartha Gunti, Manish Shrivastava
 
  Automated Grammar Correction Using Hierarchical Phrase-Based Statistical Machine Translation
... Thus grammar correction can be considered a translation problem from incorrect text to correct text. Over the years, grammar correction data in the electronic form (i.e., parallel corpora of incorrect ...
Research Paper   Paper Type: Conference Papers  License Type: Freeware  Author(s): Bibek Behera, Pushpak Bhattacharyya
 
  Automatically Predicting Sentence Translation Difficulty
In this paper we introduce Translation Difficulty Index (TDI), a measure of difficulty in text translation. We first define and quantify translation difficulty in terms of TDI. We realize that any measure ...
Research Paper   Paper Type: Conference Papers  License Type: Freeware  Author(s): Abhijit Mishra, Pushpak Bhattacharyya, Michael Carl
 
  An Integrated Digital Tool for Accessing Language Resources
... the results in KeyWord-In-Context (KWIC) format. We also present the notation used for querying and transformation, which is comparable to but different from the Corpus Query Language (CQL). For Full ...
Research Paper   Paper Type: Conference Papers  License Type: Freeware  Author(s): Anil Kumar Singh, Bharat Ambati
 
  Neighbors Help: Bilingual Unsupervised WSD Using Context
... poor on verbs with accuracy level at 25-38%. We suggest a modification to this mentioned formulation, using context and semantic relatedness of neighboring words. An improvement of 17%-35% in the accuracy ...
Research Paper   Paper Type: Conference Papers  License Type: Freeware  Author(s): Sudha Bhingardive, Samiulla Shaikh, Pushpak Bhattacharyya
 
  Phrase Based Decoding using a Discriminative Model
... to incorporate greater contextual and linguistic information), which leads to an effective training of these models. This model is then used by the standard state-of-art Moses decoder (Koehn et al., 2007) ...
Research Paper   Paper Type: Conference Papers  License Type: Freeware  Author(s): Prasanth Kolachina, V Sriram, Srinivas Bangalore, Sudheer Kolachina, Avinesh PVS
 
  Sentiment Analysis in Twitter with Lightweight Discourse Analysis
We propose a lightweight method for using discourse relations for polarity detection of tweets. This method is targeted towards the web-based applications that deal with noisy, unstructured text, like ...
Research Paper   Paper Type: Conference Papers  License Type: Freeware  Author(s): Subhabrata Mukherjee, Pushpak Bhattacharyya
 
  Cross-Lingual Sentiment Analysis for Indian Languages using Linked WordNets
Cross-Lingual Sentiment Analysis (CLSA) is the task of predicting the polarity of the opinion expressed in a text in a language L_test using a classifier trained on the corpus of another language L_train. ...
Research Paper   Paper Type: Conference Papers  License Type: Freeware  Author(s): Balamurali A R, Aditya Joshi, Pushpak Bhattacharyya
 
  Toll Free number: 1800 209 1015

 email: tdildc[at]cdac[dot]in
© 2009-2020 Ministry of Electronics & Information Technology (MeitY), Govt. of India | Site designed, developed & maintained by CDAC GIST, Pune