Available Under License: Commercial Research
Under the Indian Languages Corpora Initiative (ILCI) project, the ILCI Consortia, led by Jawaharlal Nehru University, New Delhi, has created a parallel corpus with Hindi as the source language and Bengali as the target language. The corpus covers the Health, Tourism, Agriculture and Entertainment domains. Each sentence has a unique sentence ID, and the complete corpus is in UTF-8 encoding. The translated sentences have been POS tagged, as per the Bureau of Indian Standards (BIS) tagset, and chunked. The chunking guidelines used in creating this corpus are also provided.
Tags: Hindi, Bengali, Bangla, Text Corpus, Chunked, Parallel text corpus