
Search Results | Total results found: 251

  Catalogue
The BIS standard "IS 16333 (Part 3)" defines the requirements for mobile handsets to support text input in English, Hindi and at least one additional official Indian language, along with message readability for all 22 official Indian languages. To help mobile manufacturers with internal verification and to check the effectiveness of language support, TDIL and CDAC-GIST have prepared robust test data covering the relevant language's Consonants (C), Vowels (V), Numerals (N), Matras (M), Halant (H) and Diacritics (D), combinations of C, V, N, M, H and D, and word lists and sentences. The test data can be used to test text input and display on mobile handsets.
For the best view, download the SakalBharati font.
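As a rough illustration of what "combinations of C, M and H" can mean in practice, the following Python sketch builds a few Devanagari test strings from Unicode code points. The character subset, the function name `build_test_strings` and the choice of combination patterns are assumptions made for this example; they are not taken from the TDIL/CDAC-GIST test data, which covers many more characters and languages.

```python
import unicodedata

# Illustrative subset only (an assumption, not the actual test data).
CONSONANTS = ["\u0915", "\u0916", "\u0917"]  # KA, KHA, GA
MATRAS = ["\u093E", "\u093F"]                # vowel signs AA, I
HALANT = "\u094D"                            # DEVANAGARI SIGN VIRAMA


def build_test_strings(consonants, matras, halant):
    """Return bare consonants, consonant+matra and consonant+halant+consonant strings."""
    cases = []
    for c in consonants:
        cases.append(c)                  # C
        for m in matras:
            cases.append(c + m)          # C + M
    for c1 in consonants:
        for c2 in consonants:
            cases.append(c1 + halant + c2)  # C + H + C (conjunct candidate)
    return cases


cases = build_test_strings(CONSONANTS, MATRAS, HALANT)
for text in cases:
    # Print each string with its code points so rendering can be checked by eye.
    print(text, " ".join(f"U+{ord(ch):04X}" for ch in text))
```

Each printed line can be typed or displayed on a handset and compared against the expected rendering; the code points make it easier to tell an input problem from a display problem.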

Added on July 27, 2018

  More Details
  • Contributed by : CDAC-GIST, TDIL
  • Product Type : Linguistic Resources
  • License Type : Freeware
  • System Requirement : Not Applicable

Under the Indian Languages Corpora Initiative Phase II (ILCI Phase-II) project, initiated by MeitY, Govt. of India, Jawaharlal Nehru University, New Delhi collected a monolingual corpus in Telugu. This is the final outcome of the project and contains approximately 32,000 general-domain sentences. The translated sentences have been POS tagged according to the BIS (Bureau of Indian Standards) tagset. The corpus has the following features: unique ID, UTF-8 encoding, and text file format.
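A corpus with these features (unique ID, UTF-8, plain text, POS tags) could be read with a few lines of Python. The sketch below assumes a layout of one sentence per line, with the ID separated from the sentence by a tab and each token written as `token\TAG`; that layout, the sample line, the placeholder tokens and the function name `parse_line` are all assumptions for illustration, so check the files actually distributed before relying on it. The tags shown (`N_NN`, `V_VM`, `RD_PUNC`) follow the BIS tagset naming style.

```python
from typing import List, Tuple


def parse_line(line: str) -> Tuple[str, List[Tuple[str, str]]]:
    """Split 'ID<TAB>tok\\TAG tok\\TAG ...' into the ID and (token, tag) pairs."""
    sent_id, _, body = line.rstrip("\n").partition("\t")
    pairs = []
    for item in body.split():
        token, sep, tag = item.rpartition("\\")
        if not sep:
            # No backslash found: keep the token and leave the tag empty.
            token, tag = item, ""
        pairs.append((token, tag))
    return sent_id, pairs


# Hypothetical sample line with placeholder tokens (not real corpus data).
sample = "TEL0001\ttok1\\N_NN tok2\\V_VM .\\RD_PUNC"
sent_id, pairs = parse_line(sample)
print(sent_id, pairs)
```

To process a real file, open it with `encoding="utf-8"` and call `parse_line` on each line.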

Added on May 31, 2017

  More Details
  • Contributed by : ILCI-II, JNU
  • Product Type : Linguistic Resources
  • License Type : Research
  • System Requirement : Not Applicable

The Hindi Named Entity and Multi Word Expression List was developed in Unicode under the Sandhan (CLIA) Consortium. It is a monolingual search system for the tourism domain, and the provided resources were used for translating queries.

Added on November 24, 2016

  More Details
  • Contributed by : CLIA Consortia, IIT Mumbai
  • Product Type : Linguistic Resources
  • License Type : Research
  • System Requirement : Not Applicable

The Hindi Translation and Transliteration Word List was developed in Unicode under the Sandhan (CLIA) Consortium. It is a monolingual search system for the tourism domain, and the provided resources were used for translating queries.

Added on November 24, 2016

  More Details
  • Contributed by : CLIA Consortia, IIT Mumbai
  • Product Type : Linguistic Resources
  • License Type : Research
  • System Requirement : Not Applicable

The Gujarati Translation and Transliteration Word List was developed in Unicode under the Sandhan (CLIA) Consortium. It is a monolingual search system for the tourism domain, and the provided resources were used for translating queries from 9 Indian languages to English and Hindi.

Added on November 24, 2016

  More Details
  • Contributed by : CLIA Consortia, IIT Mumbai
  • Product Type : Linguistic Resources
  • License Type : Research
  • System Requirement : Not Applicable