Sandhan is a mission-mode project under the Technology Development for Indian Languages (TDIL) Programme. Its main objective is to develop a monolingual search system for the tourism domain in five Indian languages: Bengali, Hindi, Marathi, Tamil and Telugu.
Features
The system has been developed to meet users' information needs in the tourism domain.
Users can type a query using either the INSCRIPT or the phonetic keyboard layout. Queries can also be entered by clicking keys on the on-screen keyboard provided for the INSCRIPT layout.
Sandhan processes each query according to its language and retrieves results in that same language.
A snippet is generated for each retrieved document to help the user understand the context in which the query terms appear in that document.
A summary is also generated for each retrieved document, giving the user an idea of its overall content without having to open it.
An additional UNL-based semantic search facility is provided for Tamil.
Results are displayed ten at a time to improve readability.
Many Indian-language web pages are written in custom (non-Unicode) fonts, which makes it difficult for the system to retrieve them. Sandhan uses a font transcoder that converts text in these custom fonts into Unicode for processing.
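The font transcoding step can be pictured with a minimal sketch. It is not Sandhan's actual transcoder: the glyph table below is invented for an imaginary legacy Devanagari font, and the function name `transcode` is hypothetical. It shows the two things such a converter typically has to do: map each legacy glyph code to a Unicode character, and move the short-i vowel sign, which legacy fonts store before the consonant in visual order, to after it, where Unicode expects it.

```python
# Hypothetical glyph table for an imaginary legacy Devanagari font.
# Every real custom font needs its own table; these pairs are invented.
LEGACY_TO_UNICODE = {
    "d": "\u0915",  # KA
    "e": "\u092E",  # MA
    "y": "\u0932",  # LA
    "k": "\u093E",  # vowel sign AA
    "f": "\u093F",  # vowel sign I (typed BEFORE its consonant in legacy text)
}

VOWEL_SIGN_I = "\u093F"


def transcode(legacy_text: str) -> str:
    """Convert legacy-font text to Unicode using the table above.

    Simplification: the short-i sign is moved after the single character
    that follows it, so conjunct consonants are not handled.
    """
    out = []
    pending_i = False
    for ch in legacy_text:
        uni = LEGACY_TO_UNICODE.get(ch, ch)  # unmapped characters pass through
        if uni == VOWEL_SIGN_I:
            pending_i = True  # hold the sign until the next character is seen
            continue
        out.append(uni)
        if pending_i:
            out.append(VOWEL_SIGN_I)  # Unicode stores the sign after the consonant
            pending_i = False
    if pending_i:  # a trailing sign with nothing to attach to
        out.append(VOWEL_SIGN_I)
    return "".join(out)


if __name__ == "__main__":
    print(transcode("dey"))
    print(transcode("fdyk"))
```

Once text is in Unicode, pages written in different custom fonts can be indexed and matched against Unicode queries in the same way.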
Benefits
The system will enable searching of Indian-language content and thus address the gap in meeting the information needs of the large Indian population that is not conversant with English; English speakers are estimated at only about 10% of the population. The sectors expected to benefit are academia, tourism and business.