AnglaMT System

Machine-aided translation from English to Indian languages

AnglaMT is a machine-aided translation methodology specifically designed for translating English to Indian languages. It is a pattern-directed, rule-based system with a context-free-grammar-like structure for English (the source language).

It generates a `pseudo-target' (pseudo-interlingua) applicable to a group of Indian languages (the target languages), such as the Indo-Aryan family (Hindi, Bangla, Asamiya, Punjabi, Marathi, Oriya, Gujarati, etc.), the Dravidian family (Tamil, Telugu, Kannada and Malayalam) and others.

AnglaMT is designed as a practical aid for translation, in which an attempt is made to get 90% of the task done by the machine, with the remaining 10% left to human post-editing. Some of the major design considerations are:

1. a system which could grow incrementally to handle more complex situations;
2. a uniform mechanism by which English can be translated into the majority of Indian languages by attaching appropriate text-generator modules; and
3. a human-engineered man-machine interface to facilitate both its usage and augmentation.

A set of rules obtained through corpus analysis is used to identify plausible constituents, with respect to which movement rules for the `pseudo-target' are constructed. Within each language group, the languages exhibit a high degree of structural homogeneity, and the system exploits this similarity to a great extent. A language-specific text generator converts the `pseudo-target' code into target-language text. The Paninian framework, based on Sanskrit grammar and using Karak (similar to case) relationships, provides a uniform way of designing the Indian-language text generators. The system also uses an example base to identify noun and verb phrasals and resolve their semantics. An attempt is made to resolve most of the ambiguities using an ontology, syntactic and semantic tags, and some pragmatic rules. The unresolved ambiguities are left for human post-editing.
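
The flow described above (analyse English once, reorder constituents into a shared `pseudo-target', then hand that to a language-specific text generator) can be illustrated with a minimal Python sketch. This is not AnglaMT's actual code: the role labels, the single subject-object-verb movement rule, the names Constituent, to_pseudo_target and generate, and the tiny romanized lexicons are all assumptions made for illustration. Karak-based case marking and ambiguity resolution are left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constituent:
    role: str  # role assigned during English analysis (assumed labels)
    head: str  # English head word of the constituent


# Assumed movement rule: English subject-verb-object order becomes verb-final.
PSEUDO_TARGET_ORDER = ("SUBJ", "OBJ", "VERB")

# Toy romanized lexicons, for illustration only (hypothetical data).
LEXICONS = {
    "hindi": {"Ram": "Ram", "mango": "aam", "eats": "khata hai"},
    "bangla": {"Ram": "Ram", "mango": "aam", "eats": "khay"},
}


def to_pseudo_target(constituents):
    """Reorder analysed English constituents into the shared pseudo-target order."""
    rank = {role: i for i, role in enumerate(PSEUDO_TARGET_ORDER)}
    return sorted(constituents, key=lambda c: rank[c.role])


def generate(pseudo_target, language):
    """Language-specific generator: substitute target words, keep the shared order."""
    lexicon = LEXICONS[language]
    # Words missing from the lexicon are bracketed so a post-editor can spot them.
    return " ".join(lexicon.get(c.head, f"[{c.head}]") for c in pseudo_target)


# "Ram eats mango", already split into constituents by a hypothetical analyser.
english = [
    Constituent("SUBJ", "Ram"),
    Constituent("VERB", "eats"),
    Constituent("OBJ", "mango"),
]
pseudo = to_pseudo_target(english)
for language in LEXICONS:
    print(language, "->", generate(pseudo, language))
```

The point of the sketch is the division of labour: the reordering is computed once and reused for every language in the group, so supporting another structurally similar language only requires another generator (here, another lexicon entry). Bracketing unknown words stands in for the ambiguities that the system leaves to human post-editing.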