This paper describes experiments on a Hindi dependency treebank that systematically investigate the learning issues that arise in building a robust Hindi parser. We train two data-driven dependency parsers on the treebank and use these experiments to test various conjectures. The results either validate the conjectures or lead us to reframe them. This process helps to systematically isolate the information crucial for parsing. The experiments yield many interesting findings, such as how certain intuitive features fail to improve parser performance, which linguistic phenomena are difficult to learn, and how minimal semantics can help in identifying some core relations. The final parsing performance for Hindi is encouraging: the best labeled and unlabeled attachment scores are 69.64% and 88.67% respectively, on a treebank as small as 1200 sentences.