The dataset consists of four files: one pair of files each for training and validation. Data collection is still in progress, and the current version contains 19,003 sentence pairs for training and 1,000 sentence pairs for validation.
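As a rough illustration of how one such file pair might be read, here is a minimal Python sketch. It assumes each pair consists of two UTF-8 plain-text files with one sentence per line, aligned line by line; the page does not state the file format, so treat that as an assumption. The function name `read_parallel` and the file names in the usage note are hypothetical.

```python
def _read_lines(path):
    """Return the lines of a UTF-8 text file without trailing newlines."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


def read_parallel(src_path, tgt_path):
    """Read two line-aligned files and return a list of (source, target) pairs.

    Assumption: line i of the source file is the translation counterpart
    of line i of the target file. The dataset page does not confirm this.
    """
    src_lines = _read_lines(src_path)
    tgt_lines = _read_lines(tgt_path)
    # A length mismatch means the two files cannot be aligned line by line.
    if len(src_lines) != len(tgt_lines):
        raise ValueError(
            f"line count mismatch: {len(src_lines)} vs {len(tgt_lines)}"
        )
    return list(zip(src_lines, tgt_lines))
```

With hypothetical file names, `read_parallel("train.hi", "train.gondi")` would then be expected to return 19,003 pairs for the current training split, and the validation pair 1,000, if the files follow the assumed layout.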

You can download the dataset here. It is released under a Creative Commons license and the accompanying copyright terms.

Hindi-Gondi Parallel Corpus