Europarl-ST: A Multilingual Corpus for Speech Translation of Parliamentary Debates
Europarl-ST is a multilingual speech translation corpus of paired audio-text samples, built from recordings of European Parliament debates held between 2008 and 2012. It contains audio-transcription-translation triples from and into 9 European languages, giving a total of 72 translation directions (each of the 9 source languages paired with each of the 8 other languages as target). The corpus is published under a Creative Commons license and is freely available for download. Full details of the corpus are given in the article "Europarl-ST: A Multilingual Corpus For Speech Translation Of Parliamentary Debates" (Iranzo-Sánchez et al., ICASSP 2020).
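The count of 72 directions follows from taking every ordered pair of distinct languages among 9. The short sketch below illustrates that arithmetic only. The labels `lang1` to `lang9` and the helper `translation_directions` are hypothetical placeholders introduced for illustration, not the corpus's actual language codes or any official tooling.

```python
from itertools import permutations


def translation_directions(languages):
    """Return every ordered (source, target) pair of distinct languages."""
    return list(permutations(languages, 2))


# Placeholder labels (assumption): stand-ins for the 9 corpus languages.
languages = [f"lang{i}" for i in range(1, 10)]

directions = translation_directions(languages)

# n languages give n * (n - 1) directions; for n = 9 that is 72.
print(len(directions))
```

Each pair is ordered, so `("lang1", "lang2")` and `("lang2", "lang1")` count as two separate directions, which is why the total is 9 × 8 rather than the 36 unordered pairs.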
Development of speech translation, machine translation, and automatic speech recognition systems.
Automatic Speech Recognition (ASR), Machine Translation (MT), Speech Synthesis.