Europarl-ST: A Multilingual Corpus for Speech Translation of Parliamentary Debates

Demonstrator|vrain

Description
Europarl-ST: A Multilingual Corpus for Speech Translation of Parliamentary Debates
Member
Address
Camino de Vera S/N
Province
Valencia

DEMONSTRATOR INFORMATION

DESCRIPTION

Europarl-ST is a multilingual speech translation corpus of paired audio-text samples, built from recordings of European Parliament debates held between 2008 and 2012. It contains audio-transcription-translation triples from and into 9 European languages; since each language can be paired with any of the other 8, this gives 9 × 8 = 72 translation directions. The corpus is published under a Creative Commons license and is freely available for download. Full details of the corpus are given in the article "Europarl-ST: A Multilingual Corpus For Speech Translation Of Parliamentary Debates" (Iranzo-Sánchez et al., ICASSP 2020).
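The count of 72 directions follows from taking every ordered (source, target) pair of distinct languages. The short sketch below illustrates that arithmetic only. The language codes are placeholders chosen for illustration, since this page does not list the nine languages, and the function name `translation_directions` is hypothetical rather than part of any corpus tooling.

```python
from itertools import permutations

# Placeholder codes for illustration only; this page does not name the nine languages.
LANGUAGES = ["en", "de", "fr", "es", "it", "pt", "pl", "ro", "nl"]


def translation_directions(languages):
    """Return every ordered (source, target) pair with source != target."""
    return list(permutations(languages, 2))


directions = translation_directions(LANGUAGES)

# Each of the 9 languages pairs with the other 8, so 9 * 8 ordered pairs.
print(len(directions))
```

Because direction matters, ("en", "de") and ("de", "en") are counted as two separate translation directions.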

POSSIBILITIES

Development of speech translation systems. Development of machine translation systems. Development of automatic speech recognition systems.

TECHNOLOGICAL ENABLER

Artificial Intelligence and Computing
Natural language processing and text mining

Automatic Speech Recognition (ASR), Machine Translation (MT), Speech Synthesis.