This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Bilingual collection of documents in the field of teleco...
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. A collection of information sheets from the Swedish Crim...
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Parallel Global Voices EL-EN is a parallel corpus genera...
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Parallel aligned corpus in TMX format built from the Rom...
This resource is part of Deliverable 5.7 of the European Commission project QTLeap FP7-ICT-2013.4.1-610516 (http://qtleap.eu). This gazetteer comprises multilingual lexicon entries used for the translation of specific IT domain expressions for Basque, Bulgarian, Czech, Dutch, Engli...
This corpus was run through BiRoamer (https://github.com/bitextor/biroamer) to anonymise the Portuguese-English parallel data from release 7 of the ParaCrawl project, specifically "Broader Web-Scale Provision of Parallel Corpora for European Languages". This version is filtered with BiCleaner with ...
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Croatian-English parallel corpus from the website of the...
The texts are sentences from the Europarl parallel corpus (Koehn, 2005). The texts contain the monolingual sentences from parallel corpora for the following pairs: Bulgarian-English, Czech-English, Portuguese-English and Spanish-English. The English corpus comprises the English side of th...
This corpus was created from documents in the translation memories of the Elhuyar Foundation (obtained via Eleka, a member of the Advisory Board of Potential Users).
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Dataset of various English-Slovak legal texts within age...