This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. EUIPO list of goods and services format: TMX
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Parallel texts in pdf file format. Original in Swedish, ...
This corpus is a sample extracted from the corpus made available by the annual workshops/conferences on Statistical Machine Translation (WMT, see \url{http://www.statmt.org/}) from the News domain. To this end, 1104 English sentences and their corresponding human translations into Czech, German a...
Multilingual (CEF languages) corpus acquired from the website https://antibiotic.ecdc.europa.eu/ . It contains 20981 TUs (in total) for EN-X language pairs, where X is a CEF language.
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. The Polish-English parallel corpus is composed of three ...
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Polish-English parallel corpus from the website of the I...
Europarl QTLeap WSD/NED corpus This corpora is part of Deliverable 5.5 of the European Commission project QTLeap FP7-ICT-2013.4.1-610516 (http://qtleap.eu). The texts are sentences from the Europarl parallel corpus (Koehn, 2005). We selected the monolingual sentences from parallel corpora ...
142,397 Maltese texts from 10 genres. The file “corpus.zip” expands into a folder “corpus”, containing the file “tagged.zip”, which expands into the folder “cwb.final”. This folder contains the files: • filelist.txt • malti02.academic.txt • malti02.law.txt • malti02.literature.txt • malti...
The corpus contains the Laws of Malta in Maltese from the official government website. The unannotated raw text files were extracted from the pdf files that can be found on the website.
This corpus is part of the collection of the Wikipedia Dumps which was retrieved from wikipedia.org on April 8, 2010. It comes with two individual XML files, one containing the Wikipedia articles and another containing the metadata about it.