Polish Ministry of Foreign Affairs Historical Dataset (Processed)
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu.
A collection of parallel Polish-English texts published by the Polish Ministry of Foreign Affairs. Translation segments were aligned manually at the sentence level and encoded in the XLIFF format.
There are three publications in the collection:
a) Nazi Concentration Camps (obozy2014.xlf, 398 segments, 14146 words),
b) A Guide to History of Poland (przewodnik_po_historii_polski.xlf, 828 segments, 25572 words) and
c) The Katyn Crime (zbrodnia_katyn.xlf, 1455 segments, 66396 words).
The total size of the collection is 106114 words in 2681 parallel segments.
The collection was converted into an English-Polish resource in TMX format containing 2223 translation units (TUs).
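As an illustration of how the TMX resource might be consumed, the following is a minimal sketch in Python that extracts English-Polish segment pairs from TMX content. The inline sample document, the function name read_pairs, and the language codes "en" and "pl" are assumptions made for this example; the actual file name and the exact xml:lang values in the distributed resource are not stated here and may differ.

```python
import xml.etree.ElementTree as ET

# The xml:lang attribute lives in the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Tiny stand-in for the real resource (hypothetical content, not taken from the dataset).
SAMPLE_TMX = """<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>History of Poland</seg></tuv>
      <tuv xml:lang="pl"><seg>Historia Polski</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Warsaw</seg></tuv>
      <tuv xml:lang="pl"><seg>Warszawa</seg></tuv>
    </tu>
  </body>
</tmx>
"""


def read_pairs(tmx_text, src="en", tgt="pl"):
    """Return a list of (source, target) segment pairs from TMX text."""
    root = ET.fromstring(tmx_text.encode("utf-8"))
    pairs = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            # Older TMX versions use a plain "lang" attribute; accept both.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # Keep only the primary language subtag, so "en-GB" maps to "en".
                segs[lang[:2]] = "".join(seg.itertext()).strip()
        if src in segs and tgt in segs:
            pairs.append((segs[src], segs[tgt]))
    return pairs


pairs = read_pairs(SAMPLE_TMX)
for en, pl in pairs:
    print(en, "|", pl)
```

To apply the same function to the real resource, read the file's text and pass it to read_pairs; the number of pairs returned could then be compared against the 2223 TUs reported above.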