Parallel corpora: a set of parallel texts in the Law and Health domains, with 1 G per language. Language pairs: cs-pt, de-pt, en-pt, es-pt, fr-pt, it-pt, and pt-sk.
Multilingual (CEF languages) corpus acquired from the website (https://eur-lex.europa.eu/legal-content) of the EU portal (9th July 2020). It contains 23 TMX files (EN-X, where X is a CEF language) with 475,931 translation unit pairs in total.
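For readers who want to check the translation unit count of a downloaded TMX file, the following is a minimal sketch in Python using only the standard library. The function name read_tmx_pairs, the inline sample data, and the default language codes are illustrative assumptions, not part of the resource itself. It assumes the files follow the usual TMX layout, with tu elements containing tuv elements that carry an xml:lang attribute.

```python
import io
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tmx_pairs(source, src_lang="en", tgt_lang="pt"):
    """Yield (source_text, target_text) for each <tu> holding both languages.

    `source` may be a file path or a binary file object. This helper is
    hypothetical and only illustrates one way to read TMX.
    """
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "tu":
            continue
        segments = {}
        for tuv in elem.iter("tuv"):
            # TMX 1.4 uses xml:lang; older files may use a plain lang attribute.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # Keep only the primary subtag, so "pt-PT" matches "pt".
                segments[lang.split("-")[0]] = "".join(seg.itertext()).strip()
        if src_lang in segments and tgt_lang in segments:
            yield segments[src_lang], segments[tgt_lang]
        elem.clear()  # release the finished unit to keep memory use low


# Tiny made-up sample standing in for a real EN-PT file.
SAMPLE = b"""<tmx version="1.4">
  <header srclang="en" datatype="plaintext" segtype="sentence"
          adminlang="en" creationtool="demo" creationtoolversion="1"
          o-tmf="demo"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Regulation</seg></tuv>
      <tuv xml:lang="pt"><seg>Regulamento</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Article 1</seg></tuv>
      <tuv xml:lang="pt-PT"><seg>Artigo 1</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Annex</seg></tuv>
      <tuv xml:lang="de"><seg>Anhang</seg></tuv>
    </tu>
  </body>
</tmx>"""

pairs = list(read_tmx_pairs(io.BytesIO(SAMPLE)))
print(len(pairs))  # 2: the EN-DE unit is skipped
```

Passing a file path instead of the in-memory sample works the same way. Streaming with iterparse is used here because files with hundreds of thousands of units can be large.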