Letter of rights for persons arrested and or detained (Processed)  

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Collection of translation units (1906 in total) in 21 la...

Resource Type: Corpus
Media Type: Text
Languages: Bulgarian
English
French
Greek, Modern (1453-)
Latvian
Polish
Romanian

Parallel texts from Swedish Labour market agency. Part 2 (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Same as part 1, but with the Readme-file. (Processed)

Resource Type: Corpus
Media Type: Text
Languages: English
Finnish
French
German
Polish
Romanian
Spanish; Castilian
Swedish

Parallel texts from Swedish National Food Agency (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Parallel texts in PDF file format. Original in Swedish, ...

Resource Type: Corpus
Media Type: Text
Languages: English
Finnish
French
Polish
Spanish; Castilian
Swedish

Parallel corpora finely aligned (subsentencial granularity)

Text corpus for bilingual concordancing, single- and multi-word translation extraction, and machine translation. Languages: cs-pt, de-pt, en-pt, es-pt, fr-pt, it-pt, and pt-sk. Size: 1 G per language (phrases aligned). Domain: Law and Health.

Resource Type: Corpus
Media Type: Text
Languages: Czech
English
French
German
Italian
Portuguese
Slovak
Spanish; Castilian

XGLUE benchmark dataset

XGLUE is a new benchmark dataset for evaluating the performance of cross-lingual pre-trained models on cross-lingual natural language understanding and generation. XGLUE is composed of 11 tasks spanning 19 languages. For each task, the training data is only available in English. This me...

Resource Type: Corpus
Media Type: Text
Languages: Arabic
Bulgarian
Chinese
Dutch; Flemish
English
French
German
Greek, Modern (1453-)
Hindi
Italian
Polish
Portuguese
Russian
Spanish; Castilian
Swahili
Thai
Turkish
Urdu
Vietnamese

Termcat Exotic Wood

Terms of Exotic Wood

Resource Type: Lexical / Conceptual
Media Type: Text
Languages: Catalan; Valencian
English
French
German
Italian
Portuguese
Spanish; Castilian

COVID-19 ANTIBIOTIC dataset. Multilingual (CEF languages)

Multilingual (CEF languages) corpus acquired from the website https://antibiotic.ecdc.europa.eu/. It contains 20981 translation units (TUs) in total for EN-X language pairs, where X is a CEF language.

Resource Type: Corpus
Media Type: Text
Languages: Bokmål, Norwegian; Norwegian Bokmål
Bulgarian
Croatian
Czech
Danish
Dutch; Flemish
English
Estonian
Finnish
French
German
Greek, Modern (1453-)
Hungarian
Icelandic
Irish
Italian
Latvian
Lithuanian
Maltese
Moldavian; Moldovan
Polish
Portuguese
Romanian
Slovak
Slovenian
Spanish; Castilian
Swedish

Termcat Neoloteca

Terms from mixed fields that have (more or less) recently been accepted and normalised by Termcat.

Resource Type: Lexical / Conceptual
Media Type: Text
Languages: Basque
Catalan; Valencian
English
French
Galician
German
Italian
Latin
Portuguese
Spanish; Castilian

Termcat Research Thesaurus

Terms of Research Thesaurus

Resource Type: Lexical / Conceptual
Media Type: Text
Languages: Catalan; Valencian
English
French
German
Italian
Latin
Portuguese
Spanish; Castilian

Parallel corpora

A set of parallel texts in the domains of Law and Health, with 1 G per language. Languages: cs-pt, de-pt, en-pt, es-pt, fr-pt, it-pt, and pt-sk.

Resource Type: Corpus
Media Type: Text
Languages: Arabic
Chinese
Czech
English
French
German
Portuguese
Spanish; Castilian
