Search and Browse – PORTULAN CLARIN

Terms for Digital Marketing

Resource Type:	Lexical / Conceptual
Media Type:	Text
Languages:	Catalan; Valencian
	English
	French
	Galician
	German
	Italian
	Portuguese
	Spanish; Castilian

Termcat Fairs and Congresses

Terms for Fairs and Congresses

Resource Type:	Lexical / Conceptual
Media Type:	Text
Languages:	Catalan; Valencian
	English
	French
	German
	Italian
	Portuguese
	Spanish; Castilian

Termcat Exotic Wood

Terms of Exotic Wood

Resource Type:	Lexical / Conceptual
Media Type:	Text
Languages:	Catalan; Valencian
	English
	French
	German
	Italian
	Portuguese
	Spanish; Castilian

Termcat Economical Crisis

Economic Crisis terms

Resource Type:	Lexical / Conceptual
Media Type:	Text
Languages:	Catalan; Valencian
	English
	French
	German
	Italian
	Portuguese
	Spanish; Castilian

QTLeap News Corpus

This corpus is a sample extracted from the corpus made available by the annual workshops/conferences on Statistical Machine Translation (WMT, see http://www.statmt.org/) from the News domain. To this end, 1104 English sentences and their corresponding human translations into Czech, German a...

Resource Type:	Corpus
Media Type:	Text
Languages:	Basque
	Bulgarian
	Czech
	Dutch; Flemish
	English
	German
	Portuguese
	Spanish; Castilian

Parallel texts from Swedish Work environment Authority (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Parallel texts from the Swedish Work Environment authori...

Resource Type:	Corpus
Media Type:	Text
Languages:	Bulgarian
	Czech
	English
	Estonian
	Finnish
	French
	German
	Greek, Modern (1453-)
	Hungarian
	Italian
	Latvian
	Lithuanian
	Polish
	Romanian
	Spanish; Castilian
	Swedish

Parallel corpora finely aligned (subsentencial granularity)

Text corpus for bilingual concordancing, single- and multi-word translation extraction, machine translation. Languages: cs-pt, de-pt, en-pt, es-pt, fr-pt, it-pt, and pt-sk. Size: 1 G per language (phrases aligned). Domain: Law and Health.

Resource Type:	Corpus
Media Type:	Text
Languages:	Czech
	English
	French
	German
	Italian
	Portuguese
	Slovak
	Spanish; Castilian

XGLUE benchmark dataset

XGLUE is a new benchmark dataset to evaluate the performance of cross-lingual pre-trained models with respect to cross-lingual natural language understanding and generation. XGLUE is composed of 11 tasks spanning 19 languages. For each task, the training data is only available in English. This me...

Resource Type:	Corpus
Media Type:	Text
Languages:	Arabic
	Bulgarian
	Chinese
	Dutch; Flemish
	English
	French
	German
	Greek, Modern (1453-)
	Hindi
	Italian
	Polish
	Portuguese
	Russian
	Spanish; Castilian
	Swahili
	Thai
	Turkish
	Urdu
	Vietnamese

Parallel corpora

Parallel corpora is a set of parallel texts in the domain of Law and Health, with 1 G per language. Languages: cs-pt, de-pt, en-pt, es-pt, fr-pt, it-pt, and pt-sk.

Resource Type:	Corpus
Media Type:	Text
Languages:	Arabic
	Chinese
	Czech
	English
	French
	German
	Portuguese
	Spanish; Castilian

Multilingual corpus from the Publications Office of the EU on the medical domain v.2

277780 sentence pairs (in 23 EN-X language pairs in total) extracted from the Publications Office of the EU on the medical domain. These are sourced from laws, studies, EC announcements, etc. labelled with concepts like epidemiology, epidemic, disease surveillance, health control, public hygiene,...

Resource Type:	Corpus
Media Type:	Text
Languages:	Bulgarian
	Croatian
	Czech
	Danish
	Dutch; Flemish
	English
	Estonian
	Finnish
	French
	German
	Greek, Modern (1453-)
	Hungarian
	Irish
	Italian
	Latvian
	Lithuanian
	Maltese
	Polish
	Portuguese
	Romanian
	Slovak
	Slovenian
	Spanish; Castilian
	Swedish
