Search and Browse – PORTULAN CLARIN

Parallel Global Voices (Greek - English) (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Parallel Global Voices EL-EN is a parallel corpus genera...

Resource Type:	Corpus
Media Type:	Text
Languages:	English
	Greek, Modern (1453-)

QTLeap News Corpus

This corpus is a sample extracted from the corpus made available by the annual workshops/conferences on Statistical Machine Translation (WMT, see http://www.statmt.org/) from the News domain. To this end, 1104 English sentences and their corresponding human translations into Czech, German a...

Resource Type:	Corpus
Media Type:	Text
Languages:	Basque
	Bulgarian
	Czech
	Dutch; Flemish
	English
	German
	Portuguese
	Spanish; Castilian

Parallel corpora finely aligned (subsentencial granularity)

Text corpus for bilingual concordancing, single- and multi-word translation extraction, machine translation. Languages: cs-pt, de-pt, en-pt, es-pt, fr-pt, it-pt, and pt-sk. Size: 1 G per language (phrases aligned). Domain: Law and Health.

Resource Type:	Corpus
Media Type:	Text
Languages:	Czech
	English
	French
	German
	Italian
	Portuguese
	Slovak
	Spanish; Castilian

Carolina: General Corpus of Contemporary Brazilian Portuguese with provenance and typology information

Carolina is an open corpus for Linguistics and Artificial Intelligence with a robust volume of texts of varied typology in contemporary Brazilian Portuguese (1970-2021).

Resource Type:	Corpus
Media Type:	Text
Language:	Brazilian Portuguese
