ParaCrawl release 7 Portuguese-English

Portuguese-English parallel corpus from release 7 of the ParaCrawl project, specifically "Broader Web-Scale Provision of Parallel Corpora for European Languages". This version is filtered with BiCleaner with a threshold of 0.5. Data was crawled from the web following robots.txt, as is standard practice....

Resource Type:Corpus
Media Type:Text
Languages:English
Portuguese
Greek anti-corruption legislation and National Anti-Corruption Plan (greek-english) (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Greek laws, ratification of International Conventions ag...

Resource Type:Corpus
Media Type:Text
Languages:English
Greek, Modern (1453-)
Convention on the transfer of sentenced persons (English - Greek) (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Convention, additional protocol on the convention, recom...

Resource Type:Corpus
Media Type:Text
Languages:English
Greek, Modern (1453-)
The Coimisineir Teanga Bilingual Corpus of Reports and Press Releases (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Reports and Press Release data from the Language Commiss...

Resource Type:Corpus
Media Type:Text
Languages:English
Irish
The Gaois bilingual corpus of English-Irish legislation (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Bilingual corpus of English-Irish legislation provided b...

Resource Type:Corpus
Media Type:Text
Languages:English
Irish
Dataset of Nuanced Assertions on Controversial Issues (NAoCI dataset)

The Dataset of Nuanced Assertions on Controversial Issues (NAoCI) consists of over 2,000 assertions on sixteen different controversial issues. It has over 100,000 judgments of whether people agree or disagree with the assertions, and about 70,000 judgments indicating how strongly peopl...

Resource Type:Corpus
Media Type:Text
Language:English
EUIPO - list of goods and services Spanish and English (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. EUIPO list of goods and services. Format: TMX

Resource Type:Corpus
Media Type:Text
Languages:English
Spanish; Castilian
SIP Publications (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Publications from the Luxembourgish government edited by...

Resource Type:Corpus
Media Type:Text
Languages:English
French
German
Bilingual hr-en parallel corpus from the Journal of the Croatian Association of Civil Engineers website (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Contents of http://casopis-gradjevinar.hr were crawled, ...

Resource Type:Corpus
Media Type:Text
Languages:Croatian
English
General Romanian-English bilingual corpus (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Romanian – English corpus built from a Wikipedia dump.

Resource Type:Corpus
Media Type:Text
Languages:English
Romanian
