COVID-19 Parallel Global Voices dataset. Bilingual (EN-PT)

Bilingual (EN-PT) COVID-19-related corpus acquired from the Global Voices website (https://globalvoices.org/) (28th April 2020).

Resource Type: Corpus
Media Type: Text
Languages: English, Portuguese
COVID-19 EU presscorner v2 dataset. Bilingual (EN-PT)

Bilingual (EN-PT) corpus acquired from the website (https://ec.europa.eu/commission/presscorner/) of the EU portal (8th July 2020).

Resource Type: Corpus
Media Type: Text
Languages: English, Portuguese
COVID-19 EC-EUROPA v1 dataset. Bilingual (EN-PT)

Bilingual (EN-PT) corpus acquired from the website (https://ec.europa.eu/*coronavirus-response) of the EU portal (20th May 2020).

Resource Type: Corpus
Media Type: Text
Languages: English, Portuguese
COVID-19 EUR-LEX dataset. Bilingual (EN-PT)

Bilingual (EN-PT) corpus acquired from the website (https://eur-lex.europa.eu/legal-content) of the EU portal (9th July 2020).

Resource Type: Corpus
Media Type: Text
Languages: English, Portuguese
COVID-19 - HEALTH Wikipedia dataset. Bilingual (EN-PT)

Bilingual (EN-PT) corpus acquired from Wikipedia in the health and COVID-19 domain (2nd May 2020).

Resource Type: Corpus
Media Type: Text
Languages: English, Portuguese
Portuguese-English bilingual corpus from the Portuguese Constitution (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Complete text of the Portuguese Constitution in Portugue...

Resource Type: Corpus
Media Type: Text
Languages: English, Portuguese
Portuguese RoBERTa language model

Hugging Face (PyTorch) pre-trained RoBERTa model for Portuguese, with 6 layers and 12 attention heads, totaling 68M parameters. Pre-training was done on 10 million Portuguese sentences and 10 million English sentences from the OSCAR corpus. Please cite: Santos, Rodrigo, João Rodrigues, Antóni...

Resource Type: Language Description
Media Type: Text
Languages: English, Portuguese
Romanian - English New Criminal Procedure Code (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. The New Civil Procedure Code in Romanian and English (bi...

Resource Type: Corpus
Media Type: Text
Languages: English, Romanian
Romanian - English news corpus (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Bilingual Romanian – English news corpus built from Sout...

Resource Type: Corpus
Media Type: Text
Languages: English, Romanian
Romanian - English literature corpus (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Bilingual Romanian – English literature corpus built fro...

Resource Type: Corpus
Media Type: Text
Languages: English, Romanian