Portuguese-English bilingual corpus from Legislation concerning the Portuguese Parliament (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Legislation concerning Portuguese Parliament; three bili...

Resource Type: Corpus
Media Type: Text
Languages: English, Portuguese

COVID-19 EC-EUROPA v1 dataset. Bilingual (EN-PT)

Bilingual (EN-PT) corpus acquired from the website (https://ec.europa.eu/*coronavirus-response) of the EU portal (20th May 2020).

Resource Type: Corpus
Media Type: Text
Languages: English, Portuguese

COVID-19 - HEALTH Wikipedia dataset. Bilingual (EN-PT)

Bilingual (EN-PT) corpus acquired from Wikipedia in the health and COVID-19 domain (2nd May 2020).

Resource Type: Corpus
Media Type: Text
Languages: English, Portuguese

COVID-19 ANTIBIOTIC dataset. Bilingual (EN-PT)

Bilingual (EN-PT) corpus acquired from the website https://antibiotic.ecdc.europa.eu/.

Resource Type: Corpus
Media Type: Text
Languages: English, Portuguese

SpeakerID

SpeakerID is a corpus of 100 spoken sentences and pseudosentences in European Portuguese (PT) and Mandarin Chinese (CH) designed to enable research on speaker identity. The utterances were recorded by five male speakers of European Portuguese (Speakers A-E) and five male speakers of Mandarin Chi...

Resource Type: Corpus
Media Types: Text, Audio
Languages: Chinese, Portuguese

Anonymised ParaCrawl release 7 Portuguese-English

This corpus was run through BiRoamer (https://github.com/bitextor/biroamer) to anonymise the Portuguese-English parallel data from release 7 of the ParaCrawl project, specifically "Broader Web-Scale Provision of Parallel Corpora for European Languages". This version is filtered with BiCleaner with ...

Resource Type: Corpus
Media Type: Text
Languages: English, Portuguese

Bilingual corpus made out of PDF documents from the European Medicines Agency, (EMEA), https://www.ema.europa.eu, (February 2020) (EN-PT)

EN-PT bilingual corpus built from PDF documents of the European Medicines Agency (EMEA), https://www.ema.europa.eu (February 2020).

Resource Type: Corpus
Media Type: Text
Languages: English, Portuguese

COVID-19 EU presscorner v1 dataset. Bilingual (EN-PT)

Bilingual (EN-PT) corpus acquired from the website (https://ec.europa.eu/commission/presscorner/) of the EU portal (14th May 2020).

Resource Type: Corpus
Media Type: Text
Languages: English, Portuguese

COVID-19 EUROPARL dataset v1. Bilingual (EN-PT)

Bilingual (EN-PT) corpus acquired from the website (https://www.europarl.europa.eu/) of the European Parliament (25th April 2020).

Resource Type: Corpus
Media Type: Text
Languages: English, Portuguese

COVID-19 EUROPARL v2 dataset. Bilingual (EN-PT)

Bilingual (EN-PT) corpus acquired from the website (https://www.europarl.europa.eu/) of the European Parliament (9th May 2020).

Resource Type: Corpus
Media Type: Text
Languages: English, Portuguese