ParaCrawl release 7 Portuguese-English

Portuguese-English parallel corpus from release 7 of the ParaCrawl project ("Broader Web-Scale Provision of Parallel Corpora for European Languages"). This version was filtered with BiCleaner using a threshold of 0.5. The data was crawled from the web in compliance with robots.txt, as is standard practice....

Resource Type: Corpus
Media Type: Text
Languages: English, Portuguese
COVID-19 EC-EUROPA v1 dataset. Bilingual (EN-PT)

Bilingual (EN-PT) corpus acquired from the website (https://ec.europa.eu/*coronavirus-response) of the EU portal (20th May 2020).

Resource Type: Corpus
Media Type: Text
Languages: English, Portuguese
COVID-19 - HEALTH Wikipedia dataset. Bilingual (EN-PT)

Bilingual (EN-PT) corpus acquired from Wikipedia in the health and COVID-19 domain (2nd May 2020).

Resource Type: Corpus
Media Type: Text
Languages: English, Portuguese
COVID-19 ANTIBIOTIC dataset. Bilingual (EN-PT)

Bilingual (EN-PT) corpus acquired from the website https://antibiotic.ecdc.europa.eu/.

Resource Type: Corpus
Media Type: Text
Languages: English, Portuguese
CRPC-Quotations

Database of 2,253 citations extracted from the Corpus de Referência do Português Contemporâneo - CRPC (Reference Corpus of Contemporary Portuguese) and manually revised.
Format: tab-separated file
Fields:
- context number
- source file id
- citation

Resource Type: Corpus
Media Type: Text
Language: Portuguese
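
The CRPC-Quotations entry above describes a tab-separated file with three fields (context number, source file id, citation). A minimal reading sketch follows, under assumptions the entry does not state: that the file has no header row, that every field can be treated as plain text, and that rows with a different number of columns can be skipped. The names `Quotation` and `read_quotations` and the sample rows are hypothetical and serve only as illustration.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class Quotation(NamedTuple):
    """One row of the file; field names follow the catalog description."""
    context_number: str
    source_file_id: str
    citation: str


def read_quotations(handle: TextIO) -> Iterator[Quotation]:
    """Yield one Quotation per row that has exactly three tab-separated fields."""
    # QUOTE_NONE keeps quotation marks inside citations as literal characters.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 3:
            continue  # assumption: blank or malformed lines are skipped
        yield Quotation(*(field.strip() for field in row))


# Hypothetical sample rows, invented for illustration; not taken from the corpus.
sample = "1\tfile_001\tPrimeira citação de exemplo\n2\tfile_002\tSegunda citação\n"
quotations = list(read_quotations(io.StringIO(sample)))
```

For the real file, the same function could be given a handle opened with `open(path, encoding="utf-8", newline="")`, where the path and the encoding are again assumptions to be checked against the distributed data.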
ArgMine Corpus

A corpus of opinion articles annotated with arguments, following a claim-premise model.

Resource Type: Corpus
Media Type: Text
Language: Portuguese
Archivo dos Açores, dir. Ernesto do Canto, 1.ª série, Ponta Delgada, Vol. 1-12

The Arquivo dos Açores, an established reference work for historical research on the Azores archipelago, comprises two series totalling 20 volumes. The first series, made up of 15 volumes, was published between 1878 and 1959, with long interruptions...

Resource Type: Corpus
Media Type: Text
Language: Portuguese
