Portuguese-English parallel corpus from release 7 of the ParaCrawl project ("Broader Web-Scale Provision of Parallel Corpora for European Languages"). This version is filtered with BiCleaner with a threshold of 0.5. Data was crawled from the web following robots.txt, as is standard practice....
Royal inquiries of 1258 (primarily published in the Portugaliae Monumenta Historica).
CIPM-POS is a set of historical, religious, notarial, and literary texts in prose and verse, written in medieval Portuguese. It contains around 88,000 words.
CIPM is a set of historical, religious, notarial, and literary texts in prose and verse, written in medieval Portuguese. It has around 3.5 million words.
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Complete text of the Portuguese Constitution in Portugue...
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Legislation concerning Portuguese Parliament; three bili...
Bilingual (EN-PT) corpus acquired from the website (https://ec.europa.eu/*coronavirus-response) of the EU portal (20th May 2020).
Bilingual (EN-PT) corpus acquired from the website (https://eur-lex.europa.eu/legal-content) of the EU portal (9th July 2020).