Bilingual (EN-PT) corpus acquired from the website (https://ec.europa.eu/commission/presscorner/) of the EU portal (8th July 2020).
The Portuguese Parliamentary Corpus is part of the Multilingual ParlaMint Corpus, a set of comparable corpora containing transcriptions of parliamentary debates of 29 European countries and autonomous regions. The Portuguese corpus (ParlaMint-PT) comprises transcripts of sessions in the time pe...
Gervásio PT-* is a large foundation language model for the Portuguese language. It is a decoder of the GPT family, based on the Transformer neural architecture and developed over the Pythia model, with competitive performance for this language. It has different versions that were trained for ...
Gervásio PT-* is a large foundation language model for the Portuguese language. It is a decoder of the GPT family, based on the Transformer neural architecture and developed over the Pythia model, with competitive performance for this language. It has different versions that were trained for ...
The resource consists of a Portuguese frequency lexicon based on a 16-million-word corpus of written and spoken texts from different genres. The lexicon contains 26,443 entries (lemmas) and 140
Grafone-Tool is a grapheme-to-phoneme conversion tool for European Portuguese. The converter handles Portuguese spelling both prior to and after the Orthographic Agreement of 1990.
VIDiom-PT is a European Portuguese corpus annotated for verbal idioms, designed to support NLP applications in idiom processing. The resulting corpus comprises 5,178 annotated instances covering 747 distinct verbal idioms. The annotation process was validated through an inter-annotator agreement ...