Gervásio PT-BR base

Gervásio PT-* is a large language foundation model for the Portuguese language. It is a decoder of the GPT family, based on the Transformer neural architecture and developed over the Pythia model, with competitive performance for this language. It has different versions that were trained for ...

Resource Type: Language Description
Media Type: Text
Language: Portuguese
Gervásio PT-PT base

Gervásio PT-* is a large language foundation model for the Portuguese language. It is a decoder of the GPT family, based on the Transformer neural architecture and developed over the Pythia model, with competitive performance for this language. It has different versions that were trained for ...

Resource Type: Language Description
Media Type: Text
Language: Portuguese
Albertina PT-BR No-brWaC

Albertina PT-* is a large language foundation model for the Portuguese language. It is an encoder of the BERT family, based on the Transformer neural architecture and developed over the DeBERTa model, with highly competitive performance for this language. It has different versions that were...

Resource Type: Language Description
Media Type: Text
Language: Portuguese
Corpus de Produções Escritas de Aprendentes de PL2 (PEAPL2)

A learner corpus of written texts in Portuguese as a non-native language, with three independent subcorpora:
- Portuguese as a Foreign Language: Subcorpus Português Língua Estrangeira (PEAPL2_PLE) http://teitok2.iltec.pt/peapl2-ple/index.php?action=home
- East Timorese Portuguese: Subcorpus T...

Resource Type: Corpus
Media Type: Text
Language: Portuguese
CINTIL-Definitions

The corpus presented here is a collection of several tutorials and scientific papers in the field of Information Technology, with 603 annotated definitions in Portuguese. The texts were collected from the Web at the beginning of 2006 and are organised in 32 files of three different sub-...

Resource Type: Corpus
Media Type: Text
Language: Portuguese
Biographies of Portuguese People

This is a set of 11,361 biographies of Portuguese people. The data were compiled by collecting the biographies from Wikipedia and converting them. Several filters were applied to remove entries that were mostly empty or contained non-applicable content. Format: JSON (conversion from HTML) ...

Resource Type: Corpus
Media Type: Text
Language: Portuguese
Chancelaria de D. Afonso III: documentos em português

The Portuguese-language documents of the Chancery of King Afonso III constitute the first significant set of texts in Portuguese (34 documents covering a period of 24 years: 1255-1279); it is only from 1279 onward, with King Dinis (1261-1325), that the systematic use of Portuguese begins ...

Resource Type: Corpus
Media Type: Text
Language: Portuguese
EMOTAIX.PT

EMOTAIX.PT (Costa, 2012) is a database of 3,983 emotional words (nouns, verbs, adjectives and adverbs) in European Portuguese based on the original EMOTAIX in French (Piolat & Bannour, 2009). Each word is classified into three hierarchical levels: Supra Category, Super Category and Basic Category...

Resource Type: Lexical / Conceptual
Media Type: Text
Language: Portuguese
EmoVoicePort

EmoVoicePort, Emotional Vocalization Corpus (see Lima, Castro, & Scott, 2013) is a validated set of nonverbal vocalizations that portray four positive emotions (achievement/triumph, amusement, sensual pleasure, relief) and four negative ones (anger, disgust, fear, sadness). The vocalizations (n =...

Resource Type: Corpus
Media Type: Audio
Language: Portuguese
EmoProsodyPort

EmoProsodyPort (see Castro & Lima, 2010) is a speech database with 368 short sentences and pseudosentences with neutral emotional content. It includes acoustic measurements and behavioral data.

Resource Type: Corpus
Media Type: Audio
Language: Portuguese
