Search and Browse – PORTULAN CLARIN

Affect in Tweets PT

This is a data set of Portuguese tweets, each labeled with the emotion it conveys. It was gathered using a methodology similar to the one used to build the Affect in Tweets data set for SemEval-2018 Task 1. The data set contains 11219 tweets, each labeled with an emotion (anger,...

Resource Type:	Corpus
Media Type:	Text
Language:	Portuguese

Albertina PT-BR

Albertina PT-* is a foundation, large language model for the Portuguese language. It is an encoder of the BERT family, based on the neural architecture Transformer and developed over the DeBERTa model, with most competitive performance for this language. It has different versions that were...

Resource Type:	Language Description
Media Type:	Text
Language:	Portuguese

Albertina PT-BR base

Albertina PT-BR base is a foundation, large language model for American Portuguese from Brazil. It is an encoder of the BERT family, based on the neural architecture Transformer and developed over the DeBERTa model, with most competitive performance for this language. It is distributed free of...

Resource Type:	Language Description
Media Type:	Text
Language:	Portuguese

Albertina PT-BR No-brWaC

Resource Type:	Language Description
Media Type:	Text
Language:	Portuguese

Albertina PT-PT

Albertina PT-* is a foundation, large language model for the Portuguese language. It is an encoder of the BERT family, based on the neural architecture Transformer and developed over the DeBERTa model, with most competitive performance for this language. It has different versions that were tra...

Resource Type:	Language Description
Media Type:	Text
Language:	Portuguese

Albertina PT-PT base

Albertina PT-PT base is a foundation, large language model for European Portuguese from Portugal. It is an encoder of the BERT family, based on the neural architecture Transformer and developed over the DeBERTa model, with most competitive performance for this language. It is distributed free ...

Resource Type:	Language Description
Media Type:	Text
Language:	Portuguese

Archivo dos Açores, dir. Ernesto do Canto, 1.ª série, Ponta Delgada, Vol. 1-12

The publication Arquivo dos Açores, established as a reference work for historical research on the Azores archipelago, comprises two series totaling 20 volumes. The first series of the Arquivo dos Açores, consisting of 15 volumes, ran from 1878 to 1959, with long interruptions...

Resource Type:	Corpus
Media Type:	Text
Language:	Portuguese

ArgMine Corpus

A corpus of opinion articles annotated with arguments, following a claim-premise model.

Resource Type:	Corpus
Media Type:	Text
Language:	Portuguese

Arquivo Dialetal CLUP - Áudio

Arquivo Dialetal CLUP - Áudio is an audio corpus of spontaneous speech, mainly from Northern Portugal.

Resource Type:	Corpus
Media Type:	Audio
Language:	Portuguese

Arquivo Dialetal CLUP - Orthographic and phonetic transcription

Arquivo Dialetal CLUP - ORTH is a speech corpus of approximately 40 000 tokens (utterances of spontaneous speech, mainly from Northern Portugal), with orthographic and phonetic transcription.

Resource Type:	Corpus
Media Type:	Audio
Language:	Portuguese
