PicName

PicName (see Castro et al., 1997, 1999; Gomes et al., 2006; Neves et al., 1995) is a picture-naming task that can be used to collect spontaneous speech samples and to measure articulation abilities in Portuguese-speaking children. It is an updated version of the Sounds-in-Words task included in t...

Resource Type:Lexical / Conceptual
Media Types:Text, Image
Language:Portuguese
Thesaurus for Portuguese - version 2.0

TeP 2.0 is a wordnet-like semantic resource for Brazilian Portuguese. It includes the words of the language and the synonymy and antonymy relations that hold among them.

Resource Type:Lexical / Conceptual
Media Type:Text
Language:Portuguese
Embeddings for Comparative Probing of Lexical Semantics Theories

Embeddings used in: Branco, António, João Rodrigues, Małgorzata Salawa, Ruben Branco and Chakaveh Saedi, 2020. Comparative Probing of Lexical Semantics Theories for Cognitive Plausibility and Technological Usefulness. In Proceedings of the International Conference on Computational Linguistics (C...

Resource Type:Lexical / Conceptual
Media Type:Text
Language:Portuguese
LX-UDParser

LX-UDParser is a dependency parser for Portuguese that adopts the Universal Dependencies (UD) framework, with an initial performance of 90.87 for UAS and 88.01 for LAS under a ten-fold cross-validation scheme. It is described in this article: António Branco, João Ricardo Silva, Luís Gomes and João Rodri...

Resource Type:Tool / Service
Language:Portuguese
Arquivo Dialetal CLUP - Orthographic and phonetic transcription

Arquivo Dialetal CLUP - ORTH is a speech corpus of approximately 40 000 tokens (utterances of spontaneous speech, mainly from Northern Portugal), with orthographic and phonetic transcription.

Resource Type:Corpus
Media Type:Audio
Language:Portuguese
Arquivo Dialetal CLUP - Áudio

Arquivo Dialetal CLUP - Áudio is an audio corpus of spontaneous speech, mainly from Northern Portugal.

Resource Type:Corpus
Media Type:Audio
Language:Portuguese
LX-SimLex-999

LX-SimLex-999 was created from SimLex-999 (Hill et al., 2015), which in turn was based on the University of South Florida Free Association Database (USF) (Nelson et al., 2014). SimLex-999 was created following strict guidelines. Both words in each pair have the same morphosyntactic category ...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
Albertina PT-PT

Albertina PT-* is a foundation large language model for the Portuguese language. It is an encoder of the BERT family, based on the Transformer neural architecture and developed over the DeBERTa model, with highly competitive performance for this language. It has different versions that were tra...

Resource Type:Language Description
Media Type:Text
Language:Portuguese
LexMan-POSTagger

LexMan-POSTagger is a morphological analyser that assigns morphological tags to all words. Size: verb lemmas: 12 995; noun and adjective lemmas: 38 180; adverb lemmas: 7 250; compound words: 35 201. Language: Portuguese.

Resource Type:Tool / Service
Language:Portuguese
Spoken Portuguese - Geographical and Social Varieties

This resource includes a spoken Portuguese corpus exemplifying the Portuguese spoken in Portugal, Brazil, Angola, Cape Verde, Guinea-Bissau, Mozambique, Sao Tome and Principe, Macao, Goa and East Timor, with aligned sound and orthographic transcription, collected among sociolinguistically diver...

Resource Type:Corpus
Media Types:Text, Audio
Language:Portuguese