DVPM-SynSem

DVPM-SynSem is a lexical database of Medieval Portuguese with syntactic and semantic information. It contains around 3000 verbs.

Resource Type:Lexical / Conceptual
Media Type:Text
Language:Portuguese
Grafone-LEX

Grafone-LEX is a lexical database for grapheme-to-phoneme conversion.

Resource Type:Lexical / Conceptual
Media Type:Text
Language:Portuguese
LX-WordSim-353

The LX-WordSim-353 was created from WordSim-353 (Agirre et al., 2009). As the name suggests, this data set contains 353 pairs of words. The two words in a pair may belong to different morphosyntactic categories. The data set consists of nouns, adjectives, verbs and named entities, and has no multiwords...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
LX-SimLex-999

The LX-SimLex-999 was created from SimLex-999 (Hill et al., 2015), which, in turn, was based on the University of South Florida Free Association Database (USF) (Nelson et al., 2014). SimLex-999 was created following strict guidelines. Both words in each pair have the same morphosyntactic category ...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
LX-Rare Word Similarity Dataset

The LX-Rare Word Similarity Dataset was created from the Stanford Rare Word (RW) Similarity data set (Luong et al., 2013). The list contains 2 034 words (1 017 word pairs). All the words were extracted from Wikipedia and from WordNet (Miller, 1995), a lexical database where the concepts are gro...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
PicName

PicName (see Castro et al., 1997, 1999; Gomes et al., 2006; Neves et al., 1995) is a picture-naming task that can be used to collect spontaneous speech samples and to measure articulation abilities in Portuguese-speaking children. It is an updated version of the Sounds-in-Words task included in t...

Resource Type:Lexical / Conceptual
Media Types:Text, Image
Language:Portuguese
Dicionário de Gentílicos e Topónimos

Dicionário de Gentílicos e Topónimos is a list of pairs of toponyms and demonyms. The toponym and demonym in each pair stand in a morphologically compositional relation. The list contains around 1500 such pairs and additionally provides information on the toponym referent (upper unit...

Resource Type:Lexical / Conceptual
Media Type:Text
Language:Portuguese
EMOTAIX.PT

EMOTAIX.PT (Costa, 2012) is a database of 3,983 emotional words (nouns, verbs, adjectives and adverbs) in European Portuguese based on the original EMOTAIX in French (Piolat & Bannour, 2009). Each word is classified into three hierarchical levels: Supra Category, Super Category and Basic Category...

Resource Type:Lexical / Conceptual
Media Type:Text
Language:Portuguese
Portulex

Portulex is a lexical database in European Portuguese that contains words from reading texts in children’s schoolbooks for reading and language instruction in Grades 1 to 4. It comprises a wordform and a lemma database. The wordform database consists of 17,062 inflected wordforms, and the lemma d...

Resource Type:Lexical / Conceptual
Media Type:Text
Language:Portuguese
Gervásio PT-PT base

Gervásio PT-* is a foundation large language model for the Portuguese language. It is a decoder of the GPT family, based on the Transformer neural architecture and developed over the Pythia model, with competitive performance for this language. It has different versions that were trained for ...

Resource Type:Language Description
Media Type:Text
Language:Portuguese
