We are creating a large-scale, freely available semantic dictionary of Mandarin Chinese: the Chinese Open Wordnet, inspired by the Princeton WordNet and the Global WordNet Grid. All relations (hypernyms, meronyms, ...) come from Princeton WordNet 3.0. We have enriched the synsets with Chinese lex...
DVPM-EtyMor is a lexical database of around 3000 Medieval Portuguese verbs, with etymological and morphological information and textual exemplification.
DVPM-SynSem is a lexical database with syntactic and semantic information in Medieval Portuguese. It contains around 3000 verbs.
EMOTAIX.PT (Costa, 2012) is a database of 3,983 emotional words (nouns, verbs, adjectives and adverbs) in European Portuguese based on the original EMOTAIX in French (Piolat & Bannour, 2009). Each word is classified into three hierarchical levels: Supra Category, Super Category and Basic Category...
The lexicon of discourse markers for European Portuguese contains 252 pairs of discourse marker/rhetorical sense. The lexicon covers conjunctions, prepositions, adverbs, adverbial phrases and alternative lexicalizations with a connective function, as in the PDTB (Prasad et al., 2008; Prasad et al...
This lexicon includes multiword expressions (MWE) of European Portuguese extracted from a balanced 50.8-million-word written corpus, a subcorpus of the Reference Corpus of Contemporary Portuguese (CRPC). This corpus covers different genres and consists mainly of journalistic texts (59%), but it a...
This lexicon is a speech lexicon, exported from Crimsonwing’s text-to-speech (TTS) database into a .txt file. In its original form and together with the Maltese Speech Engine Diphone repository, it was used for building Crimsonwing’s text-to-speech system. The file is in txt format, with each ...
The resource consists of a Portuguese frequency lexicon based on a 16-million-word corpus of written and spoken texts from different genres. The lexicon contains 26,443 entries (lemmas) and 140
The Ontology for the area of Nanoscience and Nanotechnology (Ontologia para a área de Nanociência e Nanotecnologia) comprises 511 terms from this field of knowledge. It was extracted from a corpus collected from the Web, with a total of 2,570,792 words
The resource consists of 20 thousand entries with morphosyntactic and syntactic encoding, according to the PAROLE common encoding standards.