Search and Browse – PORTULAN CLARIN

AuCoPro - Splitting

The AuCoPro-Splitting dataset contains compounds annotated with their compound boundaries and linking morphemes. The dataset consists of two files, one for Afrikaans and one for Dutch. The annotation was performed according to annotation guidelines as described in Verhoeven, van Zaanen, van Huyss...

Resource Type:	Lexical / Conceptual
Media Type:	Text
Languages:	Afrikaans
Languages:	Dutch; Flemish

Port-AoA Words

Port-AoA Words (Cameirão & Vicente, 2010) is a lexical database containing 7 psycholinguistic characteristics (e.g. neighborhood density, written-word frequency, familiarity, imageability). It covers standard adult vocabulary.

Resource Type:	Lexical / Conceptual
Media Type:	Text
Language:	Portuguese

DVPM-EtyMor

DVPM-EtyMor is a lexical database of Medieval Portuguese with etymological and morphological information and textual exemplification. It contains around 3000 verbs.

Resource Type:	Lexical / Conceptual
Media Type:	Text
Language:	Portuguese

DVPM-SynSem

DVPM-SynSem is a lexical database with syntactic and semantic information in Medieval Portuguese. It contains around 3000 verbs.

Resource Type:	Lexical / Conceptual
Media Type:	Text
Language:	Portuguese

Grafone-LEX

Grafone-LEX is a lexical database for grapheme-to-phoneme conversion.

Resource Type:	Lexical / Conceptual
Media Type:	Text
Language:	Portuguese

Time-sensitive inventory of medical terminology

This inventory contains a set of terms that are relevant to the study of medical history. The inventory is organised as a set of "heading terms", belonging to one of seven different semantic categories, each of which is accompanied by a set of semantically-related terms. There are around 175,0...

Resource Type:	Lexical / Conceptual
Media Type:	Text
Language:	English

MWN.PT - WordNet of Portuguese

A wordnet is a lexical database. It groups synonymous words into sets, the synsets, which represent distinct concepts. These synsets form nodes in a network, which are interlinked through edges that correspond to semantic relations between those synsets. For instance, the hypernym relation, also ...

Resource Type:	Lexical / Conceptual
Media Type:	Text
Language:	Portuguese
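
The synset network described in the MWN.PT entry can be pictured with a small sketch. This is only an illustration under stated assumptions: the class `MiniWordnet`, the synset identifiers, and the toy Portuguese words are hypothetical and are not taken from MWN.PT or its distribution format.

```python
from collections import defaultdict


class MiniWordnet:
    """Toy wordnet: synsets are nodes, semantic relations are labeled edges."""

    def __init__(self):
        # synset id -> set of synonymous words (hypothetical ids)
        self.synsets = {}
        # (source synset id, relation name) -> set of target synset ids
        self.relations = defaultdict(set)

    def add_synset(self, synset_id, words):
        self.synsets[synset_id] = frozenset(words)

    def add_relation(self, source, relation, target):
        self.relations[(source, relation)].add(target)

    def hypernym_chain(self, synset_id):
        """Follow 'hypernym' edges upward, taking the first parent at each step."""
        chain = []
        seen = {synset_id}
        current = synset_id
        while True:
            parents = sorted(self.relations.get((current, "hypernym"), ()))
            if not parents or parents[0] in seen:
                break  # reached a root, or a cycle in the toy data
            current = parents[0]
            seen.add(current)
            chain.append(current)
        return chain


def build_toy_wordnet():
    """Build three invented synsets linked by the hypernym relation."""
    wn = MiniWordnet()
    wn.add_synset("s1", {"cão", "cachorro"})
    wn.add_synset("s2", {"mamífero"})
    wn.add_synset("s3", {"animal"})
    wn.add_relation("s1", "hypernym", "s2")
    wn.add_relation("s2", "hypernym", "s3")
    return wn


if __name__ == "__main__":
    wn = build_toy_wordnet()
    for sid in wn.hypernym_chain("s1"):
        print(sid, sorted(wn.synsets[sid]))
```

Keying the edges by `(synset, relation)` keeps each relation type separate, so other relations could be added without changing the lookup.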

ViPER verb lexical database

ViPER is a verb lexical database with more than 7,000 verb senses, along with their structural, distributional, and transformational properties. The verb senses are classified based on the main syntactic properties of their construction. Around 70 formal classes have been devised. For each verb sense, its...

Resource Type:	Lexical / Conceptual
Media Type:	Text
Language:	Portuguese

Arabic Tweets NER test set

Despite many recent papers on Arabic Named Entity Recognition (NER) in the news domain, little work has been done on microblog NER. NER on microblogs presents many complications such as informality of language, shortened named entities, brevity of expressions, and inconsistent capitalization (for...

Resource Type:	Lexical / Conceptual
Media Type:	Text
Language:	Arabic

LX-Abbreviations

The LX-Abbreviations resource is a collection of abbreviations of different types from European Portuguese, comprising 208 words. Each type of abbreviation is manually divided and annotated with grammatical categories, gender and number, and, finally, with the respective abbreviations.

Resource Type:	Lexical / Conceptual
Media Type:	Text
Language:	Portuguese
