The resource is constituted by 20 thousand entries morpho-syntactically and syntactically encoded, accordingly to the parole common encoding standards.
The lexicon of discourse markers for European Portuguese contains 252 pairs of discourse marker/rhetorical sense. The lexicon covers conjunctions, prepositions, adverbs, adverbial phrases and alternative lexicalizations with a connective function, as in the PDTB (Prasad et al., 2008; Prasad et al...
The CINTIL-NamedEntities corpus, built upon the CINTIL International Corpus of Portuguese (Barreto et al., 2006), is composed of 30,493 sentences of written Portuguese with named entities manually disambiguated and annotated with links to appropriate pages in the Portuguese Dbpedia (Lehmann et al...
CINTIL-Corpus Internacional do Português is a linguistically interpreted corpus of Portuguese. At present it is composed of 1 Million annotated tokens, verified by human expert annotators. The annotation comprises information on part-of-speech, open classes lemma and inflection, multi-word expres...
The SIMPLE Portuguese Lexicon is constituted by 10,438 entries semantically encoded, accordingly to the parole common encoding standards.
FLY Corpus is a corpus composed by 2000 informal letters written in Portuguese, in the years spanning from 1900 to 1974, in the context of war, migration, imprisonment and exile. Each letter is in an XML file with two main parts: (a) the header, which contains metadata about the document (the ...
FLY Corpus is a corpus composed by 2000 informal letters written in Portuguese, in the years spanning from 1900 to 1974, in the context of war, migration, imprisonment and exile. Each letter is in an XML file with two main parts: (a) the header, which contains metadata about the document (the ...
The CRPC Discourse Bank is labeled for discourse relations (also referred to as rhetorical relations or coher- ence relations), such as cause and condition, that hold between two spans of text and contribute to ensure the overall cohesion and coherence of the text. The scheme follows the principl...
PS Corpus (Post-Scriptum)-PT is a corpus of 2215 informal mail letters written in Portuguese during the Modern Ages (from the XVIth century to the beginning of the XIXth century). Each letter is available as a semi-palaeographic transcription, a modernized transcription, and with part-of-speec...
Royal inquiries of 1258 (primarily published in the Portugaliae Monumenta Historica).