Text corpus for bilingual concordancing, single- and multi-word translation extraction, and machine translation. Languages: cs-pt, de-pt, en-pt, es-pt, fr-pt, it-pt, and pt-sk. Size: 1 G per language (phrase-aligned). Domains: Law and Health.
Porlex (Gomes & Castro, 2003) is a lexical database that includes written and phonetic transcriptions of standard adult vocabulary, along with 44 psycholinguistic characteristics (e.g. orthographic, phonological, phonetic, part-of-speech, and neighborhood characteristics). For each word it contains psychol...
Parallel corpora: a set of parallel texts in the domains of Law and Health, with 1 G per language. Languages: cs-pt, de-pt, en-pt, es-pt, fr-pt, it-pt, and pt-sk.
EMOTAIX.PT (Costa, 2012) is a database of 3,983 emotional words (nouns, verbs, adjectives and adverbs) in European Portuguese based on the original EMOTAIX in French (Piolat & Bannour, 2009). Each word is classified into three hierarchical levels: Supra Category, Super Category and Basic Category...
Port-AoA Words (Cameirão & Vicente, 2010) is a lexical database containing 7 psycholinguistic characteristics (e.g. neighborhood density, written-word frequency, familiarity, and imageability) for standard adult vocabulary.
The corpus consists of 1,000 MEDLINE abstracts. It is a subset of the original GENIA POS & term corpus, which was selected using the three MeSH terms human, blood cells and transcription factors. In each sentence, three types of information are annotated: 1) biomedical terms are identified and assi...
Hontology (H stands for hotel, hostal and hostel), available at http://ontolp.inf.pucrs.br/Recursos/downloads-Hontology.php, is a new, freely available multilingual ontology for the accommodation sector, containing 282 concepts categorized into 16 top-level concepts. The concepts of other voca...