Translation memory of the official tourism portal of Spain, www.spain.info.
Terms from different sciences and industries: ecology, economy, law, sociology, medicine, tourism and computation.
The SIMPLE Portuguese Lexicon consists of 10,438 semantically encoded entries, in accordance with the PAROLE common encoding standards.
Hesita-POS is an annotated corpus of TV news.
This set of materials pertains to a study on the production of explicit pronouns, null pronouns, and repeated-NP anaphors in European Portuguese. A spreadsheet containing data from 73 participants (young adults), namely, count data for instances of the different types of anaphor that occurred in...
The PTPARL Corpus contains approximately 975,806 running words of European Portuguese. It includes 1076 texts consisting of adapted transcriptions of the Portuguese parliament sessions, which were made available in 2004.
This is a corpus for multi-document summarization for European Portuguese. It contains 80 topics, each of which has 10 documents, for a total of 800 documents. Each topic contains two human summaries. The summaries are compressive: they are the result of a compression of the sentences in the orig...
The DEEB Corpus contains the transcriptions of 1200 narrative texts written by pupils in their 4th, 6th and 9th year Portuguese Language exams in the public school system in Portugal.
RudriCo-POS is a part-of-speech disambiguation tool that applies 188 morphological disambiguation rules.
This dataset is a collection of dialogues extracted from the Portugal subreddit with RDET (Reddit Dataset Extraction Tool). It is composed of around 58,964,715 tokens in 218,550 dialogues.