PS corpus (Post-Scriptum) - treebank is a treebank of 586 informal letters written in Portuguese and Spanish during the Early Modern period (from the 16th century to the beginning of the 19th century). This treebank is a syntactically annotated subset of the Portuguese "PS corpus (Post-S...
This set of materials resulted from a study on the processing of explicit pronouns in European Portuguese. A spreadsheet containing data from 75 participants (young adults), namely, per-word reading times and accuracy data on comprehension questions, is provided. Complementary materials (Read Fir...
Grafone-Tool is a grapheme-to-phoneme conversion tool for European Portuguese. The converter handles Portuguese spelling both before and after the Orthographic Agreement of 1990.
Hesita-POS is an annotated corpus of TV news.
CINTIL-USuite is a corpus of Portuguese containing around 1 million tokens annotated with lemmas, Universal Part-of-Speech (UPOS) tags and Universal feature bundles from the Universal Dependencies framework. It is described in this article: António Branc...
The DEEB Corpus contains the transcriptions of 1200 narrative texts written by pupils in their 4th, 6th and 9th year Portuguese Language exams in the public school system in Portugal.
Corpus with the transcriptions of syllogistic reasoning protocols. Written transcriptions: verbal data (30 hours) elicited during an experiment on syllogistic reasoning (each of 27 participants × the 64 syllogistic problems): thinking-aloud task; reflexive conversation. Performance data: La...
An NER classifier based on memory-based learning, trained on the CINTIL dataset, a corpus that contains part of the Corpus de Referência do Português Contemporâneo - CRPC (Reference Corpus of Contemporary Portuguese). https://portulanclarin.net/repository/browse/cintil-corpus-internacional-do-por...