The CINTIL-TreeBank (Branco et al., 2011) is a corpus of syntactic constituency trees for Portuguese texts. It comprises 10,039 sentences and 110,166 tokens drawn from different sources and domains: news (8,861 sentences; 101,430 tokens) and novels (399 sentences; 3,082 tokens). In addition, there are ...
The PTPARL Corpus contains approximately 975,806 running words of European Portuguese. It includes 1,076 texts consisting of adapted transcriptions of Portuguese Parliament sessions, which were made available in 2004.