CINTIL-Corpus Internacional do Português is a linguistically interpreted corpus of Portuguese. It currently comprises 1 million annotated tokens, verified by human expert annotators. The annotation comprises information on part-of-speech, open classes lemma and inflection, multi-word expres...
The EUROPARL Corpus (the Portuguese-English subpart of the parallel corpora), available at http://www.statmt.org/europarl/, was extracted from the proceedings of the European Parliament (Koehn, 2005). It contains transcriptions of sessions dating from 1996 to 2011, in a total of approximately 58...