The EUROPARL Corpus (Portuguese-English subpart of the parallel corpora), available at http://www.statmt.org/europarl/, was extracted from the proceedings of the European Parliament (Koehn, 2005). It contains transcriptions of sessions held from 1996 to 2011, totaling approximately 58...
CINTIL (Corpus Internacional do Português) is a linguistically interpreted corpus of Portuguese. At present it comprises 1 million annotated tokens, verified by human expert annotators. The annotation covers part-of-speech, lemma and inflection for open classes, multi-word expres...