The EUROPARL Corpus (the Portuguese-English subpart of the parallel corpora), available at http://www.statmt.org/europarl/, was extracted from the proceedings of the European Parliament (Koehn, 2005). It contains transcriptions of sessions dating from 1996 to 2011, with a total of approximately 58...
The corpus presented here is a collection of tutorials and scientific papers in the field of Information Technology, with 603 annotated Portuguese definitions. The texts were collected from the Web at the beginning of 2006 and are organised in 32 files of three different sub-...