The EUROPARL Corpus (the Portuguese-English subpart of the parallel corpora), available at http://www.statmt.org/europarl/, was extracted from the proceedings of the European Parliament (Koehn, 2005). It contains transcriptions of sessions from 1996 to 2011, for a total of approximately 58...
The resource consists of a hierarchically structured system of data types, intended to be suitable for describing the input and output annotation types of a wide range of natural language processing applications that operate within the UIMA Framework. It is being developed in conjunc...