The EUROPARL Corpus (the Portuguese-English subpart of the parallel corpora), available at http://www.statmt.org/europarl/, was extracted from the proceedings of the European Parliament (Koehn, 2005). It contains transcriptions of sessions held from 1996 to 2011, for a total of approximately 58...
The HIMERA annotated corpus contains a set of published historical medical documents that have been manually annotated with semantic information that is relevant to the study of medical history and public health. Specifically, annotations correspond to seven different entity types and two differe...