The corpus consists of 1000 MEDLINE abstracts. It is a subset of the original GENIA POS & term corpus, selected using the three MeSH terms human, blood cells and transcription factors. In each sentence, three types of information are annotated: 1) biomedical terms are identified and assi...
The CINTIL-DependencyBank (Branco et al., 2011a) is a corpus of grammatical dependencies of Portuguese texts composed of 10,039 sentences and 110,166 tokens taken from different sources and domains: news (8,861 sentences; 101,430 tokens), novels (399 sentences; 3,082 tokens) (see 3.2.). In additi...