GREC

GREC is a semantically annotated corpus of 240 MEDLINE abstracts (167 on E. coli and 73 on the human species), intended for training IE systems and/or resources that extract events from biomedical literature.

Resource Type:Corpus
Media Type:Text
Language:English
SemLink

SemLink is a project that aims to link different lexical resources via a set of mappings. These mappings will make it possible to combine the information provided by these resources for tasks such as inferencing. In the current release, two mappings are ava...

Resource Type:Lexical / Conceptual
Media Type:Text
Language:English
BioLexicon

The BioLexicon is a large-scale, wide-coverage computational lexicon covering the biomedical domain. A large part of the lexicon is concerned with covering biomedical terms and their variants. Entries for domain-specific verbs include syntactic and semantic information. The lexicon includes entri...

Resource Type:Corpus
Media Type:Text
Language:English
GENIA POS & Term Corpus

A corpus of 2,000 MEDLINE abstracts, collected using the three MeSH terms human, blood cells and transcription factors. The corpus is available in three formats: 1) A text file containing part-of-speech (POS) annotation, based on the Penn Treebank format, 2) An XML file containing inline POS anno...

Resource Type:Corpus
Media Type:Text
Language:English
Laws of Malta - English

The corpus contains the Laws of Malta in English, taken from the official government website. The unannotated raw text files were extracted from the PDF files available on that website.

Resource Type:Corpus
Media Type:Text
Language:English
GENIA Event Corpus with meta-knowledge annotation

The corpus consists of 1,000 MEDLINE abstracts. It is a subset of the original GENIA POS & Term Corpus, which was selected using the three MeSH terms human, blood cells and transcription factors. In each sentence, three types of information are annotated: 1) biomedical terms are identified and assi...

Resource Type:Corpus
Media Type:Text
Language:English
English Acquis Communautaire

This is the English version of the Acquis Communautaire (AC), which is the total body of European Union (EU) law applicable in the EU Member States. It consists of selected texts between the 1950s and today.

Resource Type:Corpus
Media Type:Text
Language:English
CW Corpus

The Complex Word (CW) Corpus contains 731 sentences, each with one annotated CW. These simplifications were mined from Simple Wikipedia edit histories. Each entry gives an example of a sentence requiring simplification by means of a single lexical edit. This resource is primarily designed for t...

Resource Type:Corpus
Media Type:Text
Language:English
HIMERA Corpus

The HIMERA annotated corpus contains a set of published historical medical documents that have been manually annotated with semantic information that is relevant to the study of medical history and public health. Specifically, annotations correspond to seven different entity types and two differe...

Resource Type:Corpus
Media Type:Text
Language:English
Time-sensitive inventory of medical terminology

This inventory contains a set of terms that are relevant to the study of medical history. The inventory is organised as a set of "heading terms", belonging to one of seven different semantic categories, each of which is accompanied by a set of semantically-related terms. There are around 175,0...

Resource Type:Lexical / Conceptual
Media Type:Text
Language:English
