HIMERA Corpus

The HIMERA annotated corpus contains a set of published historical medical documents that have been manually annotated with semantic information that is relevant to the study of medical history and public health. Specifically, annotations correspond to seven different entity types and two differe...

Resource Type:Corpus
Media Type:Text
Language:English
U-Compare Type system

The resource consists of a hierarchically structured system of data types, which is intended to be suitable for describing the input and output annotation types of a wide range of natural language processing applications that operate within the UIMA Framework. It is being developed in conjunc...

Resource Type:Language Description
Media Type:Text
Language:English
CW Corpus

The Complex Word (CW) Corpus contains 731 sentences, each with one annotated CW. These simplifications were mined from Simple Wikipedia edit histories. Each entry gives an example of a sentence requiring simplification by means of a single lexical edit. This resource is primarily designed for t...

Resource Type:Corpus
Media Type:Text
Language:English
PhenoCHF Corpus

PhenoCHF is an annotated corpus consisting of documents belonging to two different text types (i.e., narrative reports from electronic health records (EHRs) and literature articles). It is manually annotated by medical doctors with detailed information relating to mentions of phenotype concepts a...

Resource Type:Corpus
Media Type:Text
Language:English
U-Compare Syntactic Parsing Service

Web service created by exporting a UIMA-based workflow from the U-Compare text mining system. Functionality: Carries out syntactic parsing on plain text. Tools in workflow: Cafetiere Sentence Splitter (University of Manchester), OpenNLP Tokenizer (Apache), STEPP Tagger (University of Manchester), ...

Resource Type:Tool / Service
Language:English
UIMA/U-Compare Enju parser

Syntactic parser for English. Outputs predicate-argument structures. Also outputs base forms for each token. The tool is provided as a UIMA component, which forms part of the in-built library of components provided with the U-Compare platform (see separate META-SHARE record) for building and...

Resource Type:Tool / Service
Language:English
U-Compare Species Disambiguation Service

Web service created by exporting a UIMA-based workflow from the U-Compare text mining system. Functionality: Identifies biological named entities and disambiguates them according to species, by assigning a species ID from the NCBI taxonomy. Also identifies sentences and tokens. Tools in workflow...

Resource Type:Tool / Service
Language:English
GENIA Tagger

The GENIA tagger analyzes English sentences and outputs the base forms, part-of-speech tags, chunk tags, and named entity tags. The tagger is specifically tuned for biomedical text such as MEDLINE abstracts.

Resource Type:Tool / Service
Language:English
U-Compare Named Entity Recognition service

Web service created by exporting a UIMA-based workflow from the U-Compare text mining system. Functionality: Identifies biomedical named entities (genes and proteins) in plain text. Also identifies sentences. Tools in workflow: Cafetiere Sentence Splitter (University of Manchester), NEMine (Univ...

Resource Type:Tool / Service
Language:English
SemLink

SemLink is a project that aims to link different lexical resources via a set of mappings. These mappings will make it possible to combine the information provided by these resources for tasks such as inferencing. In the current release, two mappings are ava...

Resource Type:Lexical / Conceptual
Media Type:Text
Language:English
