U-Compare Cafetiere English Sentence Detector

The purpose of the tool is to detect sentence boundaries in English text. The tool is provided as a UIMA component, specifically as a Java archive (jar) file, which can be incorporated within any UIMA workflow. However, it is designed particularly for use in the U-Compare text mining platform (see sepa...

Resource Type: Tool / Service
Language: English

Blacklist Classifier

A language identifier for closely related languages.

Resource Type: Tool / Service
Languages: Bosnian, Croatian, Czech, Portuguese, Serbian, Slovak

LX-Service

LXService is a Web service consisting of a range of tools developed for the processing of Portuguese. They were selected because they have a number of features that are likely to make them more suitable for initial experimentation: They are fast, robust, the ling...

Resource Type: Tool / Service

DVPM-browser

DVPM-browser is a browser for the DVPM lexical database of medieval Portuguese.

Resource Type: Tool / Service

CIPM-browser

CIPM-browser is a browser for the CIPM corpus, a corpus of medieval Portuguese.

Resource Type: Tool / Service

UIMA/U-Compare GENIA Tagger

The GENIA tagger analyzes English sentences and outputs the base forms, part-of-speech tags, chunk tags, and named entity tags. The tagger is specifically tuned for biomedical text such as MEDLINE abstracts. The tool is provided as a UIMA component, which forms part of the in-built library of...

Resource Type: Tool / Service
Language: English

UIMA/U-Compare Enju parser

Syntactic parser for English. Outputs predicate-argument structures. Also outputs base forms for each token. The tool is provided as a UIMA component, which forms part of the in-built library of components provided with the U-Compare platform (see separate META-SHARE record) for building and...

Resource Type: Tool / Service
Language: English

UIMA/U-Compare GENIA Sentence Detector

The purpose of the tool is to detect sentence boundaries in English text. It is trained on the GENIA corpus of biomedical abstracts and so is particularly suitable for splitting sentences in biomedical texts. The tool is provided as a UIMA component, which forms part of the in-built library of co...

Resource Type: Tool / Service
Language: English

UIMA/U-Compare NEMine

The purpose of the tool is to identify gene and protein names in biomedical text. The tool is provided as a UIMA component, which forms part of the in-built library of components provided with the U-Compare platform for building and evaluating text mining workflows. The U-Compare Workbench pr...

Resource Type: Tool / Service
Language: English

UIMA/U-Compare GENIA Tokeniser (GENIA Tagger)

Tokenisation is one of the functionalities of the GENIA tagger, which additionally outputs the base forms, part-of-speech tags, chunk tags, and named entity tags. The tagger is specifically tuned for biomedical text such as MEDLINE abstracts. The tool is a UIMA component, which forms part of th...

Resource Type: Tool / Service
Language: English
