UIMA/U-Compare Apertium POS Tagger

Handle:	https://hdl.handle.net/21.11129/0000-000B-D301-5 (persistent URL to this page)
URL:	http://nactem.ac.uk/ucompare/
URL:	http://www.apertium.org/

This tool assigns a part-of-speech tag and base form to each token in a text. It operates on text that has previously been tokenised and morphologically analysed. The POS tagger is a module of the Apertium machine translation system. The provided tool currently operates on a subset of the languages supported by Apertium, namely English, Spanish, Catalan, Galician, Portuguese, Romanian and Basque.
NOTE: The morphological analysis required before running the POS tagger MUST be carried out by the Apertium morphological analyser (which also performs tokenisation).
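
To illustrate why the analyser must run first, the sketch below parses analyser output in which each token carries one or more candidate analyses and the tagger's job is to keep a single one. It assumes the plain-text Apertium stream format (^surface/analysis1/analysis2$), which may differ from the annotations the UIMA component actually exchanges. The sample sentence, its tags, and the names parse_stream and pick_first are illustrative assumptions; pick_first is a placeholder and does not reproduce the Apertium tagger's statistical disambiguation. Escaped characters and unknown-word markers are not handled.

```python
import re
from dataclasses import dataclass

# One lexical unit in the (assumed) Apertium stream format:
#   ^surface/analysis1/analysis2$
LEXICAL_UNIT = re.compile(r"\^([^/$]*)((?:/[^/$]*)*)\$")
TAG = re.compile(r"<([^>]+)>")


@dataclass
class Analysis:
    lemma: str   # base form
    tags: list   # morphological tags, e.g. ["n", "pl"]


@dataclass
class Token:
    surface: str
    analyses: list  # candidate analyses produced by the morphological analyser


def parse_analysis(text):
    """Split 'book<n><pl>' into the lemma 'book' and the tags ['n', 'pl']."""
    lemma = text.split("<", 1)[0]
    return Analysis(lemma=lemma, tags=TAG.findall(text))


def parse_stream(stream):
    """Parse analyser output into tokens with their candidate analyses."""
    tokens = []
    for match in LEXICAL_UNIT.finditer(stream):
        surface, rest = match.group(1), match.group(2)
        candidates = [parse_analysis(a) for a in rest.split("/") if a]
        tokens.append(Token(surface=surface, analyses=candidates))
    return tokens


def pick_first(token):
    """Hypothetical stand-in for the tagger: keeps the first candidate.

    The real tagger chooses among candidates using context; this does not.
    """
    return token.analyses[0]


# Illustrative analyser output (assumed, not taken from the tool).
analysed = "^the/the<det><def><sp>$ ^books/book<n><pl>/book<vblex><pri><p3><sg>$"
tokens = parse_stream(analysed)

for token in tokens:
    chosen = pick_first(token)
    print(token.surface, len(token.analyses), chosen.lemma, chosen.tags)
```

In this sample, "books" has two candidate analyses (noun and verb), which is the kind of ambiguity a POS tagger resolves; without the analyser's output there would be no candidates to choose from.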

The tool is provided as a UIMA component, specifically as a Java archive (jar) file, which can be incorporated into any UIMA workflow. However, it is designed particularly for use in the U-Compare text mining platform (Kano et al., 2009; Kano et al., 2011; see separate META-SHARE record), since the types of annotations it produces are compliant with the U-Compare type system.

Distribution Licence

GPL

Licensors:

University of Manchester

School of Computer Science

University of Manchester

Distribution rights holders:

University of Manchester

School of Computer Science

University of Manchester

IPR Holder

University of Manchester

School of Computer Science

University of Manchester

Contact Person

Sophia Ananiadou

University of Manchester

Professor

School of Computer Science

Tool/Service

Tool

Language Dependent

Input

Media type: Text

Resource type: Corpus

Modality: Written Language

Language: English, Spanish, Portuguese, Catalan, Galician, Basque

Annotation type: Structural Annotation

Segmentation level: Word

Output

Media type: Text

Resource type: Corpus

Modality: Written Language

Language: English, Spanish, Portuguese, Galician, Catalan, Basque

Annotation type: Morphosyntactic Annotation - POS Tagging

Segmentation level: Word

Operation

Operating system: OS-independent

Metadata

Created: 06/25/2012

Last Updated: 02/15/2013

Metadata Creator

Paul Thompson

University of Manchester

Research Associate

School of Computer Science

Usage

Access tools

U-Compare Workbench

Associated resources

U-Compare Workbench

Foreseen Use - NLP Applications

Use NLP Specific: Morphological Analysis, POS Tagging

Actual Use - NLP Applications

Use NLP Specific: Morphological Analysis, POS Tagging

Documentation

Document Type: Other

Paul Thompson, UIMA/U-Compare Apertium POS Tagger, http://www.nactem.ac.uk/meta-net/Narratives/ApertiumPOS.pdf
