The U-Compare Workbench is a graphical user interface that operates on top of the U-Compare platform. The U-Compare platform allows users to build and evaluate NLP workflows. Workflows consist of one or more components, consisting of corpus readers and tools, such as tokenisers, POS taggers, name...
TinySVM is an implementation of Support Vector Machines (SVMs) (Vapnik, 1995; Vapnik, 1998) for the problem of pattern recognition.
Syntactic parser for English. Outputs dependency relations. Also outputs parts-of-speech for each token. The tool is provided as a UIMA component, specifically as Java archive (jar) file, which can be incorporated within any UIMA workflow. However, it is particularly designed use in the U-Com...
This is a UIMA wrapper for the OpenNLP Tokenizer tool. It assigns part-of-speech tags to tokens in English text. The tagset used in from the Penn Treebank). The tool forms part of the in-built library of components provided with the U-Compare platform (Kano et al., 2009; Kano et al., 2011; see se...
This tool performs tokenization of text and assigns all possible morphological analyses to each token. These analyses include the base form of the token, part-of-speech, information about number and gender. The morphological analyser is a module of Apertium machine translation system. The provide...
This tool assigns a part-of-speech tag and base form to each token in a text. It operates on text that has previously been tokenised and morphologically analysed. The POS tagger is a module of Apertium machine translation system. The provided tool can currently operate on a subset of the language...
This tool translates text from a source language into a target language. It operates on text that has previously been tokenised and morphologically analysed, and POS-tagged. Target language tokens are assigned POS tags and morphological analyses. The Apertium Translator is a module of Apertium ma...
MBT is a memory-based tagger-generator and tagger in one. The tagger-generator part can generate a sequence tagger on the basis of a training set of tagged sequences; the tagger part can tag new sequences. MBT can, for instance, be used to generate part-of-speech taggers or chunkers for natural l...
LX-UTagger is a POS tagger for Portuguese that adopts the Universal Part-of-Speech tagset (UPOS), related to the Universal Dependency framework, with an initial performance of 99.06% under a ten-fold cross validation scheme. It is described in this article: António Branco, João Ricardo Silv...
Web service created by exporting UIMA-based workflow from the U-Compare text mining system. Functionality: Identifies co-reference chains in plain text. Also identifies sentences, tokens with parts-of-speech and lemmas, and NP chunks Tools in workflow: TTL-Tokenizer (RACAI, Romania), TTL-Tagger...