Search and Browse – PORTULAN CLARIN

MARv4

MARv-POS is a part-of-speech tagger tool (probabilistic POS annotation module). MARv4's architecture comprises two submodules: a module of linguistically oriented disambiguation rules and a probabilistic disambiguation module. The linguistically oriented module is no longer used in the STRING chain be...

Resource Type:	Tool / Service
Language:	Portuguese

RudriCo-TOK

RudriCo-TOK is a tokenizer tool that splits contractions. De-contraction rules: 178.

Resource Type:	Tool / Service
Language:	Portuguese

Lexicon of discourse markers for European Portuguese

The lexicon of discourse markers for European Portuguese contains 252 pairs of discourse marker/rhetorical sense. The lexicon covers conjunctions, prepositions, adverbs, adverbial phrases and alternative lexicalizations with a connective function, as in the PDTB (Prasad et al., 2008; Prasad et al...

Resource Type:	Lexical / Conceptual
Media Type:	Text
Language:	Portuguese

U-Compare Tokenisation service

Web service created by exporting a UIMA-based workflow from the U-Compare text mining system. Functionality: Identifies sentences and tokens in plain text. Tools in workflow: Freeling sentence splitter web service (service provided by the PANACEA project), LX-Tokenizer (web service provided by th...

Resource Type:	Tool / Service
Language:	Portuguese

RudriCo-POS

RudriCo-POS is a part-of-speech disambiguation tool that applies 188 morphological disambiguation rules.

Resource Type:	Tool / Service
Language:	Portuguese

MARv-DISAMB

MARv-DISAMB is a part-of-speech disambiguation tool (probabilistic disambiguation module).

Resource Type:	Tool / Service
Language:	Portuguese

Grafone-LEX

Grafone-LEX is a lexical database for grapheme-to-phoneme conversion.

Resource Type:	Lexical / Conceptual
Media Type:	Text
Language:	Portuguese

Maltese Wordlist

Wordlist for spell-checking

Resource Type:	Lexical / Conceptual
Media Type:	Text
Language:	Maltese

U-Compare Cafetiere English Sentence Detector

The purpose of the tool is to detect sentence boundaries in English text. The tool is provided as a UIMA component, specifically as a Java archive (jar) file, which can be incorporated within any UIMA workflow. However, it is particularly designed for use in the U-Compare text mining platform (see sepa...

Resource Type:	Tool / Service
Language:	English

Termcat Digital Marketing

Terms for Digital Marketing

Resource Type:	Lexical / Conceptual
Media Type:	Text
Languages:	Catalan; Valencian
	English
	French
	Galician
	German
	Italian
	Portuguese
	Spanish; Castilian
