Based on the MXPOST part-of-speech tagger and the UNITEX dictionaries for Portuguese, this tool produces the lemmas of the words in a text stored in a plain text file. The source code is also provided.
A named-entity recognition (NER) classifier that uses memory-based learning, trained on the CINTIL dataset, a corpus that contains part of the Corpus de Referência do Português Contemporâneo (CRPC, Reference Corpus of Contemporary Portuguese). https://portulanclarin.net/repository/browse/cintil-corpus-internacional-do-por...