Part-of-speech tagger tuned to biomedical text, provided as a web service.
Technical Description: http://qtleap.eu/wp-content/uploads/2015/05/Pilot1_technical_description.pdf http://qtleap.eu/wp-content/uploads/2015/05/TechnicalDescriptionPilot2_D2.7.pdf http://qtleap.eu/wp-content/uploads/2016/11/TechnicalDescriptionPilot3_D2.10.pdf
SENTER is a SENtence splitTER for Portuguese.
SenseClusters is a package of (mostly) Perl programs that allows a user to cluster similar contexts together using unsupervised knowledge-lean methods.
RudriCo-TOK is a tokenizer that splits contractions using 178 de-contraction rules.
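As a rough illustration of rule-based de-contraction, the sketch below splits contracted tokens using a small lookup table. It is not RudriCo-TOK's code or rule format: the function name `split_contractions`, the `RULES` table, and the three Portuguese-style rules in it are hypothetical and chosen only for the example.

```python
# Hypothetical de-contraction rules: contracted form -> expanded tokens.
# These three entries are illustrative only, not the tool's 178 rules.
RULES = {
    "do": ["de", "o"],
    "na": ["em", "a"],
    "pelo": ["por", "o"],
}


def split_contractions(tokens, rules=RULES):
    """Replace each contracted token with its expanded token sequence."""
    output = []
    for token in tokens:
        # Look up the lowercased token; unknown tokens pass through unchanged.
        output.extend(rules.get(token.lower(), [token]))
    return output


result = split_contractions("gosto do livro".split())
print(result)  # ['gosto', 'de', 'o', 'livro']
```

A plain lookup like this ignores context, so a form that is ambiguous between a contraction and an ordinary word would always be split; a real rule set would presumably need conditions on the surrounding tokens.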
RudriCo-POS is a part-of-speech disambiguation tool that applies 188 morphological disambiguation rules.
Reddit Dataset Extraction Tool (RDET) is a tool that uses the Reddit comment and submission resources available at 'pushshift.io' to generate new datasets for any given subreddit.
Technical Description: http://qtleap.eu/wp-content/uploads/2015/05/Pilot1_technical_description.pdf http://qtleap.eu/wp-content/uploads/2015/05/TechnicalDescriptionPilot2_D2.7.pdf http://qtleap.eu/wp-content/uploads/2016/11/TechnicalDescriptionPilot3_D2.10.pdf
The OntoLP system is a plug-in for the Protégé ontology construction environment. The plug-in is intended to assist ontology engineers working with Portuguese during the initial steps of ontology construction: extraction of terms which are candidates fo...
MSTParser is a non-projective dependency parser (see McDonald et al., 2005a, 2006) that searches for maximum spanning trees over directed graphs. Models of dependency structure are based on large-margin discriminative training methods (see McDonald et al., 2005b). Projective parsing is also suppo...
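To make the idea of searching for a maximum spanning tree over a directed graph concrete, here is a minimal sketch of the Chu-Liu/Edmonds algorithm, a standard method for that problem. It is an illustration under stated assumptions, not MSTParser's implementation: the function names, the dictionary-based graph representation, and the toy edge scores are all hypothetical, and the sketch assumes integer node ids with every non-root node reachable by at least one incoming edge.

```python
def find_cycle(head):
    """Return a list of nodes forming a cycle in the head map, or None."""
    for start in head:
        path, seen, node = [], set(), start
        while node in head and node not in seen:
            seen.add(node)
            path.append(node)
            node = head[node]
        if node in seen:
            return path[path.index(node):]
    return None


def chu_liu_edmonds(scores, root=0):
    """scores maps (head, dependent) -> score; returns {dependent: head}."""
    nodes = {n for edge in scores for n in edge}
    # Step 1: greedily pick the best-scoring incoming edge for each node.
    best = {}
    for d in nodes - {root}:
        heads = [h for h in nodes if h != d and (h, d) in scores]
        best[d] = max(heads, key=lambda h: scores[(h, d)])
    cycle = find_cycle(best)
    if cycle is None:
        return best
    # Step 2: contract the cycle into a fresh node c and rescore its edges.
    in_cycle, c = set(cycle), max(nodes) + 1
    new_scores, enter, leave = {}, {}, {}
    for (h, d), s in scores.items():
        if d == root or h == d or (h in in_cycle and d in in_cycle):
            continue
        if d in in_cycle:
            # Entering the cycle at d means giving up d's edge inside the cycle.
            adj = s - scores[(best[d], d)]
            if (h, c) not in new_scores or adj > new_scores[(h, c)]:
                new_scores[(h, c)], enter[h] = adj, d
        elif h in in_cycle:
            if (c, d) not in new_scores or s > new_scores[(c, d)]:
                new_scores[(c, d)], leave[d] = s, h
        else:
            new_scores[(h, d)] = s
    # Step 3: solve the contracted graph, then expand c back into the cycle.
    result = {}
    for d, h in chu_liu_edmonds(new_scores, root).items():
        if d == c:
            entry = enter[h]
            result[entry] = h
            for n in cycle:
                if n != entry:
                    result[n] = best[n]
        elif h == c:
            result[d] = leave[d]
        else:
            result[d] = h
    return result


# Toy scores (hypothetical): node 0 is the artificial root.
scores = {
    (0, 1): 5, (0, 2): 1, (0, 3): 1,
    (1, 2): 11, (2, 1): 10,
    (1, 3): 4, (2, 3): 8,
    (3, 1): 1, (3, 2): 1,
}
heads = chu_liu_edmonds(scores)
print(sorted(heads.items()))  # [(1, 0), (2, 1), (3, 2)]
```

In the toy graph the greedy choice creates a cycle between nodes 1 and 2; contracting it and rescoring the incoming edges yields the tree 0 → 1 → 2 → 3 with total score 24. Because any node may attach to any other regardless of word order, the resulting tree can be non-projective.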