The U-Compare Workbench is a graphical user interface that operates on top of the U-Compare platform. The U-Compare platform allows users to build and evaluate NLP workflows. Workflows consist of one or more components, which include corpus readers and tools such as tokenisers, POS taggers, name...
Web service created by exporting a UIMA-based workflow from the U-Compare text mining system. Functionality: Identifies biological named entities and disambiguates them according to species, by assigning a species ID from the NCBI taxonomy. Also identifies sentences and tokens. Tools in workflow...
The GENIA tagger analyzes English sentences and outputs the base forms, part-of-speech tags, chunk tags, and named entity tags. The tagger is specifically tuned for biomedical text such as MEDLINE abstracts.
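As an illustration of how the four annotation layers listed above could be consumed downstream, here is a minimal sketch in Python. It assumes a tab-separated, one-token-per-line output with the columns word, base form, POS tag, chunk tag and named entity tag, and a blank line between sentences; that layout, the `Token` type, the `parse_tagger_output` helper and the `SAMPLE` text are all assumptions made for this example, not a definitive description of the tagger's interface.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    """One analysed token (hypothetical container for this sketch)."""
    word: str
    base: str
    pos: str
    chunk: str
    ne: str


def parse_tagger_output(text: str) -> List[List[Token]]:
    """Split tab-separated tagger output into sentences of Token records.

    Assumed layout: five tab-separated columns per line, blank line
    between sentences.
    """
    sentences: List[List[Token]] = []
    current: List[Token] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence, if any.
            if current:
                sentences.append(current)
                current = []
            continue
        fields = line.split("\t")
        if len(fields) != 5:
            raise ValueError(f"expected 5 columns, got {len(fields)}: {line!r}")
        current.append(Token(*fields))
    if current:
        sentences.append(current)
    return sentences


# Hand-written sample in the assumed layout (not real tagger output).
SAMPLE = (
    "Inhibition\tInhibition\tNN\tB-NP\tO\n"
    "of\tof\tIN\tB-PP\tO\n"
    "NF-kappaB\tNF-kappaB\tNN\tB-NP\tB-protein\n"
    "activation\tactivation\tNN\tI-NP\tO\n"
    "\n"
)

if __name__ == "__main__":
    for sentence in parse_tagger_output(SAMPLE):
        for tok in sentence:
            print(tok.word, tok.pos, tok.chunk, tok.ne)
```

Keeping each token as a named record makes it easy to filter by layer, for example collecting every token whose named entity tag is not `O`.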
This tool segments text into lexically relevant tokens, using whitespace as the separator. It was built to deal with Portuguese-specific issues concerning a few non-trivial cases that involve tokenization-ambiguous strings. Note that, in these examples, the | (vertical bar) symbol is use...
Web service created by exporting a UIMA-based workflow from the U-Compare text mining system. Functionality: Identifies biomedical named entities (genes and proteins) in plain text. Also identifies sentences. Tools in workflow: Cafetiere Sentence Splitter (University of Manchester), NEMine (Univ...
Treat is a toolkit for natural language processing and computational linguistics in Ruby. The Treat project aims to build a language- and algorithm-agnostic NLP framework for Ruby with support for tasks such as document retrieval, text chunking, segmentation and tokenization, natural language pa...
LX-NER is a freely available online service for the recognition of expressions for named entities in Portuguese. It was developed and is maintained by the NLX-Natural Language and Speech Group at the University of Lisbon, Department of Informatics. LX-NER takes a segment of Portuguese text an...
Conta-me Histórias [http://contamehistorias.pt] is a temporal summarization framework for news articles that allows users to explore and revisit events in the past. To select relevant stories from different time periods, we rely on YAKE! [http://yake.inesctec.pt], a keyword extraction algorithm devel...
MSTParser is a non-projective dependency parser (see McDonald et al., 2005a, 2006) that searches for maximum spanning trees over directed graphs. Models of dependency structure are based on large-margin discriminative training methods (see McDonald et al., 2005b). Projective parsing is also suppo...
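To make the idea of searching for a maximum spanning tree over a directed graph concrete, here is a minimal Python sketch. It is not MSTParser's algorithm or API: it uses exhaustive search over head assignments, which is only feasible for very short sentences, and the arc scores are made up for illustration. The names `is_tree`, `best_tree` and `SCORES` are hypothetical.

```python
from itertools import product
from typing import List, Optional, Tuple


def is_tree(heads: Tuple[int, ...]) -> bool:
    """heads[i] is the head of token i+1; 0 is an artificial root.

    Returns True if every token reaches the root without entering a cycle.
    """
    for start in range(1, len(heads) + 1):
        seen = set()
        node = start
        while node != 0:
            if node in seen:  # revisiting a node means a cycle (or self-loop)
                return False
            seen.add(node)
            node = heads[node - 1]
    return True


def best_tree(scores: List[List[float]]) -> Tuple[Optional[Tuple[int, ...]], float]:
    """Return the highest-scoring dependency tree by brute force.

    scores[h][d] is the score of the arc from head h to dependent d,
    with index 0 standing for the root. Crossing (non-projective) arcs
    are allowed, since no ordering constraint is imposed.
    """
    n = len(scores) - 1
    best_score, best_heads = float("-inf"), None
    for heads in product(range(n + 1), repeat=n):
        if not is_tree(heads):
            continue
        total = sum(scores[h][d] for d, h in enumerate(heads, start=1))
        if total > best_score:
            best_score, best_heads = total, heads
    return best_heads, best_score


# Made-up arc scores for the three tokens "John saw Mary" (indices 1..3).
SCORES = [
    [0, 1, 10, 1],  # arcs leaving the root
    [0, 0, 2, 1],   # arcs leaving "John"
    [0, 9, 0, 8],   # arcs leaving "saw"
    [0, 1, 2, 0],   # arcs leaving "Mary"
]

if __name__ == "__main__":
    print(best_tree(SCORES))
```

With these scores the best tree attaches "saw" to the root and both "John" and "Mary" to "saw". A practical parser would replace the exhaustive loop with an efficient maximum spanning tree algorithm and learn the arc scores from data.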
LX-Suite is a freely available online service for the shallow processing of Portuguese. It was developed and is maintained by the NLX-Natural Language and Speech Group at the University of Lisbon, Department of Informatics. LX-Suite is composed of a set of shallow processing tools: - LX Sente...