LX-NER is a freely available online service for the recognition of expressions for named entities in Portuguese. It was developed and is maintained by the NLX-Natural Language and Speech Group at the University of Lisbon, Department of Informatics. LX-NER takes a segment of Portuguese text an...
LX-Suite is a freely available online service for the shallow processing of Portuguese. It was developed and is maintained by the NLX-Natural Language and Speech Group at the University of Lisbon, Department of Informatics. LX-Suite is composed of a set of shallow processing tools: - LX Sente...
LX-DepParser is a free online service for the syntactic analysis of Portuguese. It allows the automatic parsing of sentences in Portuguese in terms of the grammatical functions of their words. This service was developed and is maintained at the University of Lisbon by the NLX-Speech and Natural ...
LX-Proficiency is an online service for the quantitative analysis of texts along a range of linguistic metrics, and for the estimation of the proficiency level of texts. These quantitative metrics are meant to provide support in the classification of texts according to the proficiency levels i...
LX-Quantitative is an online service for the quantitative analysis of texts along a range of linguistic metrics. This service is based on automatic text processing tools. Hence, the results it returns may not always be entirely correct. Its high accuracy rate, though, allows it to provide a useful...
Reddit Dataset Extraction Tool (RDET) is a tool that uses the resources on Reddit comments and submissions available at 'pushshift.io' to generate new datasets based on any given subreddit.
LX-Sentence Splitter is a language processing tool for delimiting sentences in Portuguese. It was developed and is maintained by the NLX-Natural Language and Speech Group at the University of Lisbon, Department of Informatics. LX-Sentence Splitter marks sentence boundaries with <s>…</s>, and p...
LX Semantic Similarity is an online service for measuring the semantic similarity between words in Portuguese. This service uses the LX-DSemVectors, a distributional semantics model (a.k.a. word embeddings) of the Portuguese language. The model represents each word in its vocabulary by a vecto...
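As an illustration of how similarity between words can be derived from vector representations, here is a minimal sketch. It assumes cosine similarity as the measure, which is a common choice for word embeddings but is not confirmed by the description above. The vectors in `toy_vectors` are made up for the example and are not taken from LX-DSemVectors, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Made-up 3-dimensional vectors for illustration only (assumption);
# real embedding models typically use hundreds of dimensions.
toy_vectors = {
    "gato": [0.9, 0.1, 0.3],
    "felino": [0.8, 0.2, 0.4],
    "carro": [0.1, 0.9, 0.2],
}


def similarity(word_a, word_b, vectors=toy_vectors):
    """Look up both words and compare their vectors."""
    return cosine_similarity(vectors[word_a], vectors[word_b])


if __name__ == "__main__":
    print(similarity("gato", "felino"))
    print(similarity("gato", "carro"))
```

With these toy values, "gato" scores closer to "felino" than to "carro", which is the kind of ranking a distributional model is meant to produce.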
CINTIL Corpus Concordancer is a freely available online concordancing service to support the research usage of the CINTIL Corpus. This concordancer was developed and is maintained at the University of Lisbon by the NLX-Natural Language and Speech Group of the Department of Informatics, in coopera...
The Computational Linguistics Toolset is a set of tools for computational linguistics. It contains re-usable code for cleaning, splitting, refining, and taking samples from corpora (ICE, Penn, and a native one), for tagging them using the TnT-tagger, for doing permutation statistics on N-grams (u...