The resource consists of a hierarchically structured system of data types, intended to be suitable for describing the input and output annotation types of a wide range of natural language processing applications that operate within the UIMA Framework. It is being developed in conjunc...
This resource includes a spoken corpus of approximately 300,000 words, covering both formal (152,755 words) and informal (165,838 words) speech, with aligned sound and orthographic transcription and POS-tag information.
LX-Suite is a freely available online service for the shallow processing of Portuguese. It was developed and is maintained by the NLX-Natural Language and Speech Group at the University of Lisbon, Department of Informatics. LX-Suite is composed of a set of shallow processing tools: - LX Sente...