LX-NER is a freely available online service for the recognition of expressions for named entities in Portuguese. It was developed and is maintained by the NLX-Natural Language and Speech Group at the University of Lisbon, Department of Informatics.
LX-NER takes a segment of Portuguese text and identifies, circumscribes and classifies the expressions for named entities it contains. Furthermore, each named entity receives a standard representation. It handles the following types of expressions:
- Number-based expressions: numbers, measures, time, addresses
- Name-based expressions: persons, organizations, locations, events, works, miscellaneous
The number-based component is built upon handcrafted regular expressions. It was developed and evaluated against a manually constructed test suite of over 300 examples, on which it scored 85.19% precision and 85.91% recall.
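To give a feel for what a recognizer built on handcrafted regular expressions looks like, here is a minimal sketch in Python. The patterns, the labels (`MEASURE`, `TIME`, `NUMBER`) and the function name `tag_number_expressions` are all hypothetical and chosen for illustration; they are not LX-NER's actual rules, which are not described here.

```python
import re

# Hypothetical patterns, for illustration only; not LX-NER's actual rules.
# Order matters: more specific patterns come first so they win on overlap.
PATTERNS = [
    ("MEASURE", re.compile(r"\b\d+(?:,\d+)?\s?(?:km|kg|cm|m|g)\b")),
    ("TIME", re.compile(r"\b\d{1,2}h\d{2}\b")),
    ("NUMBER", re.compile(r"\b\d+(?:,\d+)?\b")),
]


def tag_number_expressions(text):
    """Return sorted (start, end, label, surface) tuples for non-overlapping matches."""
    spans = []
    for label, pattern in PATTERNS:
        for m in pattern.finditer(text):
            # Keep a match only if it does not overlap a span found earlier.
            if all(m.end() <= s or m.start() >= e for s, e, _, _ in spans):
                spans.append((m.start(), m.end(), label, m.group()))
    return sorted(spans)


sentence = "A corrida de 10 km começa às 9h30 com 250 atletas."
for _, _, label, surface in tag_number_expressions(sentence):
    print(label, surface)
```

In this sketch the decimal separator is assumed to be a comma, as is usual in Portuguese. A real system would need many more patterns (spelled-out numbers, dates, addresses) plus a normalization step to produce the standard representation mentioned above.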
The name-based component is built upon stochastic procedures. It was trained on a manually annotated corpus of approximately 208,000 words and evaluated against an unseen portion of approximately 52,000 words, on which it scored 86.53% precision and 84.94% recall.
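For readers unfamiliar with the two scores reported above, the following Python sketch shows one common way to compute precision and recall for entity recognition. It assumes exact-match scoring, where a predicted entity counts as correct only if its boundaries and type both match the gold annotation. That matching criterion, the function name `precision_recall` and the toy data are assumptions for illustration; the scoring scheme actually used for LX-NER is not stated here.

```python
def precision_recall(gold, predicted):
    """Compute (precision, recall) over sets of (start, end, type) entity spans.

    Assumes exact-match scoring: a prediction is correct only if the same
    (start, end, type) triple appears in the gold annotation.
    """
    gold, predicted = set(gold), set(predicted)
    correct = len(gold & predicted)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Toy data, invented for this example.
gold = {(0, 2, "PER"), (5, 6, "LOC"), (8, 10, "ORG")}
predicted = {(0, 2, "PER"), (5, 6, "ORG")}
p, r = precision_recall(gold, predicted)
print(f"precision={p:.2%} recall={r:.2%}")
```

Precision is the share of predicted entities that are correct, while recall is the share of gold entities that were found, so a system can trade one against the other.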