LX-Sentence Splitter

LX-Sentence Splitter is a language processing tool for delimiting sentences in Portuguese. It was developed and is maintained by the NLX-Natural Language and Speech Group at the University of Lisbon, Department of Informatics.

LX-Sentence Splitter marks sentence boundaries with <s>…</s> and paragraph boundaries with <p>…</p>. It also unwraps sentences that are split across several lines. It obtained an F-score of 99.94% when tested on a 12,000-sentence corpus that was accurately hand-tagged for sentence and paragraph boundaries.
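As an illustration of how the marked-up output might be consumed downstream, the following is a minimal Python sketch. It assumes the output is plain text in which each paragraph is wrapped in <p>…</p> and each sentence in <s>…</s>, with no nesting or attributes; the function name parse_marked_text and the sample string are hypothetical and are not part of the tool.

```python
import re

# Assumption: paragraphs are wrapped in <p>...</p> and sentences in <s>...</s>,
# without attributes or nesting. The tool's exact output layout may differ.
P_RE = re.compile(r"<p>(.*?)</p>", re.DOTALL)
S_RE = re.compile(r"<s>(.*?)</s>", re.DOTALL)


def parse_marked_text(marked):
    """Return a list of paragraphs, each a list of sentence strings."""
    paragraphs = []
    for p_body in P_RE.findall(marked):
        # Collapse internal whitespace so each sentence is a single line.
        sentences = [" ".join(s.split()) for s in S_RE.findall(p_body)]
        paragraphs.append(sentences)
    return paragraphs


# Hypothetical sample, written by hand to match the assumed layout.
sample = (
    "<p><s>Olá, mundo.</s> <s>Isto é um teste.</s></p>\n"
    "<p><s>Outro parágrafo.</s></p>"
)

for number, sentences in enumerate(parse_marked_text(sample), start=1):
    print(number, sentences)
```

Regular expressions are enough here only because the assumed markup is flat; if the real output carries attributes or nested tags, a proper markup parser would be a safer choice.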

You may also be interested in our LX-Tokenizer, LX-Tagger, and LX-Suite online services for the tokenization, part-of-speech tagging, and sub-syntactic analysis of Portuguese.
