LX-Chunker
Handle: | https://hdl.handle.net/21.11129/0000-000B-D2F9-F (persistent URL to this page) |
---|---|
URL: | http://lxcenter.di.fc.ul.pt/tools/en/LXChunkerEN.html |
This tool, which was built to handle specific issues arising from the orthographic conventions adopted for Portuguese, marks sentence boundaries with <s>…</s> and paragraph boundaries with <p>…</p>. It also unwraps sentences that are split over different lines.
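To make the output format concrete, here is a minimal Python sketch of the kind of markup described above. It is not LX-Chunker's implementation: the function name `mark_boundaries`, the blank-line paragraph rule, and the punctuation-based sentence rule are all assumptions made for illustration.

```python
import re

# Hypothetical rule (an assumption, not LX-Chunker's logic): a sentence ends at
# '.', '!' or '?' when followed by whitespace and an uppercase letter or quote.
_SENTENCE_END = re.compile(r'(?<=[.!?])\s+(?=[A-ZÀ-Ý"“])')


def mark_boundaries(text: str) -> str:
    """Wrap paragraphs in <p>…</p> and sentences in <s>…</s>."""
    marked = []
    # Assumption: paragraphs are separated by one or more blank lines.
    for paragraph in re.split(r'\n\s*\n', text.strip()):
        # Unwrap: rejoin a paragraph whose sentences were split over several lines.
        unwrapped = ' '.join(
            line.strip() for line in paragraph.splitlines() if line.strip()
        )
        if not unwrapped:
            continue
        sentences = _SENTENCE_END.split(unwrapped)
        marked.append('<p>' + ''.join(f'<s>{s}</s>' for s in sentences) + '</p>')
    return '\n'.join(marked)


sample = "O gato dorme. A casa\né grande.\n\nOutro parágrafo aqui."
print(mark_boundaries(sample))
```

A rule this naive splits wrongly after abbreviations such as "Sr." in "O Sr. Silva chegou.", which is one reason a simple regular expression is not a substitute for a dedicated tool.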
An F-score of 99.94% was obtained when testing on a 12,000-sentence corpus accurately hand-tagged with respect to sentence and paragraph boundaries.
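For readers unfamiliar with the metric, the sketch below shows how a balanced F-score (the harmonic mean of precision and recall) could be computed over boundary positions. The page does not say which F-score variant was used or how boundaries were matched, so the function `f_score`, the representation of boundaries as character offsets, and the sample numbers are all illustrative assumptions.

```python
def f_score(gold: set[int], predicted: set[int]) -> float:
    """Balanced F-score between gold and predicted boundary offsets."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # share of predictions that are correct
    recall = true_positives / len(gold)          # share of gold boundaries that were found
    return 2 * precision * recall / (precision + recall)


# Made-up offsets for illustration only: one of four boundaries is misplaced.
gold_boundaries = {10, 25, 40, 58}
predicted_boundaries = {10, 25, 41, 58}
print(f_score(gold_boundaries, predicted_boundaries))
```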
LX-Chunker was developed and is maintained at the University of Lisbon by NLX, the Natural Language and Speech Group of the Department of Informatics.