LX-DepParser (beta) is a free online service for the syntactic analysis of Portuguese. It automatically parses Portuguese sentences in terms of their grammatical functions.
LX-DepParser is an MSTParser trained on Portuguese data.
The parser was trained on 22,118 sentences (250,056 word tokens) taken from the CINTIL-Treebank. This treebank is being developed and maintained at the University of Lisbon by the NLX-Speech and Natural Language Group of the Department of Informatics. Evaluated through 10-fold cross-validation, LX-DepParser obtains a UAS (unlabeled attachment score) of 94.42 and a LAS (labeled attachment score) of 91.23.
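As an illustration of what the two scores measure, the following minimal Python sketch computes UAS and LAS from gold and predicted analyses. It is not the evaluation code used for LX-DepParser. The function name `attachment_scores`, the token representation (a pair of head index and relation label), and the example labels are assumptions made for this example only, as are reporting the scores as percentages and counting every token, punctuation included.

```python
from typing import List, Tuple

# Hypothetical token representation: (head_index, relation_label).
# A head index of 0 marks the root of the sentence.
Token = Tuple[int, str]


def attachment_scores(
    gold: List[List[Token]], predicted: List[List[Token]]
) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages over all tokens.

    UAS counts tokens whose predicted head matches the gold head.
    LAS counts tokens whose predicted head AND label both match.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must contain the same number of sentences")

    total = 0
    correct_heads = 0
    correct_labeled = 0
    for gold_sent, pred_sent in zip(gold, predicted):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("sentence lengths differ between gold and predicted")
        for (g_head, g_rel), (p_head, p_rel) in zip(gold_sent, pred_sent):
            total += 1
            if g_head == p_head:
                correct_heads += 1
                if g_rel == p_rel:
                    correct_labeled += 1

    if total == 0:
        raise ValueError("no tokens to evaluate")
    return 100.0 * correct_heads / total, 100.0 * correct_labeled / total


# Toy four-token sentence; the labels are illustrative, not the CINTIL tag set.
gold = [[(2, "subj"), (0, "root"), (4, "det"), (2, "obj")]]
pred = [[(2, "subj"), (0, "root"), (2, "det"), (2, "mod")]]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2f}, LAS = {las:.2f}")
```

In the toy sentence, three of the four predicted heads are correct and two of those also carry the correct label, so LAS can never exceed UAS. Under 10-fold cross-validation, such scores would be computed on each held-out fold and then aggregated.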
The analyses produced by LX-DepParser are similar to the dependency representations found in the dependency treebank on which LX-DepParser was trained. This dependency treebank was designed along the principles described in the following handbook:
Branco, António, Sérgio Castro, João Silva and Francisco Costa, 2011, CINTIL DepBank Handbook: Design options for the representation of grammatical dependencies. Department of Informatics, University of Lisbon, Technical Reports series, no. di-fcul-tr-11-03.
LX-DepParser is being developed by Rúben Reis, under the direction of António Branco, at the NLX-Group on Natural Language and Speech.
You can contact us at the following email address: 'nlx' followed by '@' followed by 'di.fc.ul.pt'.
LX-DepParser was partially funded by FCT-Foundation for Science and Technology, under the contract FCT/PTDC/PLP/81157/2006 for the project SemanticShare.