
File processing
Input format: Input files must be plain text (.txt), UTF-8 encoded, and contain Portuguese text. Input files and folders can also be compressed in .zip format.
Privacy: The input file you upload and the respective output files are automatically deleted from our computer once the file has been processed and you have downloaded the result. No copies of your files are retained after your use of this service.
When an input file is large, its processing may take some time. In that case, you can provide an email address to receive the URL from which to download your processed file when it is ready. After being used for this purpose, your email address is deleted from our computer.
Instructions to use this web service
The web service for this application is available at https://portulanclarin.net/workbench/lx-depparser/api/.
Below is an example of how to use this web service with Python 3.
This example uses the requests package. To install it, run `pip3 install requests` on the command line.
To use this web service, you need an access key, which you can obtain by clicking the button below. A key is valid for 31 days. It allows you to submit a total of 500 million characters, in requests of no more than 2000 characters each. It allows up to 100,000 requests, at a rate of no more than 200 requests per hour.
For other usage regimes, you should contact the helpdesk.
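The rate limit of 200 requests per hour works out to one request every 18 seconds (3600 / 200). The following is a minimal sketch of a client-side pacing helper built on that arithmetic. The `Throttle` class and its parameters are hypothetical names introduced here for illustration; they are not part of the web service.

```python
import time

MAX_REQUESTS_PER_HOUR = 200  # rate limit stated for an access key
MIN_INTERVAL_SECONDS = 3600 / MAX_REQUESTS_PER_HOUR  # 18.0 seconds


class Throttle:
    """Hypothetical helper: keeps consecutive calls at least `interval` seconds apart."""

    def __init__(self, interval=MIN_INTERVAL_SECONDS,
                 clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock    # injectable, so the helper can be tested without waiting
        self._sleep = sleep
        self._last = None      # time of the previous call, if any

    def wait(self):
        """Block until at least `interval` seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

One way to use it is to create a single `Throttle()` and call its `wait()` method immediately before each request sent to the service.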
The input data and the respective output will be automatically deleted from our computer after being processed. No copies will be retained after your use of this service.
```python
import json
import requests  # to install this library, enter in your command line:
                 #   pip3 install requests

# This is a simple example to illustrate how you can use the LX-DepParser web service

# Requires: key is a string with your access key
# Requires: text is a string, UTF-8, with a maximum 2000 characters, Portuguese text, with
#           the input to be processed
# Requires: tagset is a string, indicating the tagset to be used in the output, which can
#           be either 'CINTIL' or 'UD' (universal dependencies)
# Requires: format is a string, indicating the output format, which can be either
#           'CONLL' or 'JSON'

# Ensures: output according to specification in https://portulanclarin.net/workbench/lx-depparser/
# Ensures: dict with number of requests and characters input so far with the access key, and
#          its date of expiry

key = 'access_key_goes_here'  # before you run this example, replace access_key_goes_here by
                              # your access key

# this string can be replaced by your input
text = '''A Maria tem razão. Mesmo assim, ensaia algumas aproximações.
A emissão será cotada na Bolsa de Valores do Luxemburgo.'''

tagset = 'CINTIL'  # set this to 'UD' to get universal dependencies
format = 'CONLL'   # other possible value is 'JSON'

# To read input text from a file, uncomment this block
#inputFile = open("myInputFileName", "r", encoding="utf-8")  # replace myInputFileName by
                                                             # the name of your file
#text = inputFile.read()
#inputFile.close()

# Processing:

url = "https://portulanclarin.net/workbench/lx-depparser/api/"
request_data = {
    'method': 'parse',
    'jsonrpc': '2.0',
    'id': 0,
    'params': {
        'text': text,
        'tagset': tagset,
        'format': format,
        'key': key,
    },
}
request = requests.post(url, json=request_data)
response_data = request.json()
if "error" in response_data:
    print("Error:", response_data["error"])
else:
    print("Result:")
    print(response_data["result"])

# To write output in a file, uncomment this block
#outputFile = open("myOutputFileName", "w", encoding="utf-8")  # replace myOutputFileName by
                                                               # the name of your file
#output = response_data["result"]
#outputFile.write(output)
#outputFile.close()

# Getting access key status:

request_data = {
    'method': 'key_status',
    'jsonrpc': '2.0',
    'id': 0,
    'params': {
        'key': key,
    },
}
request = requests.post(url, json=request_data)
response_data = request.json()
if "error" in response_data:
    print("Error:", response_data["error"])
else:
    print("Key status:")
    print(json.dumps(response_data["result"], indent=4))
```
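Since each request may carry no more than 2000 characters, longer texts have to be split before submission. The following is a minimal sketch of one way to do that, assuming that breaking at sentence-final punctuation is acceptable for your input. The function `split_into_chunks` is a hypothetical helper, not part of the web service, and its simple punctuation rule will mis-split abbreviations such as "Sr. Silva".

```python
import re

MAX_CHARS = 2000  # per-request character limit stated for an access key


def split_into_chunks(text, max_chars=MAX_CHARS):
    """Split text into pieces of at most max_chars, preferring sentence ends."""
    # Break after '.', '!' or '?' when followed by whitespace; punctuation is kept.
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    chunks, current = [], ''
    for sentence in sentences:
        # A single sentence longer than the limit is cut at fixed positions.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ''
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f'{current} {sentence}' if current else sentence
        if len(candidate) <= max_chars:
            current = candidate      # sentence still fits in the current chunk
        else:
            chunks.append(current)   # current chunk is full; start a new one
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as the `text` parameter of a separate `parse` request, within the request rate allowed for the key.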
LX-DepParser's documentation
LX-DepParser
LX-DepParser is a free online service for the syntactic analysis of Portuguese. It allows the automatic parsing of sentences in Portuguese in terms of the grammatical functions of their words.
This service was developed and is maintained at the University of Lisbon by the NLX-Speech and Natural Language Group, Department of Informatics.
Parser
LX-DepParser is an instance of MSTParser trained on Portuguese data.
For the training of the parser, 22,118 sentences were used (comprising 250,056 word tokens). The sentences were taken from the CINTIL-DependencyBank. This treebank is being developed and maintained at the University of Lisbon by the NLX-Speech and Natural Language Group of the Department of Informatics. In terms of evaluation, LX-DepParser's UAS (unlabeled attachment score) is 94.42 and its LAS (labeled attachment score) is 91.23. Scores were obtained through 10-fold cross-validation.
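To make the two scores concrete: UAS is the percentage of tokens that receive the correct head, and LAS is the percentage that receive both the correct head and the correct relation label. The sketch below computes both for a single sentence. The function name, the data layout, and the example analysis are assumptions made for illustration; this is not the evaluation script used for LX-DepParser, and it counts every token, punctuation included.

```python
def attachment_scores(gold, predicted):
    """Return (UAS, LAS) as percentages.

    gold and predicted are aligned lists of (head_index, label) pairs,
    one pair per token, with head index 0 standing for the root.
    """
    if len(gold) != len(predicted):
        raise ValueError('gold and predicted must have the same length')
    if not gold:
        raise ValueError('at least one token is required')
    total = len(gold)
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, predicted))  # head only
    full_hits = sum(g == p for g, p in zip(gold, predicted))        # head and label
    return 100.0 * head_hits / total, 100.0 * full_hits / total


# Hypothetical analysis of "A Maria tem razão ." (tokens numbered from 1).
gold = [(2, 'SP'), (3, 'SJ'), (0, 'ROOT'), (3, 'DO'), (3, 'PUNCT')]
# Hypothetical prediction: one wrong label (OBL for DO) and one wrong head.
predicted = [(2, 'SP'), (3, 'SJ'), (0, 'ROOT'), (3, 'OBL'), (2, 'PUNCT')]

uas, las = attachment_scores(gold, predicted)
print(f'UAS = {uas:.2f}, LAS = {las:.2f}')  # prints "UAS = 80.00, LAS = 60.00"
```

In this toy case four of five heads are correct (UAS 80) and three of five tokens have both head and label correct (LAS 60), which shows why LAS can never exceed UAS.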
Consequently, the parser output complies with the design options adopted for the construction of the CINTIL-DependencyBank (see "Annotation guidelines" below). The parser output can also be obtained in the format of Google's so-called Universal Dependencies. This format is produced by converting the original CINTIL output with a set of regular expression rules over dependency trees, and the conversion may introduce some residual distortions.
Tagset
Grammatical function tagset
Tag | Category |
---|---|
C | Complement |
CARD | Cardinal in multi-word cardinals |
COORD | Coordination |
CONJ | Conjunction |
DEP | Dependency |
DO | Direct Object |
IO | Indirect Object |
M | Modifier |
N | Name in multi-word proper names |
OBL | Oblique Complement |
PRD | Predicate |
PUNCT | Punctuation |
ROOT | Sentence root |
SJ | Subject |
SJac | Subject of an anticausative |
SJcp | Subject of complex predicate |
SP | Specifier |
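When post-processing parser output, it can help to expand these tags into their category names. The following is a small sketch that copies the table above into a dictionary. The function `describe_relations` and the example word/tag pairs are hypothetical and only illustrate the lookup; they are not actual parser output.

```python
# Grammatical function tags, copied from the table above.
FUNCTION_TAGS = {
    'C': 'Complement',
    'CARD': 'Cardinal in multi-word cardinals',
    'COORD': 'Coordination',
    'CONJ': 'Conjunction',
    'DEP': 'Dependency',
    'DO': 'Direct Object',
    'IO': 'Indirect Object',
    'M': 'Modifier',
    'N': 'Name in multi-word proper names',
    'OBL': 'Oblique Complement',
    'PRD': 'Predicate',
    'PUNCT': 'Punctuation',
    'ROOT': 'Sentence root',
    'SJ': 'Subject',
    'SJac': 'Subject of an anticausative',
    'SJcp': 'Subject of complex predicate',
    'SP': 'Specifier',
}


def describe_relations(pairs):
    """Map (word, tag) pairs to (word, category); unknown tags are kept as-is."""
    return [(word, FUNCTION_TAGS.get(tag, tag)) for word, tag in pairs]


# Hypothetical labels for "A Maria tem razão", for illustration only.
example = [('A', 'SP'), ('Maria', 'SJ'), ('tem', 'ROOT'), ('razão', 'DO')]
for word, category in describe_relations(example):
    print(f'{word}\t{category}')
```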
Part-of-speech tags (high granularity)
Tag | Category |
---|---|
A | Adjective |
AP | Adjective Phrase |
ADV | Adverb |
ADVP | Adverb Phrase |
C | Complementizer |
CP | Complementizer Phrase |
CARD | Cardinal |
CONJ | Conjunction |
CONJP | Conjunction Phrase |
D | Determiner |
DEM | Demonstrative |
N | Noun |
NP | Noun Phrase |
P | Preposition |
PP | Preposition Phrase |
POSS | Possessive |
QNT | Predeterminer |
S | Sentence |
V | Verb |
VP | Verb Phrase |
Inflection tags
Tag | Description |
---|---|
Tags for nominal categories | |
m | Masculine |
f | Feminine |
g | Indeterminate Gender |
s | Singular |
p | Plural |
n | Indeterminate Number |
dim | Diminutive |
sup | Superlative |
comp | Comparative |
Tags for verbs | |
1 | First Person |
2 | Second Person |
3 | Third Person |
pi | Presente do Indicativo |
ppi | Pretérito Perfeito do Indicativo |
ii | Pretérito Imperfeito do Indicativo |
mpi | Pretérito Mais que Perfeito do Indicativo |
fi | Futuro do Indicativo |
c | Condicional |
pc | Presente do Conjuntivo |
ic | Pretérito Imperfeito do Conjuntivo |
fc | Futuro do Conjuntivo |
imp | Imperativo |
Tags for infinitive verbs | |
ifl | Inflected |
nifl | Not Inflected |
Annotation guidelines
The analyses produced by LX-DepParser are similar to the dependency representations found in the CINTIL-DependencyBank on which LX-DepParser was trained. This dependency treebank was designed along the principles described in the following handbook:
- Branco, António, Sérgio Castro, João Silva and Francisco Costa, 2011. CINTIL DepBank Handbook: Design options for the representation of grammatical dependencies. Department of Informatics, University of Lisbon, Technical Reports series, no. di-fcul-tr-11-03.
Authorship
LX-DepParser was developed by Rúben Reis, under the direction of António Branco at the NLX-Group on Natural Language and Speech.
Publications
Irrespective of the most recent version of this tool you may use, when mentioning it, please cite this reference:
- Branco, António, Sérgio Castro, João Silva and Francisco Costa, 2011. CINTIL DepBank Handbook: Design options for the representation of grammatical dependencies. Department of Informatics, University of Lisbon, Technical Reports series, no. di-fcul-tr-11-03.
Contact us
You can contact us at the following email address: 'nlx' followed by '@' followed by 'di.fc.ul.pt'.
Acknowledgments
LX-DepParser was partially funded by FCT-Foundation for Science and Technology, under the contract FCT/PTDC/PLP/81157/2006 for the project SemanticShare.
License
No fee, attribution, all rights reserved, no redistribution, non commercial, no warranty, no liability, no endorsement, temporary, non exclusive, share alike.
The complete text of this license is here.