CRPC Discourse Bank v1.0

CRPC-DB 1.0

The CRPC Discourse Bank is labeled for discourse relations (also referred to as rhetorical relations or coherence relations), such as cause and condition, that hold between two spans of text and contribute to the overall cohesion and coherence of the text. The scheme follows the principles of the PDTB annotation proposal and includes the updates of PDTB 3.0. The annotation is applied to the PAROLE corpus, a written subset of the Reference Corpus of Contemporary Portuguese (CRPC) available in the ELRA catalogue, composed of newspaper, fiction and didactic/scientific texts. The discourse bank contains 65 texts, with a total of 85,510 tokens and 14,436 discourse relations. The annotation of the CRPC-DB applies at the intra- and inter-sentential levels and uses the relation types of PDTB 3.0: Explicit, Implicit, Alternative Lexicalization (AltLex), Alternative LexicalizationC (AltLexC), Entity Relation and No Relation.
Please cite this publication when using the resource:
Mendes, Amália & Pierre Lejeune (2022). CRPC-DB – A Discourse Bank for Portuguese. In Computational Processing of the Portuguese Language PROPOR 2022, Lecture Notes in Computer Science, vol. 13208 (pp. 79-89). Berlin, Heidelberg: Springer.
