FLY corpus - syntax

FLY Corpus is a corpus of 2000 informal letters written in Portuguese between 1900 and 1974, in the context of war, migration, imprisonment and exile.

Each letter is stored in an XML file with two main parts: (a) the header, which contains metadata about the document (the transcribers, extra-linguistic context, etc.), and (b) the semi-palaeographic edition of the letter, split into opening elements, body and closing elements.
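
The two-part layout described above could be read along these lines. This is only a minimal sketch: the element names (`letter`, `header`, `edition`, `opening`, `body`, `closing`, `transcriber`, `context`) and the sample content are hypothetical placeholders chosen for illustration, not the actual tag names or data used in the corpus files, so check a real file and adjust them before use.

```python
import xml.etree.ElementTree as ET

# Invented sample document. All element names and text are hypothetical
# placeholders; the real FLY files may use different tags.
SAMPLE = """<letter>
  <header>
    <transcriber>placeholder name</transcriber>
    <context>migration</context>
  </header>
  <edition>
    <opening>Meu caro amigo,</opening>
    <body>Cheguei bem a Lisboa.</body>
    <closing>Saudades.</closing>
  </edition>
</letter>"""


def split_letter(xml_text, header_tag="header", edition_tag="edition",
                 part_tags=("opening", "body", "closing")):
    """Return (metadata, parts) for one letter.

    metadata maps each child tag of the header to its text;
    parts maps each requested edition part to its text.
    The tag names are parameters because they are assumptions.
    """
    root = ET.fromstring(xml_text)

    # Header: collect every direct child as a metadata field.
    header = root.find(header_tag)
    metadata = {}
    if header is not None:
        for child in header:
            metadata[child.tag] = (child.text or "").strip()

    # Edition: pull out opening, body and closing if they are present.
    edition = root.find(edition_tag)
    parts = {}
    for tag in part_tags:
        text = edition.findtext(tag) if edition is not None else None
        parts[tag] = (text or "").strip()

    return metadata, parts


metadata, parts = split_letter(SAMPLE)
```

For a file on disk, `ET.parse(path).getroot()` can replace `ET.fromstring(xml_text)`. If the real files declare an XML namespace, the tag names passed to `find` and `findtext` need to include it.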

This release of FLY Corpus also includes a version of the corpus where the body of the letter has been automatically annotated by a constituency parser.


