CINTIL-Corpus Internacional do Português

CINTIL Corpus

CINTIL-Corpus Internacional do Português is a linguistically interpreted corpus of Portuguese. It currently comprises 1 million annotated tokens, verified by human expert annotators. The annotation covers part of speech, lemma and inflection for open-class words, multi-word expressions belonging to the class of adverbs and to the closed POS classes, and multi-word proper names (for named entity recognition). The corpus was developed at the University of Lisbon by the NLX group at the Faculty of Sciences and by the Grammar and Resources (formerly ANAGRAMA) group at the Centro de Linguística da Universidade de Lisboa.








  • LX-Tagger







