The CINTIL-NamedEntities corpus, built on the CINTIL International Corpus of Portuguese (Barreto et al., 2006), comprises 30,493 sentences of written Portuguese in which named entities have been manually disambiguated and annotated with links to the corresponding pages in the Portuguese DBpedia (Lehmann et al., 2012). The corpus contains 684,467 tokens, in which 26,371 named entities were recognized; 16,120 of these have been annotated with links to the appropriate entries in DBpedia.

The development of the CINTIL-NamedEntities corpus has been funded by the EU project QTLeap (EC/FP7/610516) and the Portuguese project DP4LT (PTDC/EEI-SII/1940/2012).

