Search and Browse – PORTULAN CLARIN

Fundamental Portuguese

This resource includes a spoken Portuguese corpus, with aligned sound and orthographic transcription, collected from sociolinguistically diverse speakers. It consists of recordings of informal conversations.

Resource Type:	Corpus
Media Types:	Text, Audio
Language:	Portuguese

EUROPARL Corpus Parallel Corpora: Portuguese-English

The EUROPARL Corpus (Portuguese-English subpart of the parallel corpora), available at http://www.statmt.org/europarl/, was extracted from the proceedings of the European Parliament (Koehn, 2005). It contains transcriptions of sessions from 1996 to 2011, totaling approximately 58...

Resource Type:	Corpus
Media Type:	Text
Languages:	English, Portuguese

C-ORAL-ROM_EXM

This resource includes a spoken corpus of approximately 300,000 words, covering both formal (152,755 words) and informal (165,838 words) speech, with aligned sound and orthographic transcription and POS-tag information.

Resource Type:	Corpus
Media Types:	Text, Audio
Language:	Portuguese

CINTIL-Corpus Internacional do Português

CINTIL-Corpus Internacional do Português is a linguistically interpreted corpus of Portuguese. It currently comprises 1 million annotated tokens, verified by human expert annotators. The annotation covers part of speech, open-class lemma and inflection, multi-word expres...

Resource Type:	Corpus
Media Type:	Text
Language:	Portuguese
