Corpus de Produções Escritas de Aprendentes de PL2 (PEAPL2)

A Portuguese as a non-native language learners' corpus of written texts with three independent subcorpora: - Portuguese as a Foreign Language: Subcorpus Português Língua Estrangeira (PEAPL2_PLE) http://teitok2.iltec.pt/peapl2-ple/index.php?action=home - East Timorese Portuguese: Subcorpus T...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
FLY corpus - morpho

FLY Corpus is a corpus composed by 2000 informal letters written in Portuguese, in the years spanning from 1900 to 1974, in the context of war, migration, imprisonment and exile. Each letter is in an XML file with two main parts: (a) the header, which contains metadata about the document (the ...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
Perfil Sociolinguístico da Fala Bracarense

Perfil Sociolinguístico da Fala Bracarense is a Portuguese speech corpus with 90 hours of recorded spontaneous speech, aligned with its transcription in EXMARaLDA format. The corpus is composed by 1h interviews with speakers of the same area (around Braga, Portugal), stratified according to sex,...

Resource Type:Corpus
Media Types:Text
Audio
Language:Portuguese
Arquivo Dialetal CLUP - POS

Arquivo Dialetal CLUP - POS is a speech corpus with approximately 40 000 tokens (Utterances; spontaneous speech, mainly from Northern Portugal). Orthographic transcription, POS.

Resource Type:Corpus
Media Type:Audio
Language:Portuguese
BDCamões DependencyBank (Part I)

BDCamões Corpus - Collection of Portuguese Literary Documents from the Digital Library of Camões I.P., is a collection of literary documents written in Portuguese, in plain text .txt format, with close to 4 million words from over 200 complete documents from 83 authors in 14 genres, covering a ti...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
PAROLE Portuguese Lexicon

The resource is constituted by 20 thousand entries morpho-syntactically and syntactically encoded, accordingly to the parole common encoding standards.

Resource Type:Lexical / Conceptual
Media Type:Text
Language:Portuguese
Fundamental Portuguese

This resource includes a spoken Portuguese corpus - with aligned sound and orthographic transcription -, collected among sociolinguistically diverse speakers. It consists of recordings from informal conversations.

Resource Type:Corpus
Media Types:Text
Audio
Language:Portuguese
Grafone-Tool

Grafone-Tool is a tool for conversion from grapheme to phoneme for European Portuguese. The converter works with the Portuguese spelling, both prior to and after the Orthographic Agreement of 1990.

Resource Type:Tool / Service
Language:Portuguese
CINTIL-TreeBank

The CINTIL-TreeBank (Branco et al., 2011) is a corpus of syntactic constituency trees of Portuguese texts composed of 10,039 sentences and 110,166 tokens taken from different sources and domains: news (8,861 sentences; 101,430 tokens), novels (399 sentences; 3,082 tokens). In addition, there are ...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
CINTIL-DependencyBank

The CINTIL-DependencyBank (Branco et al., 2011a) is a corpus of grammatical dependencies of Portuguese texts composed of 10,039 sentences and 110,166 tokens taken from different sources and domains: news (8,861 sentences; 101,430 tokens), novels (399 sentences; 3,082 tokens) (see 3.2.). In additi...

Resource Type:Corpus
Media Type:Text
Language:Portuguese

Order by:

Filter by:

Text (446)
Audio (18)
Image (1)