BDCamões DependencyBank (Part II)

The BDCamões Corpus (Collection of Portuguese Literary Documents from the Digital Library of Camões I.P.) is a collection of literary documents written in Portuguese, in plain-text (.txt) format, with close to 4 million words from over 200 complete documents by 83 authors in 14 genres, covering a ti...

Resource Type: Corpus
Media Type: Text
Language: Portuguese
BDCamões DependencyBank (Part I)

The BDCamões Corpus (Collection of Portuguese Literary Documents from the Digital Library of Camões I.P.) is a collection of literary documents written in Portuguese, in plain-text (.txt) format, with close to 4 million words from over 200 complete documents by 83 authors in 14 genres, covering a ti...

Resource Type: Corpus
Media Type: Text
Language: Portuguese
BDCamões Corpus - Collection of Portuguese Literary Documents from the Digital Library of Camões I.P. (Part II)

The BDCamões Corpus (Collection of Portuguese Literary Documents from the Digital Library of Camões I.P.) is a collection of literary documents written in Portuguese, in plain-text (.txt) format, with close to 4 million words from over 200 complete documents by 83 authors in 14 genres, covering a ti...

Resource Type: Corpus
Media Type: Text
Language: Portuguese
Biographies of Portuguese People

This is a set of 11,361 biographies of Portuguese people. The data were compiled by collecting biographies from Wikipedia and converting them. Several filters were applied to remove entries that were mostly empty or contained non-applicable content. Format: JSON (conversion from HTML) ...

Resource Type: Corpus
Media Type: Text
Language: Portuguese
PS corpus (Post-Scriptum)-PT

PS Corpus (Post-Scriptum)-PT is a corpus of 2,215 informal letters written in Portuguese during the Modern Age (from the 16th century to the beginning of the 19th century). Each letter is available as a semi-palaeographic transcription, a modernized transcription, and with part-of-speec...

Resource Type: Corpus
Media Type: Text
Language: Portuguese
CRPC-Quotations

Database of 2,253 citations extracted from the Corpus de Referência do Português Contemporâneo - CRPC (Reference Corpus of Contemporary Portuguese) and manually revised. Format: tab-separated file. Fields: context number, source file id, citation.

Resource Type: Corpus
Media Type: Text
Language: Portuguese
VIDiom-PT

VIDiom-PT is a European Portuguese corpus annotated for verbal idioms, designed to support NLP applications in idiom processing. The resulting corpus comprises 5,178 annotated instances covering 747 distinct verbal idioms. The annotation process was validated through an inter-annotator agreement ...

Resource Type: Corpus
Media Type: Text
Language: Portuguese
Albertina PT-BR No-brWaC

Albertina PT-* is a foundation large language model for the Portuguese language. It is an encoder of the BERT family, based on the Transformer neural architecture and developed over the DeBERTa model, with highly competitive performance for this language. It has different versions that were...

Resource Type: Language Description
Media Type: Text
Language: Portuguese
Gervásio PT-BR base

Gervásio PT-* is a foundation large language model for the Portuguese language. It is a decoder of the GPT family, based on the Transformer neural architecture and developed over the Pythia model, with competitive performance for this language. It has different versions that were trained for ...

Resource Type: Language Description
Media Type: Text
Language: Portuguese
Albertina PT-BR base

Albertina PT-BR base is a foundation large language model for American Portuguese from Brazil. It is an encoder of the BERT family, based on the Transformer neural architecture and developed over the DeBERTa model, with highly competitive performance for this language. It is distributed free of...

Resource Type: Language Description
Media Type: Text
Language: Portuguese
