Parallel corpus (Greek - English) in the law domain (Processed) (Part1)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Parallel (el-en) corpus of 1979 translation units in the...

Resource Type:Corpus
Media Type:Text
Languages:English; Greek, Modern (1453-)
CorPop: a corpus of popular Brazilian Portuguese

This research proposes a corpus of popular Brazilian Portuguese, called CorPop, with texts selected based on the average level of literacy of the country's readers. CorPop’s theoretical and methodological bases are interdisciplinary and fall within the scope of Language Studies and related discip...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
Laws of Malta (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Compilation of bilingual Maltese legislation (Maltese-En...

Resource Type:Corpus
Media Type:Text
Languages:English; Maltese
General Romanian-English bilingual corpus (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Romanian – English corpus built from a Wikipedia dump.

Resource Type:Corpus
Media Type:Text
Languages:English; Romanian
BDCamões Corpus - Collection of Portuguese Literary Documents from the Digital Library of Camões I.P. (Part I)

BDCamões Corpus is a collection of literary documents written in Portuguese, in plain text .txt format, with close to 4 million words from over 200 complete documents from 83 authors in 14 genres, covering a time span from the 15th to the 21st century, and adhering to different orthographic conve...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
C-ORAL-ROM_EXM

This resource includes a spoken corpus of approximately 300,000 words, covering both formal (152,755 words) and informal (165,838 words) speech, with aligned sound and orthographic transcription and POS-tag information.

Resource Type:Corpus
Media Types:Text; Audio
Language:Portuguese
BDCamões DependencyBank (Part II)

BDCamões Corpus - Collection of Portuguese Literary Documents from the Digital Library of Camões I.P. is a collection of literary documents written in Portuguese, in plain text .txt format, with close to 4 million words from over 200 complete documents from 83 authors in 14 genres, covering a ti...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
BDCamões DependencyBank (Part I)

BDCamões Corpus - Collection of Portuguese Literary Documents from the Digital Library of Camões I.P. is a collection of literary documents written in Portuguese, in plain text .txt format, with close to 4 million words from over 200 complete documents from 83 authors in 14 genres, covering a ti...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
BDCamões Corpus - Collection of Portuguese Literary Documents from the Digital Library of Camões I.P. (Part II)

BDCamões Corpus - Collection of Portuguese Literary Documents from the Digital Library of Camões I.P. is a collection of literary documents written in Portuguese, in plain text .txt format, with close to 4 million words from over 200 complete documents from 83 authors in 14 genres, covering a ti...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
A Tweet Dataset Annotated in Four Emotion Dimensions

A corpus of 2,019 tweets annotated along each of four emotion dimensions: Valence, Dominance, Arousal and Surprise. Two annotation schemes are used: a 5-point ordinal scale (using SAM manikins for Valence, Arousal and Dominance) and pair-wise comparisons with an "about the same" option (here 2,01...

Resource Type:Corpus
Media Type:Text
Language:English
