The Coimisineir Teanga Bilingual Corpus of Reference Documents (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. General Reference content from the Language Commissioner...

Resource Type: Corpus
Media Type: Text
Languages: English, Irish

Monolingual documents from the Government of Lithuania (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Monolingual documents received from the Government of th...

Resource Type: Corpus
Media Type: Text
Language: Lithuanian

The Gaois bilingual corpus of English-Irish legislation (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Bilingual corpus of English-Irish legislation provided b...

Resource Type: Corpus
Media Type: Text
Languages: English, Irish

Slovenian-English corpus with statistical reports from the Statistical Office of the Republic of Slovenia website (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Slovenian-English corpus with statistical reports from t...

Resource Type: Corpus
Media Type: Text
Languages: English, Slovenian

Secretariat-General parallel corpus SL-EN and EN-SL (part 1) (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. English-Slovenian parallel corpus in TMX format from the...

Resource Type: Corpus
Media Type: Text
Languages: English, Slovenian

Secretariat-General parallel corpus SL-EN and EN-SL (part 2) (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. English-Slovenian parallel corpus in TMX format from the...

Resource Type: Corpus
Media Type: Text
Languages: English, Slovenian

An Arabic Twitter Corpus for Subjectivity and Sentiment Analysis

An Arabic Twitter dataset of 7,503 tweets. The released data contains manual sentiment analysis annotations as well as automatically extracted features, saved in comma-separated values (CSV) and Attribute-Relation File Format (ARFF) files. Due to Twitter privacy restrictions we replaced the orig...

Resource Type: Corpus
Media Type: Text
Language: Arabic

Corpus of Semantic Graphs with associated English strings

Automatically generated corpus of 98,818 graph/string pairs.

Resource Type: Corpus
Media Type: Text
Language: American English

Burst-Annotated Co-Occurrence Network for the Arab Spring Domain

A burst-annotated co-occurrence network on the Arab Spring, built on top of New York Times article snapshots from 2010-2013.

Resource Type: Corpus
Media Type: Text
Language: American English

Manually annotated corpora for teaching and learning purposes of Brazilian Portuguese, Dutch, Estonian, and Slovene

These are manually annotated corpora for teaching and learning purposes of Brazilian Portuguese, Dutch, Estonian, and Slovene, as a contribution to the Manually Annotated Corpora Family available in CLARIN. Sentences are annotated with “problematic” or “non-problematic” labels, from the point of ...

Resource Type: Corpus
Media Type: Text
Languages: Brazilian Portuguese, Dutch, Estonian, Slovene
