Romanian-English corpus with studies, reports and statistical data in the field of culture from the National Institute for Cultural Research and Training website (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Romanian-English corpus with studies, reports and statis...

Resource Type: Corpus
Media Type: Text
Languages: Moldavian; Moldovan
Romanian

Secretariat-General parallel corpus SL-EN and EN-SL (part 2) (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. English-Slovenian parallel corpus in TMX format from the...

Resource Type: Corpus
Media Type: Text
Languages: English
Slovenian

Secretariat-General parallel corpus SL-EN and EN-SL (part 1) (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. English-Slovenian parallel corpus in TMX format from the...

Resource Type: Corpus
Media Type: Text
Languages: English
Slovenian

The Coimisineir Teanga Bilingual Corpus of Reference Documents (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. General Reference content from the Language Commissioner...

Resource Type: Corpus
Media Type: Text
Languages: English
Irish

Monolingual documents from the Government of Lithuania (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Monolingual documents received from the Government of th...

Resource Type: Corpus
Media Type: Text
Language: Lithuanian

English-Norwegian parallel corpus from Forbruker Europa, 2017 release (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Forbruker Europa is the Norwegian office of the European...

Resource Type: Corpus
Media Type: Text
Languages: Bokmål, Norwegian; Norwegian Bokmål
English

An Arabic Twitter Corpus for Subjectivity and Sentiment Analysis

An Arabic Twitter dataset of 7,503 tweets. The released data contains manual sentiment analysis annotations as well as automatically extracted features, saved in Comma-Separated Values (CSV) and Attribute-Relation File Format (ARFF) files. Due to Twitter privacy restrictions we replaced the orig...

Resource Type: Corpus
Media Type: Text
Language: Arabic

Corpus of Semantic Graphs with associated English strings

Automatically generated corpus of 98,818 graph/string pairs.

Resource Type: Corpus
Media Type: Text
Language: American English

Burst-Annotated Co-Occurrence Network for the Arab Spring Domain

A burst-annotated co-occurrence network on the Arab Spring topic, built on top of New York Times article snapshots from the years 2010-2013.

Resource Type: Corpus
Media Type: Text
Language: American English

Perfil Sociolinguístico da Fala Bracarense - POS

Perfil Sociolinguístico da Fala Bracarense - POS is a manually verified part-of-speech annotation of the EXMARaLDA transcriptions in "Perfil Sociolinguístico da Fala Bracarense", a Portuguese speech corpus with 90 hours of recorded spontaneous speech, aligned with its transcription in EXMARaLDA f...

Resource Type: Corpus
Media Type: Text
Language: Portuguese
