Romanian - English news corpus (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Bilingual Romanian – English news corpus built from Sout...

Resource Type: Corpus
Media Type: Text
Languages: English, Romanian

Blacklist Classifier

A language identifier for closely related languages.

Resource Type: Tool / Service
Languages: Bosnian, Croatian, Czech, Portuguese, Serbian, Slovak

Central Statistical Office Dataset (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Two Polish-English publications of the Polish Central St...

Resource Type: Corpus
Media Type: Text
Languages: English, Polish

ExtraGLUE

ExtraGLUE is a Portuguese dataset obtained by the automatic translation of some of the tasks in the GLUE and SuperGLUE benchmarks. Two variants of Portuguese are considered, namely European Portuguese and American Portuguese. The 14 tasks in extraGLUE cover different aspects of language unders...

Resource Type: Corpus
Media Type: Text
Language: Portuguese

Polish Food 4 & Food Policy Dataset (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. A collection of Polish-English translations of the Polis...

Resource Type: Corpus
Media Type: Text
Languages: English, Polish

Maltese-English website parallel corpus (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. This is a parallel corpus of bilingual texts crawled fro...

Resource Type: Corpus
Media Type: Text
Languages: English, Maltese

Compendium The Social Insurance Institution (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. A compendium on the Polish Social Insurance Institution (...

Resource Type: Corpus
Media Type: Text
Languages: English, Polish

Malta Government Gazette (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Bilingual gazette (English-Maltese) of the government of...

Resource Type: Corpus
Media Type: Text
Languages: English, Maltese

Summ-it

The corpus was developed as a linguistic resource for research on automatic summarization and its relation to other issues in the study of discourse processing. Summ-it consists of fifty texts from the science domain, extracted from the Science section of the Brazilian daily newspaper Folha de Sã...

Resource Type: Corpus
Media Type: Text
Language: Portuguese

Perfil Sociolinguístico da Fala Bracarense - POS

Perfil Sociolinguístico da Fala Bracarense - POS is a manually verified part-of-speech annotation of the EXMARaLDA transcriptions in "Perfil Sociolinguístico da Fala Bracarense", a Portuguese speech corpus with 90 hours of recorded spontaneous speech, aligned with its transcription in EXMARaLDA f...

Resource Type: Corpus
Media Type: Text
Language: Portuguese
