A Repository of State of the Art and Competitive Baseline Summaries for DUC 2004

In the period since 2004, many novel sophisticated approaches for generic multi-document summarization have been developed. Intuitive simple approaches have also been shown to perform unexpectedly well for the task. Yet it is practically impossible to compare the existing approaches directly, bec...

Resource Type:Corpus
Media Type:Text
Language:English
Perfil Sociolinguístico da Fala Bracarense - POS

Perfil Sociolinguístico da Fala Bracarense - POS is a manually verified part-of-speech annotation of the EXMARaLDA transcriptions in "Perfil Sociolinguístico da Fala Bracarense", a Portuguese speech corpus with 90 hours of recorded spontaneous speech, aligned with its transcription in EXMARaLDA f...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
English-Estonian corpus from Finnish Information Bank (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. http://www.infopankki.fi - Finland in your language - In...

Resource Type:Corpus
Media Type:Text
Languages:English; Estonian
English-Finnish corpus from Finnish Information Bank (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. http://www.infopankki.fi - Finland in your language - In...

Resource Type:Corpus
Media Type:Text
Languages:English; Finnish
Bilingual hr-en parallel corpus from the Journal of the Croatian Association of Civil Engineers website (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Contents of http://casopis-gradjevinar.hr were crawled, ...

Resource Type:Corpus
Media Type:Text
Languages:Croatian; English
Greek anti-corruption legislation and National Anti-Corruption Plan (greek-english) (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Greek laws, ratification of International Conventions ag...

Resource Type:Corpus
Media Type:Text
Languages:English; Greek, Modern (1453-)
Convention on the transfer of sentenced persons (English - Greek) (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Convention, additional protocol on the convention, recom...

Resource Type:Corpus
Media Type:Text
Languages:English; Greek, Modern (1453-)
Central Statistical Office Dataset (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Two Polish-English publications of the Polish Central St...

Resource Type:Corpus
Media Type:Text
Languages:English; Polish
Maltese-English website parallel corpus (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. This is a parallel corpus of bilingual texts crawled fro...

Resource Type:Corpus
Media Type:Text
Languages:English; Maltese
Portuguese-English bilingual corpus from Legislation concerning the Portuguese Parliament (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Legislation concerning Portuguese Parliament; three bili...

Resource Type:Corpus
Media Type:Text
Languages:English; Portuguese
