Search and Browse – PORTULAN CLARIN

Bilingual collection of reports of the Greek Public Power Corporation (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. A bilingual collection of translation units extracted fr...

Resource Type:	Corpus
Media Type:	Text
Languages:	English
	Greek, Modern (1453-)

English-Estonian Parallel corpus compiled from translated annual reports from Estonian Academy of Sciences

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. English-Estonian translated annual reports as source dat...

Resource Type:	Corpus
Media Type:	Text
Languages:	English
	Estonian

Alignment of Parallel Texts from Cyrillic to Latin

The text of the novel Sania (Eng. The Sledge) served as a training corpus. It was written in 1955 by Ion Druță and originally printed in Cyrillic script. We applied a previously developed recognition technology together with specialized lexicons. In such a way, we have obtained the electr...

Resource Type:	Corpus
Media Type:	Text
Language:	Romanian

Grafone-LEX

Grafone-LEX is a lexical database for grapheme-to-phoneme conversion.

Resource Type:	Lexical / Conceptual
Media Type:	Text
Language:	Portuguese

LX-UDParser

LX-UDParser is a dependency parser for Portuguese that adopts the Universal Dependencies (UD) framework, with an initial performance of 90.87 for UAS and 88.01 for LAS under a ten-fold cross-validation scheme. It is described in this article: António Branco, João Ricardo Silva, Luís Gomes and João Rodri...

Resource Type:	Tool / Service
Language:	Portuguese

UIMA Apertium Translator

This tool translates text from a source language into a target language. It operates on text that has previously been tokenised, morphologically analysed, and POS-tagged. Target language tokens are assigned POS tags and morphological analyses. The Apertium Translator is a module of Apertium ma...

Resource Type:	Tool / Service
Languages:	Basque
	Catalan
	English
	Galician
	Portuguese
	Spanish

Bilingual Bulgarian-English corpus from the 2018 Proposal for a National Climate Change Adaptation Strategy and Action Plan from the website of the Bulgarian Ministry of Environment and Water (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Bilingual Bulgarian-English corpus from the 2018 Proposa...

Resource Type:	Corpus
Media Type:	Text
Languages:	Bulgarian
	English

QTLeap WSD/NED corpus

This corpus is part of Deliverable 5.5 of the European Commission project QTLeap FP7-ICT-2013.4.1-610516 (http://qtleap.eu). The texts are Q&A interactions from the real-user scenario (batches 1 and 2). The interactions in this corpus are available in Basque, Bulgar...

Resource Type:	Corpus
Media Type:	Text
Languages:	Basque
	Bulgarian
	Czech
	English
	Portuguese
	Spanish; Castilian

Adimen-SUMO v2.6

Adimen-SUMO is an off-the-shelf first-order ontology obtained by reengineering 88% of SUMO (Suggested Upper Merged Ontology). It can be used by first-order (FO) theorem provers such as E-Prover or Vampire for formal reasoning.

Resource Type:	Lexical / Conceptual
Media Type:	Text
Language:	English

AcadOnto

An academic domain ontology populated using the IIT Bombay organization corpus, the web, and linked open data.

Resource Type:	Lexical / Conceptual
Media Type:	Text
Language:	English
