MLSS Paragraph Splitter Web service

The paragraph splitter is a web service that takes text as input and outputs the identified paragraphs surrounded by tags. The tool is language-independent. The download for this resource contains only the narrative description in a Word file. The service has one method which can be invo...

Resource Type:Tool / Service
Language:Maltese
English-Lithuanian EASTIN-CL Multilingual Ontology of Assistive Technology (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. EASTIN-CL Multilingual Ontology of Assistive Technology ...

Resource Type:Lexical / Conceptual
Media Type:Text
Languages:English
Lithuanian
Polish Ministry of Foreign Affairs Youth 2011 Report (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. A parallel Polish-English version of the Youth 2011 repo...

Resource Type:Corpus
Media Type:Text
Languages:English
Polish
Parallel corpus (en-pl) from the Export Promotion Portal of Poland (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. A parallel corpus constructed from data acquired from th...

Resource Type:Corpus
Media Type:Text
Languages:English
Polish
Macroeconomic Developments (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Bulletins of Macroeconomic Developments

Resource Type:Corpus
Media Type:Text
Languages:English
Greek, Modern (1453-)
STEPP Tagger

Part-of-speech tagger tuned to biomedical text, provided as a web service.

Resource Type:Tool / Service
Language:English
FORMA

FORMA is a probabilistic tool for morphological tagging and lemmatization of text. The purpose of this tool is to obtain annotated text to be processed by other NLP tools (see Gonzalez et al., 2006).

Resource Type:Tool / Service
Parallel corpus (Greek - English) in the law domain (Processed) (Part1)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Parallel (el-en) corpus of 1979 translation units in the...

Resource Type:Corpus
Media Type:Text
Languages:English
Greek, Modern (1453-)
English-Swedish parallel corpus from the translation of 'Sweden a Pocket Guide' book (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. A guide for foreigners who move to Sweden. Source langua...

Resource Type:Corpus
Media Type:Text
Languages:English
Swedish
Carolina: General Corpus of Contemporary Brazilian Portuguese with provenance and typology information

Carolina is an open corpus for Linguistics and Artificial Intelligence with a robust volume of texts of varied typology in contemporary Brazilian Portuguese (1970-2021).

Resource Type:Corpus
Media Type:Text
Language:Brazilian Portuguese
