DVPM-browser

DVPM-browser is a browser for the DVPM lexical database of medieval Portuguese.

Resource Type:Tool / Service
CINTIL-Corpus Internacional do Português

CINTIL-Corpus Internacional do Português is a linguistically interpreted corpus of Portuguese. At present it is composed of 1 million annotated tokens, verified by human expert annotators. The annotation comprises information on part-of-speech, open-class lemma and inflection, multi-word expres...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
GENIA Event Corpus with meta-knowledge annotation

The corpus consists of 1000 MEDLINE abstracts. It is a subset of the original GENIA POS & term corpus, selected using the three MeSH terms human, blood cells and transcription factors. In each sentence, three types of information are annotated: 1) biomedical terms are identified and assi...

Resource Type:Corpus
Media Type:Text
Language:English
Croatian-English corpus with the Rural Development Programme for the Period 2014-2020 from the Croatian Rural Development Programme website (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Croatian-English corpus with the Rural Development Progr...

Resource Type:Corpus
Media Type:Text
Languages:Croatian, English
PS corpus (Post-Scriptum)-PT

PS Corpus (Post-Scriptum)-PT is a corpus of 2215 informal letters written in Portuguese during the Modern Age (from the 16th century to the beginning of the 19th century). Each letter is available as a semi-palaeographic transcription, a modernized transcription, and with part-of-speec...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
DiZer

DiZer 2.0 is a web interface for discourse parsing. It is based on DiZer (Pardo and Nunes, 2008), the first discourse parser for Brazilian Portuguese. The system aims at producing the discourse structure of a source text following the Rhetorical Structure Theory – RST (Mann and Thompson, 1987), o...

Resource Type:Tool / Service
Language:Portuguese
The Coimisineir Teanga Bilingual Corpus of Reference Documents (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. General Reference content from the Language Commissioner...

Resource Type:Corpus
Media Type:Text
Languages:English, Irish
Lemmatizer for Portuguese

Based on the MXPOST part-of-speech tagger and UNITEX dictionaries for Portuguese, this tool produces the lemmas of the words in a text stored in a plain text file. The source code is also provided.

Resource Type:Tool / Service
Language:Portuguese
AcadOnto

An academic domain ontology populated using the IIT Bombay organization corpus, the web, and linked open data.

Resource Type:Lexical / Conceptual
Media Type:Text
Language:English
Croatian-English corpus with statistical reports and studies from the Croatian Bureau of Statistics website (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Croatian-English corpus with statistical reports and stu...

Resource Type:Corpus
Media Type:Text
Languages:Croatian, English
