Maltese Wikipedia

This corpus is part of the Wikipedia Dumps collection, which was retrieved from wikipedia.org on April 8, 2010. It comes as two XML files: one containing the Wikipedia articles and the other containing the associated metadata.

Resource Type: Corpus
Media Type: Text
Language: Maltese

Maltese Wordlist

Wordlist for spell-checking

Resource Type: Lexical / Conceptual
Media Type: Text
Language: Maltese

Termcat Fairs and Congresses

Terms for Fairs and Congresses

Resource Type: Lexical / Conceptual
Media Type: Text
Languages: Catalan; Valencian, English, French, German, Italian, Portuguese, Spanish; Castilian

Bilingual documents Bulgarian-English in the field of open data, broadband and information society (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. English-Bulgarian collection in the field of open data, ...

Resource Type: Corpus
Media Type: Text
Languages: Bulgarian, English

Termcat Exotic Wood

Terms of Exotic Wood

Resource Type: Lexical / Conceptual
Media Type: Text
Languages: Catalan; Valencian, English, French, German, Italian, Portuguese, Spanish; Castilian

Bilingual Bulgarian-English corpus from the National Revenue Agency (BG) (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Bilingual Bulgarian-English corpus of administrative doc...

Resource Type: Corpus
Media Type: Text
Languages: Bulgarian, English

TakeLab Vectors

This resource includes the distributional semantic vectors used for the replication of the TakeLab system (https://github.com/nlx-group/arct-rep-rev). The TakeLab system is an automatic classifier for the Argument Reasoning Comprehension Task (https://www.aclweb.org/anthology/S18-1121/). The ...

Resource Type: Lexical / Conceptual
Media Type: Text
Language: English

Embeddings for Comparative Probing of Lexical Semantics Theories

Embeddings used in: Branco, António, João Rodrigues, Małgorzata Salawa, Ruben Branco and Chakaveh Saedi, 2020. Comparative Probing of Lexical Semantics Theories for Cognitive Plausibility and Technological Usefulness. In Proceedings of the International Conference on Computational Linguistics (C...

Resource Type: Lexical / Conceptual
Media Type: Text
Language: Portuguese

CRPC-Quotations

Database of 2,253 citations extracted from the Corpus de Referência do Português Contemporâneo - CRPC (Reference Corpus of Contemporary Portuguese) and manually revised.
Format: tab-separated file
Fields:
- context number
- source file id
- citation

Resource Type: Corpus
Media Type: Text
Language: Portuguese
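The CRPC-Quotations entry describes a tab-separated file with three fields (context number, source file id, citation). A minimal sketch of how such a file might be read in Python follows. The names `Quotation` and `read_quotations` are hypothetical, the sample rows are invented placeholders rather than data from the resource, and the absence of a header row is an assumption.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class Quotation(NamedTuple):
    context_number: str   # first field: context number
    source_file_id: str   # second field: source file id
    citation: str         # third field: citation text


def read_quotations(handle: TextIO) -> Iterator[Quotation]:
    """Yield one Quotation per line that has exactly three tab-separated fields."""
    # QUOTE_NONE keeps quotation marks inside a citation as literal characters.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 3:
            continue  # skip blank or malformed lines (assumes no header row)
        yield Quotation(*row)


# Placeholder rows invented for illustration; not taken from the resource.
sample = io.StringIO(
    '1\tfile_a\t"Example citation one"\n'
    "2\tfile_b\tExample citation two\n"
)
records = list(read_quotations(sample))
print(len(records), records[0].source_file_id)
```

For a real file, the same function could be called on a handle opened with `open(path, encoding="utf-8", newline="")`, where `path` is a placeholder and the UTF-8 encoding is an assumption to check against the resource's documentation.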
Termcat Neoloteca

Terms that have (more or less) recently been accepted and normalised by Termcat, mixed fields

Resource Type: Lexical / Conceptual
Media Type: Text
Languages: Basque, Catalan; Valencian, English, French, Galician, German, Italian, Latin, Portuguese, Spanish; Castilian
