Maltese Speech Engine Database

Description

Resource Type:Corpus
Media Types:Text
Audio
Language:Maltese
Maltese Fiction Wordlist

This is a wordlist which was created from 32 Maltese fiction books. These texts were originally in PDF file format and were converted to txt format. In the next step, the text file was tokenized and a frequency count was performed on the separate tokens. The resulting list (with about 50,000 entr...

Resource Type:Lexical / Conceptual
Media Type:Text
Language:Maltese
MALTESE AUTOMATIC COLLOCATIONS DICTIONARY

Maltese Automatic Collocations Dictionary =========================================== Lexical Computing Limited, October 2012 This is an Automatic Collocations Dictionary produced by Lexical Computing Limited, for delivery to the EU CESAR project. The method is • Take a corpus of the l...

Resource Type:Lexical / Conceptual
Media Type:Text
Language:Maltese
Maltese automatically produced distributional thesaurus

This is an automatically produced distributional thesaurus, which finds words that tend to occur in similar contexts as the target word. It is not a manually constructed thesaurus of synonyms. It was produced by Lexical Computing Ltd on the basis of the MLRS corpus. The file contains lines with...

Resource Type:Lexical / Conceptual
Media Type:Text
Language:Maltese
Maltese Acquis Communautaire

This is the Maltese version of the Acquis Communautaire (AC), which is the total body of European Union (EU) law applicable in the EU Member States. It consists of selected texts between the 1950s and today, translated to Maltese.

Resource Type:Corpus
Media Type:Text
Language:Maltese
LX-WordSim-353

The LX-WordSim-353 was created from WordSim-353 (Agirre et al., 2009). As the name suggests, this data set contains 353 pairs of words. Both words in each pair can have different morphosyntactic categories. The data set is made of nouns, adjectives, verbs and named entities, and has no multiwords...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
LX-Stopwords

LX-Stopwords resource is a manual list of words from Portuguese composed by 2631 words of 51 types. The words are grouped in three big classes, arranged according to their morpho-syntactic category and inflectional feature value (closed classes, open classes, and multi-word units). This list was ...

Resource Type:Lexical / Conceptual
Media Type:Text
Language:Portuguese
LX-SimLex-999

The LX-SimLex-999 was created from SimLex-999 (Hill et al., 2015) which, in turn, was based in the University of South Florida Free Association Database (USF) (Nelson et al., 2014). There were strict guidelines to create SimLex-999. Both words in each pair have the same morphosyntactic category ...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
LX-Rare Word Similarity Dataset

The LX-Rare Word Similarity Data set was created from Stanford Rare Word (RW) Similarity data set (Luong et al., 2013). This list contains 2 034 words (1 017 pairs of words). All the words were extracted from Wikipedia and from WordNet (Miller, 1995), a lexical database where the concepts are gro...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
LX-LR4DistSemEval

A collection of language resources for the evaluation of distributional semantic models of Portuguese: LX-SimLex-999: http://metashare.metanet4u.eu/go2/lx-simlex-999 LX-Rare Word Similarity Data set: http://metashare.metanet4u.eu/go2/lx-rare-word-similarity-dataset LX-WordSim-353: h...

Resource Type:Corpus
Media Type:Text
Language:Portuguese

Order by:

Filter by: