MLRS Corpus

142,397 Maltese texts from 10 genres. The file “corpus.zip” expands into a folder “corpus”, containing the file “tagged.zip”, which expands into the folder “cwb.final”. This folder contains the files: • filelist.txt • malti02.academic.txt • malti02.law.txt • malti02.literature.txt • malti...

Resource Type:Corpus
Media Type:Text
Language:Maltese
Maltese Wiktionary

This lexicon is part of the collection of the Wikimedia Dumps which was retrieved as an XML file from http://dumps.wikimedia.org/mtwiktionary/20121105/ on November 5, 2012. In the Wikimedia dump, it is accompanied by a text file mtwiktionary-20121105-pages-articles-multistream-index.txt which li...

Resource Type:Lexical / Conceptual
Media Type:Text
Language:Maltese
Maltese automatically produced distributional thesaurus

This is an automatically produced distributional thesaurus, which finds words that tend to occur in contexts similar to those of the target word. It is not a manually constructed thesaurus of synonyms. It was produced by Lexical Computing Ltd on the basis of the MLRS corpus. The file contains lines with...

Resource Type:Lexical / Conceptual
Media Type:Text
Language:Maltese
MLSS Tokeniser Web Service

The web service is a tool which takes text as input and returns a list of tokens. The tokens can be orthographical words, numerals and punctuation marks. The tokeniser was designed to work on Maltese texts. The download for this resource only contains the narrative description in a Word file. ...

Resource Type:Tool / Service
Language:Maltese
MLSS Paragraph Splitter Web Service

The paragraph splitter is a web service tool which takes text as input and outputs the identified paragraphs surrounded by tags. The tool is language independent. The download for this resource only contains the narrative description in a Word file. The service has one method which can be invo...

Resource Type:Tool / Service
Language:Maltese
MLSS Tagger Web Service

The part of speech tagger for Maltese is based on TnT, the statistical part of speech tagger by Thorsten Brants (http://www.coli.uni-saarland.de/~thorsten/tnt/). It was modified for the Maltese Language Resource Server (MLRS) by Albert Gatt (Linguistics Department, University of Malta). The mode...

Resource Type:Tool / Service
Language:Maltese
COVID-19 EUR-LEX dataset. Multilingual (CEF languages)

Multilingual (CEF languages) corpus acquired from the website (https://eur-lex.europa.eu/legal-content) of the EU portal (9th July 2020). It contains 23 TMX files (EN-X, where X is a CEF language) with 475,931 translation unit pairs in total.

Resource Type:Corpus
Media Type:Text
Languages:Bulgarian
Croatian
Czech
Danish
Dutch; Flemish
English
Estonian
Finnish
French
German
Greek, Modern (1453-)
Hungarian
Irish
Italian
Latvian
Lithuanian
Maltese
Moldavian; Moldovan
Polish
Portuguese
Romanian
Slovak
Slovenian
Spanish; Castilian
Swedish
Maltese Acquis Communautaire

This is the Maltese version of the Acquis Communautaire (AC), which is the total body of European Union (EU) law applicable in the EU Member States. It consists of selected texts from the 1950s to today, translated into Maltese.

Resource Type:Corpus
Media Type:Text
Language:Maltese
Laws of Malta - Maltese

The corpus contains the Laws of Malta in Maltese from the official government website. The unannotated raw text files were extracted from the PDF files that can be found on the website.

Resource Type:Corpus
Media Type:Text
Language:Maltese
MFSA_Maltese_Company_Registry

List of companies with further information

Resource Type:Lexical / Conceptual
Media Type:Text
Languages:English
Maltese
