MalToBi/SPAN Corpus

Audio corpus: 8 subfolders with .wav files, each containing: • 2 sound files containing a read story (“The sun and the wind”, one each by speaker A and speaker B) • 2 sound files each containing 30 read sentences (one each by speaker A and speaker B) • 2 x each of the 30 sentences as a single sound f...

Resource Type:Corpus
Media Type:Audio
Language:Maltese
F_Mona_1/ Spoken Newspaper

108 WAV files of spoken Maltese newspaper texts, subdivided into 12 directories with a variable number of sentences (sometimes: clauses) each. They come together with transcriptions and tables of phoneme durations.

Resource Type:Corpus
Media Type:Audio
Language:Maltese
Maltese Speech Engine Database

Description

Resource Type:Corpus
Media Types:Text
Audio
Language:Maltese
Maltese automatically produced distributional thesaurus

This is an automatically produced distributional thesaurus, which finds words that tend to occur in similar contexts as the target word. It is not a manually constructed thesaurus of synonyms. It was produced by Lexical Computing Ltd on the basis of the MLRS corpus. The file contains lines with...

Resource Type:Lexical / Conceptual
Media Type:Text
Language:Maltese
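
To illustrate what "words that tend to occur in similar contexts" means, here is a minimal Python sketch of the general distributional idea. It is not the method Lexical Computing Ltd used to build this resource; the toy English sentences, the window size, and the function names (`context_vectors`, `cosine`, `most_similar`) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(sentences, window=2):
    """Count, for every word, the words seen within `window` tokens of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def most_similar(target, vectors, top_n=3):
    """Rank all other words by how similar their contexts are to the target's."""
    target_vec = vectors.get(target, Counter())
    scores = [
        (other, cosine(target_vec, vec))
        for other, vec in vectors.items()
        if other != target
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:top_n]


# Invented toy corpus (hypothetical data, not taken from the MLRS corpus).
toy_corpus = [
    ["the", "cat", "drinks", "milk"],
    ["the", "dog", "drinks", "water"],
    ["the", "cat", "eats", "fish"],
    ["the", "dog", "eats", "meat"],
]
vectors = context_vectors(toy_corpus)
neighbours = most_similar("cat", vectors, top_n=1)
print(neighbours)
```

In this toy corpus "dog" ranks first for "cat" because both words share most of their neighbouring words, even though they are not synonyms, which is the distinction the description draws against a manually constructed thesaurus.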
MALTESE AUTOMATIC COLLOCATIONS DICTIONARY

Maltese Automatic Collocations Dictionary. Lexical Computing Limited, October 2012. This is an Automatic Collocations Dictionary produced by Lexical Computing Limited, for delivery to the EU CESAR project. The method is • Take a corpus of the l...

Resource Type:Lexical / Conceptual
Media Type:Text
Language:Maltese
MLSS Sentence Splitter Web Service

The MLSS Sentence Splitter is a web service tool that takes text as input and outputs the identified sentences surrounded by tags. The tool was tuned for Maltese. The download for this resource contains only the narrative description in a Word file. The web service has one method which can ...

Resource Type:Tool / Service
Language:Maltese
Maltese Wordlist

Wordlist for spell-checking

Resource Type:Lexical / Conceptual
Media Type:Text
Language:Maltese
COVID-19 ANTIBIOTIC dataset. Multilingual (CEF languages)

Multilingual (CEF languages) corpus acquired from the website https://antibiotic.ecdc.europa.eu/ . It contains 20981 TUs (in total) for EN-X language pairs, where X is a CEF language.

Resource Type:Corpus
Media Type:Text
Languages:Bokmål, Norwegian; Norwegian Bokmål
Bulgarian
Croatian
Czech
Danish
Dutch; Flemish
English
Estonian
Finnish
French
German
Greek, Modern (1453-)
Hungarian
Icelandic
Irish
Italian
Latvian
Lithuanian
Maltese
Moldavian; Moldovan
Polish
Portuguese
Romanian
Slovak
Slovenian
Spanish; Castilian
Swedish
COVID-19 EU presscorner v2 dataset. Multilingual (CEF languages)

Multilingual (CEF languages) corpus acquired from the website (https://ec.europa.eu/commission/presscorner/) of the EU portal (8th July 2020). It contains 23 TMX files (EN-X, where X is a CEF language) with 151895 TUs in total.

Resource Type:Corpus
Media Type:Text
Languages:Bulgarian
Croatian
Czech
Danish
Dutch; Flemish
English
Estonian
Finnish
French
German
Greek, Modern (1453-)
Hungarian
Irish
Italian
Latvian
Lithuanian
Maltese
Moldavian; Moldovan
Polish
Portuguese
Romanian
Slovak
Slovenian
Spanish; Castilian
Swedish
COVID-19 EUR-LEX dataset. Multilingual (CEF languages)

Multilingual (CEF languages) corpus acquired from the website (https://eur-lex.europa.eu/legal-content) of the EU portal (9th July 2020). It contains 23 TMX files (EN-X, where X is a CEF language) with 475,931 translation unit pairs in total.

Resource Type:Corpus
Media Type:Text
Languages:Bulgarian
Croatian
Czech
Danish
Dutch; Flemish
English
Estonian
Finnish
French
German
Greek, Modern (1453-)
Hungarian
Irish
Italian
Latvian
Lithuanian
Maltese
Moldavian; Moldovan
Polish
Portuguese
Romanian
Slovak
Slovenian
Spanish; Castilian
Swedish
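
For readers who want to check the translation unit (TU) counts of a TMX file such as the EN-X files described in the COVID-19 entries, the following Python sketch may help. It assumes the common TMX layout (`<tu>` elements holding `<tuv>` variants, each with a `<seg>` and an `xml:lang` or `lang` attribute, and no XML namespace on the element names); the function name `read_translation_units` and the sample content are invented for illustration and are not taken from these datasets.

```python
import io
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_translation_units(source):
    """Yield one {language code: segment text} dict per <tu> element.

    `source` may be a file path or a binary file object.
    """
    for _, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "tu":
            continue
        unit = {}
        for tuv in elem.iter("tuv"):
            # Older TMX versions use a plain "lang" attribute instead.
            lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
            seg = tuv.find("seg")
            if seg is not None:
                unit[lang.lower()] = "".join(seg.itertext()).strip()
        yield unit
        elem.clear()  # release memory for large files


# Invented two-unit sample (hypothetical content, not from the corpora).
SAMPLE_TMX = """<?xml version="1.0" encoding="UTF-8"?>
<tmx version="1.4">
  <header srclang="en" datatype="plaintext" segtype="sentence"
          adminlang="en" o-tmf="none" creationtool="sample"
          creationtoolversion="1"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Thank you</seg></tuv>
      <tuv xml:lang="mt"><seg>Grazzi</seg></tuv>
    </tu>
    <tu>
      <tuv xml:lang="en"><seg>Good morning</seg></tuv>
      <tuv xml:lang="mt"><seg>Bonġu</seg></tuv>
    </tu>
  </body>
</tmx>
""".encode("utf-8")

units = list(read_translation_units(io.BytesIO(SAMPLE_TMX)))
print(len(units), "translation units")
```

Passing a file path instead of the in-memory sample should work the same way, and summing the counts over all files of a dataset gives a figure comparable to the totals quoted in the descriptions, provided the files follow the assumed layout.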
