Lexicon of discourse markers for European Portuguese

The lexicon of discourse markers for European Portuguese contains 252 pairs of discourse marker/rhetorical sense. The lexicon covers conjunctions, prepositions, adverbs, adverbial phrases and alternative lexicalizations with a connective function, as in the PDTB (Prasad et al., 2008; Prasad et al...

Resource Type: Lexical / Conceptual
Media Type: Text
Language: Portuguese

LEX-MWE-PT: Word Combination in Portuguese Language

This lexicon includes multiword expressions (MWE) of European Portuguese extracted from a balanced 50.8M-word written corpus, a subcorpus of the Reference Corpus of Contemporary Portuguese (CRPC). This corpus covers different genres and consists mainly of journalistic texts (59%), but it a...

Resource Type: Lexical / Conceptual
Media Type: Text
Language: Portuguese

LX-Abbreviations

The LX-Abbreviations resource is a collection of European Portuguese abbreviations of different types, comprising 208 words. The entries are manually divided by abbreviation type and annotated with grammatical category, gender and number, and, finally, with the respective abbreviations.

Resource Type: Lexical / Conceptual
Media Type: Text
Language: Portuguese

LX-DSemVectors

LX-DSemVectors is a distributional lexical semantics model, also known as word embeddings, for Portuguese (Rodrigues et al., 2016). This version, 2.2b, was trained on a corpus of 2 billion tokens and achieved state-of-the-art results on multiple lexical semantic tasks (Rodrigues & Branco, 2018). ...

Resource Type: Lexical / Conceptual
Media Type: Text
Language: Portuguese

LX-Stopwords

The LX-Stopwords resource is a manually compiled list of 2631 Portuguese words of 51 types. The words are grouped into three broad classes, arranged according to their morpho-syntactic category and inflectional feature value (closed classes, open classes, and multi-word units). This list was ...

Resource Type: Lexical / Conceptual
Media Type: Text
Language: Portuguese

Maltese automatically produced distributional thesaurus

This is an automatically produced distributional thesaurus, which finds words that tend to occur in contexts similar to those of the target word. It is not a manually constructed thesaurus of synonyms. It was produced by Lexical Computing Ltd on the basis of the MLRS corpus. The file contains lines with...

Resource Type: Lexical / Conceptual
Media Type: Text
Language: Maltese

Maltese Automatic Collocations Dictionary

Maltese Automatic Collocations Dictionary. Lexical Computing Limited, October 2012. This is an Automatic Collocations Dictionary produced by Lexical Computing Limited for delivery to the EU CESAR project. The method is: • Take a corpus of the l...

Resource Type: Lexical / Conceptual
Media Type: Text
Language: Maltese

Maltese Fiction Wordlist

This is a wordlist created from 32 Maltese fiction books. The texts were originally in PDF format and were converted to txt format. Next, the text file was tokenized and a frequency count was performed on the separate tokens. The resulting list (with about 50,000 entr...

Resource Type: Lexical / Conceptual
Media Type: Text
Language: Maltese

Maltese Speech Engine Lexicon

This lexicon is a speech lexicon, exported from Crimsonwing’s text-to-speech (TTS) database into a .txt file. In its original form and together with the Maltese Speech Engine Diphone repository, it was used for building Crimsonwing’s text-to-speech system. The file is in txt format, with each ...

Resource Type: Lexical / Conceptual
Media Type: Text
Language: Maltese

Maltese Wiktionary

This lexicon is part of the Wikimedia Dumps collection and was retrieved as an XML file from http://dumps.wikimedia.org/mtwiktionary/20121105/ on November 5, 2012. In the Wikimedia dump, it is accompanied by a text file mtwiktionary-20121105-pages-articles-multistream-index.txt which li...

Resource Type: Lexical / Conceptual
Media Type: Text
Language: Maltese
