The CIEMPIESS Proper-Names Pronouncing Dictionary

Transcriptions in the CIEMPIESS-PNPD are based on a phonetic alphabet called Mexbet. Mexbet was designed for the Spanish of Central Mexico and has several levels of granularity. The CIEMPIESS-PNPD comes in two versions: Mexbet T29 and Mexbet T66. Level T29 of Mexbet means that transcriptions ...

Resource Type:Corpus
Media Type:Text
Language:Spanish; Castilian
PS corpus (Post-Scriptum)-ES

PS Corpus (Post-Scriptum)-ES is a corpus of 2368 informal letters written in Spanish during the Modern Age (from the 16th century to the beginning of the 19th century). Each letter is available as a semi-palaeographic transcription, a modernized transcription, and with part-of-speech a...

Resource Type:Corpus
Media Type:Text
Language:Spanish; Castilian
Alignment of Parallel Texts from Cyrillic to Latin

The text of the novel Sania (eng. The Sledge) served as a training corpus. It was written in 1955 by Ion Druță and originally printed in Cyrillic script. We applied a previously developed recognition technology together with specialized lexicons. In this way, we have obtained the electr...

Resource Type:Corpus
Media Type:Text
Language:Romanian
PsychAnaphora - Types of anaphora produced in a sentence completion task

This set of materials pertains to a study on the production of explicit pronouns, null pronouns, and repeated-NP anaphors in European Portuguese. A spreadsheet containing data from 73 participants (young adults), namely, count data for instances of the different types of anaphor that occurred in...

Resource Type:Language Description
Media Type:Text
Language:Portuguese
Embeddings for Comparative Probing of Lexical Semantics Theories

Embeddings used in: Branco, António, João Rodrigues, Małgorzata Salawa, Ruben Branco and Chakaveh Saedi, 2020. Comparative Probing of Lexical Semantics Theories for Cognitive Plausibility and Technological Usefulness. In Proceedings of the International Conference on Computational Linguistics (C...

Resource Type:Lexical / Conceptual
Media Type:Text
Language:Portuguese
PsychAnaphora - Event related brain potentials from young and older adults

This set of materials pertains to a study on the processing of explicit pronouns in European Portuguese. Forty spreadsheets containing Event Related Potentials, encoded as voltage variations across 64 electrodes over 1.5 s, in two-millisecond steps, are provided, 20 of which pertain to younger ...

Resource Type:Language Description
Media Type:Text
Language:Portuguese
Arquivo Dialetal CLUP - Orthographic and phonetic transcription

Arquivo Dialetal CLUP - ORTH is a speech corpus of approximately 40 000 tokens (utterances of spontaneous speech, mainly from Northern Portugal). It includes orthographic and phonetic transcription.

Resource Type:Corpus
Media Type:Audio
Language:Portuguese
CRPC Discourse Bank v1.0

The CRPC Discourse Bank is labeled for discourse relations (also referred to as rhetorical relations or coherence relations), such as cause and condition, that hold between two spans of text and help ensure the overall cohesion and coherence of the text. The scheme follows the principl...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
