Arquivo Dialetal CLUP - ORTH is a speech corpus approximately with 40 000 tokens (Utterances; spontaneous speech, mainly from Northern Portugal). Orthographic and phonetic transcription.
This is a UIMA component that provides a visualization of speech based output from UIMA workflows. It has been developed at the University of Manchester, using libraries of the Java Speech Toollkit (jstk). It has been designed specifically for use with the U-Compare text mining workbench (see sep...
Audio corpus: 8 subfolders with .wav files Each containing : • 2 sound files containing a read story (“The sun and the wind”, each by speaker A and speaker B) • 2 sound files containing each 30 read sentences (each by speaker A and speaker B) • 2 x each of the 30 sentences as a single sound f...
EmoProsodyPort (see Castro & Lima, 2010) is a speech database with 368 short sentences and pseudosentences with neutral emotional content. Acoustic measurements and behavioral data.
Arquivo Dialetal CLUP - POS is a speech corpus with approximately 40 000 tokens (Utterances; spontaneous speech, mainly from Northern Portugal). Orthographic transcription, POS.
EmoVoicePort, Emotional Vocalization Corpus (see Lima, Castro, & Scott, 2013) is a validated set of nonverbal vocalizations that portray four positive emotions (achievement/triumph, amusement, sensual pleasure, relief) and four negative ones (anger, disgust, fear, sadness). The vocalizations (n =...
Arquivo Dialetal CLUP - Áudio is an audio corpus of spontaneous speech, mainly from Northern Portugal.
CORAA NURC-SP Minimal Corpus is a manually annotated corpus of Brazilian Portuguese spontaneous speech (São Paulo variety). The corpus is a subset of NURC (‘Cultured Linguistic Urban Norm’) project collection, one of the most influential in Brazilian Linguistics. The corpus was brought to digital...
CORP-ORAL is a spontaneous speech corpus for European Portuguese. It is the main output of two R&D projects: CORP-ORAL and ORAL-PHON. The data consist of unscripted and unprompted face-to-face dialogues between family, friends, colleagues and unacquainted participants. All recordings are orthogra...
108 WAV files of spoken Maltese newspaper texts, subdivided into 12 directories with a variable number of sentences (sometimes: clauses) each. They come together with transcriptions and tables of phoneme durations.