The Radio Bulgaria WSD/NED corpus consists of Bulgarian and English articles from the Radio Bulgaria website.
GREC is a semantically annotated corpus of 240 MEDLINE abstracts (167 on the subject of E. coli species and 73 on the subject of the Human species). It is intended for training IE systems and/or resources used to extract events from biomedical literature.
Hesita-POS is an annotated corpus of TV news.
LX-Battig was created from the Battig test set (Baroni et al., 2010). The data set contains 83 concrete concepts in the following 10 categories: mammals, birds, fish, vegetables, fruit, trees, vehicles, clothes, tools and kitchenware. The category names and the concepts were translated by two trans...