BioLexicon

The BioLexicon is a large-scale, wide-coverage computational lexicon covering the biomedical domain. A large part of the lexicon is concerned with covering biomedical terms and their variants. Entries for domain-specific verbs include syntactic and semantic information. The lexicon includes entri...

Resource Type:Corpus
Media Type:Text
Language:English
Blacklist Classifier

A language identifier for closely related languages.

Resource Type:Tool / Service
Languages:Bosnian, Croatian, Czech, Portuguese, Serbian, Slovak
BMI Brochures 2011-2015 (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. English translations of German BMI brochures from the la...

Resource Type:Corpus
Media Type:Text
Languages:English, German
BMI Brochures and Website 2016 (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Bilingual tmx file of German to English translations of ...

Resource Type:Corpus
Media Type:Text
Languages:English, German
BMVI Publications (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. TMX file with 11555 TUs, bilingual German/English, publi...

Resource Type:Corpus
Media Type:Text
Languages:English, German
BMVI Website (Processed)

This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. TMX file, 2718 TUs, bilingual German/English, texts from...

Resource Type:Corpus
Media Type:Text
Languages:English, German
Brands.Br – a Portuguese Reviews Corpus

The Brands.Br corpus was built from a fraction of the B2W-Reviews01 corpus. We use a set of 252 samples selected by B2W to be enriched. With the Brands.Br corpus we aim to address two main challenges in product review corpora. The first: it is very common to find customer reviews referring to distinct thing...

Resource Type:Corpus
Media Type:Text
Language:Portuguese
Bulgarian-English Wikipedia WSD/NED corpus

Bulgarian-English Wikipedia WSD/NED corpus is composed of articles from the Bulgarian version of Wikipedia and their English counterparts.

Resource Type:Corpus
Media Type:Text
Languages:Bulgarian, English
Burst-Annotated Co-Occurrence Network for the Arab Spring Domain

A burst-annotated co-occurrence network about the Arab Spring, built on top of New York Times article snapshots from 2010-2013.

Resource Type:Corpus
Media Type:Text
Language:American English
Carolina: General Corpus of Contemporary Brazilian Portuguese with provenance and typology information

Carolina is an open corpus for Linguistics and Artificial Intelligence with a robust volume of texts of varied typology in contemporary Brazilian Portuguese (1970-2021).

Resource Type:Corpus
Media Type:Text
Language:Brazilian Portuguese
