The texts are sentences from the News parallel corpus. They consist of monolingual sentences drawn from parallel corpora for the following language pairs: Basque-English, Bulgarian-English, Czech-English, Portuguese-English and Spanish-English. The English corpus comprises the English side of the Spanis...
Hugging Face (PyTorch) pre-trained RoBERTa model in Portuguese, with 6 layers and 12 attention heads, totaling 68M parameters. Pre-training was done on 10 million Portuguese sentences and 10 million English sentences from the OSCAR corpus. Please cite: Santos, Rodrigo, João Rodrigues, Antóni...
Economic Crisis terms
Terms that have (more or less) recently been accepted and normalised by Termcat, mixed fields
Terms for Digital Marketing
Terms for Social Networks
Terms of Research Thesaurus
Terms for Fairs and Congresses
Industry terms
Terms of Exotic Wood