This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. A collection of Polish-English translations of the Polis...
This dataset has been created within the framework of the European Language Resource Coordination (ELRC) Connecting Europe Facility - Automated Translation (CEF.AT) action. For further information on the project: http://lr-coordination.eu. Polish Food is a quarterly issued by the Polish Ministry...
UEvora Tagger is a freely available on-line service for tagging sentences written in Portuguese. This service was developed and is maintained at the University of Évora by the VISTA - Video, Image, Speech, and Text Analysis Group of the Department of Informatics.
Multilingual (CEF languages) corpus acquired from the website (https://ec.europa.eu/commission/presscorner/) of the EU portal (14th May 2020). It contains 23 TMX files (EN-X, where X is a CEF language) with 83,217 translation units (TUs) in total.
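As a rough illustration of how TUs in a TMX file such as those described above might be read and counted, here is a minimal Python sketch using only the standard library. It assumes plain TMX 1.4 markup (`<tu>`, `<tuv xml:lang="...">`, `<seg>`); the function name `read_translation_units` and the file name `en-pt.tmx` are hypothetical and are not part of the resource.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_translation_units(tmx_source):
    """Yield one {language code: segment text} dict per <tu> element.

    tmx_source may be a file path or a binary file object.
    Assumes un-namespaced TMX 1.4 element names (an assumption,
    not something stated in the resource description).
    """
    tree = ET.parse(tmx_source)
    for tu in tree.getroot().iter("tu"):
        segments = {}
        for tuv in tu.iter("tuv"):
            # Older TMX files may use a plain "lang" attribute instead.
            lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested in inline markup.
                segments[lang.lower()] = "".join(seg.itertext()).strip()
        yield segments


if __name__ == "__main__":
    # Hypothetical file name; replace with one of the actual TMX files.
    units = list(read_translation_units("en-pt.tmx"))
    print(len(units), "translation units")
```

Summing the counts over all files of the corpus would give the total number of TUs; the sketch loads each file fully into memory, which may need revisiting for very large files.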
Bilingual (EN-PT) corpus acquired from the website (https://ec.europa.eu/commission/presscorner/) of the EU portal (8th July 2020).
Bilingual (EN-PT) corpus acquired from the website (https://ec.europa.eu/*coronavirus-response) of the EU portal (20th May 2020).
The News corpus developed by LIACC, in JSON format, was complemented with POS and keyword topic annotation.

POS-tagging
===========

The POS-tagging used the tagger described in Généreux et al. (2012). The title and text body were extracted, tokenized and POS-tagged. Two new fields were added...
An NER classifier based on memory-based learning, trained on the CINTIL dataset, a corpus that contains part of the Corpus de Referência do Português Contemporâneo - CRPC (Reference Corpus of Contemporary Portuguese). https://portulanclarin.net/repository/browse/cintil-corpus-internacional-do-por...
LXService is a Web Service that consists of a range of tools that have been developed for the processing of Portuguese. They were selected because they satisfy a number of features that are likely to make them more suitable for initial experimentation: They are fast, robust, the ling...
Bilingual (EN-PT) corpus acquired from the website (https://www.europarl.europa.eu/) of the European Parliament (25th April 2020)