Research Infrastructure for the Science and Technology of Language


The online service (beta) accepts the following options:

Language: Portuguese, Arabic, German, English, Spanish, Finnish, French, Italian, Dutch, Polish, Turkish
Maximum number of words in a keyword
Visualization format: friendly, text, JSON

YAKE's Documentation

YAKE! Collection-independent Automatic Keyword Extractor

Extracting keywords from texts has become a challenge for individuals and organizations as information grows in complexity and size. The need to automate this task, so that texts can be processed in a timely and adequate manner, has led to the emergence of automatic keyword extraction tools. Despite these advances, there is a clear lack of multilingual online tools that automatically extract keywords from single documents. YAKE! is a novel feature-based system for multilingual keyword extraction that supports texts of different sizes, domains, and languages. Unlike most systems, YAKE! does not rely on dictionaries or thesauri, nor is it trained on any corpus. Instead, we follow an unsupervised approach that builds on features extracted from the text itself, which makes it applicable to documents written in different languages without the need for further knowledge. This is beneficial for the many tasks and situations where access to training corpora is limited or restricted.
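To make the idea of corpus-free, feature-based scoring concrete, here is a minimal sketch in Python. It is not the YAKE! algorithm: it is a toy scorer that uses only two features computed from the input text (term frequency and position of first occurrence), and the function name toy_keywords, the stopword list, and the scoring formula are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; not the list used by YAKE!).
STOPWORDS = {
    "a", "an", "and", "the", "of", "to", "in", "is", "for", "from",
    "on", "that", "this", "with", "as", "are", "be", "it",
}


def toy_keywords(text, top=5):
    """Rank single words using only statistics of `text` itself.

    Toy score = frequency / log(2 + index of first occurrence).
    Higher means more relevant here; no dictionary or training corpus is used.
    """
    # Unicode-aware tokenization: runs of letters in any script.
    tokens = [t.lower() for t in re.findall(r"[^\W\d_]+", text)]

    freq = Counter()
    first_pos = {}
    for i, tok in enumerate(tokens):
        if tok in STOPWORDS or len(tok) < 3:
            continue
        freq[tok] += 1
        first_pos.setdefault(tok, i)  # remember the earliest position only

    scores = {w: freq[w] / math.log(2 + first_pos[w]) for w in freq}
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top]


if __name__ == "__main__":
    sample = ("Keyword extraction finds keywords. "
              "Keyword extraction needs no training corpus.")
    for word, score in toy_keywords(sample, top=3):
        print(f"{word}\t{score:.3f}")
```

Because every statistic comes from the document being processed, the same function can be applied to text in another language by swapping the stopword list. YAKE! itself combines more local features than this sketch does.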

Authorship

YAKE! has been developed at LIAAD-INESC TEC by:

Ricardo Campos
Vítor Mangaravite
Arian Pasquali
Alípio M. Jorge
Célia Nunes
Adam Jatowt

Publications

Please cite the following works when using YAKE:

Campos, R., Mangaravite, V., Pasquali, A., Jorge, A., Nunes, C. and Jatowt, A. (2020). "YAKE! Keyword Extraction from Single Documents using Multiple Local Features". In Information Sciences Journal. Elsevier, Vol 509, pp 257-289.
Campos R., Mangaravite V., Pasquali A., Jorge A.M., Nunes C., and Jatowt A. (2018). "A Text Feature Based Automatic Keyword Extraction Method for Single Documents". In Pasi G., Piwowarski B., Azzopardi L., Hanbury A. (eds). Advances in Information Retrieval. ECIR 2018 (Grenoble, France. March 26 – 29). Lecture Notes in Computer Science, vol 10772, pp. 684 - 691.
ECIR'18 Best Short Paper
Campos R., Mangaravite V., Pasquali A., Jorge A.M., Nunes C., and Jatowt A. (2018). "YAKE! Collection-independent Automatic Keyword Extractor". In Pasi G., Piwowarski B., Azzopardi L., Hanbury A. (eds). Advances in Information Retrieval. ECIR 2018 (Grenoble, France. March 26 – 29). Lecture Notes in Computer Science, vol 10772, pp. 806 - 810.

Awards

ECIR'18 Best Short Paper.

Availability

The YAKE! source code is available for download from the PORTULAN CLARIN repository and from GitHub.

License

No fee, attribution, all rights reserved, no redistribution, non commercial, no warranty, no liability, no endorsement, temporary, non exclusive, share alike.

The complete text of this license is here.
