Extracting keywords from texts has become a challenge for individuals and
organizations as information grows in complexity and size. The need to
automate this task so that texts can be processed in a timely and adequate
manner has led to the emergence of automatic keyword extraction tools. Despite
these advances, there is a clear lack of multilingual online tools that
automatically extract keywords from single documents.
Yake! is a novel feature-based system for multilingual keyword extraction, which
supports texts of different sizes, domains or languages. Unlike most
systems, Yake! relies on neither dictionaries nor thesauri, nor is it trained
on any corpus. Instead, we follow an unsupervised approach that builds
upon features extracted from the text itself, making it applicable to documents
written in different languages without the need for further knowledge. This can
be beneficial for a large number of tasks and situations where
access to training corpora is either limited or restricted.
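To make the idea of scoring terms from features of the text alone more concrete, here is a minimal Python sketch. It is not the Yake! algorithm: the function name `toy_keywords`, the `min_len` filter and the scoring formula are assumptions made purely for illustration. It only shows that candidate words can be ranked from statistics of a single document (here, frequency and the position of first occurrence), with no dictionary, thesaurus or training corpus.

```python
import math
import re
from collections import Counter


def toy_keywords(text, top=5, min_len=4):
    """Rank single words using only statistics of the input text itself.

    Illustrative only: this is NOT the Yake! scoring function.
    """
    # Split into sentences so we can record where each word first appears.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    freq = Counter()
    first_sentence = {}
    for idx, sentence in enumerate(sentences):
        for word in re.findall(r"\w+", sentence.lower()):
            # Crude length filter used in place of a stopword list (assumption).
            if len(word) < min_len or word.isdigit():
                continue
            freq[word] += 1
            first_sentence.setdefault(word, idx)

    # Hypothetical score: frequent words that first appear early rank higher.
    scores = {w: freq[w] / math.log(2 + first_sentence[w]) for w in freq}
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda w: (-scores[w], w))[:top]


sample = (
    "Keyword extraction finds the important terms in a document. "
    "Unsupervised keyword extraction needs no training corpus. "
    "Extraction quality depends on the features."
)
print(toy_keywords(sample, top=2))  # ['extraction', 'keyword']
```

Because the sketch relies only on `\w+` tokenization and counts, it runs unchanged on text in other languages, which is the property the description above emphasizes. The real system combines a richer set of features described in the papers cited below.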
Campos R., Mangaravite V., Pasquali A., Jorge A.M., Nunes C., and Jatowt A. (2018).
"A Text Feature Based Automatic Keyword Extraction Method for Single Documents".
In Pasi G., Piwowarski B., Azzopardi L., Hanbury A. (eds).
Advances in Information Retrieval. ECIR 2018 (Grenoble, France, March 26–29). Lecture Notes in Computer Science, vol 10772, pp. 684–691. ECIR'18 Best Short Paper.
Campos R., Mangaravite V., Pasquali A., Jorge A.M., Nunes C., and Jatowt A. (2018).
"YAKE! Collection-independent Automatic Keyword Extractor".
In Pasi G., Piwowarski B., Azzopardi L., Hanbury A. (eds).
Advances in Information Retrieval. ECIR 2018 (Grenoble, France, March 26–29). Lecture Notes in Computer Science, vol 10772, pp. 806–810.
No fee, attribution, all rights reserved, no redistribution, non commercial, no
warranty, no liability, no endorsement, temporary, non exclusive, share alike.