LX Semantic Similarity
https://hdl.handle.net/21.11129/0000-000E-5986-7 (persistent URL to this page)
LX Semantic Similarity is an online service for measuring the semantic similarity between words in Portuguese. The service uses LX-DSemVectors, a distributional semantic model (also known as word embeddings) of the Portuguese language.
The model represents each word in its vocabulary as a vector of real numbers. This vector representation makes it possible to measure the similarity between two or more words, computed from the cosine distance between the vectors of those words.
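As a rough illustration of the measure described above, the following sketch computes cosine similarity and cosine distance between word vectors. It is not the service's own code: the function names and the three-dimensional toy vectors are hypothetical, and defining distance as one minus similarity is a common convention assumed here, not something stated for LX-DSemVectors.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """Assumed convention: distance = 1 - similarity."""
    return 1.0 - cosine_similarity(u, v)


# Hypothetical toy vectors; a real model uses hundreds of dimensions.
TOY_VECTORS = {
    "gato": [0.9, 0.1, 0.3],
    "cão": [0.8, 0.2, 0.4],
    "carro": [0.1, 0.9, 0.2],
}

if __name__ == "__main__":
    for other in ("cão", "carro"):
        d = cosine_distance(TOY_VECTORS["gato"], TOY_VECTORS[other])
        print(f"gato / {other}: distance {d:.3f}")
```

With these made-up vectors, "gato" ends up closer to "cão" than to "carro", which is the kind of ordering a distributional model is expected to produce.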
The online service provides two types of search:
- Calculating the semantic distance between a pair of words: When two words are entered, the service displays the distance between them and an interactive 2D plot (t-SNE embedding) with the 200 closest words to each of the input words.
- Displaying the semantic cloud of a word: When a single word is entered, the service shows a word cloud with the 20 most similar words (words closer to the input word appear in a larger font) and a table with the 20 most similar terms, each with its similarity to the input word.
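The second type of search amounts to a nearest-neighbour lookup over the vocabulary. A minimal sketch of that idea, assuming a plain dictionary of word vectors and a brute-force scan, is shown below. The helper names, the toy vectors, and the scan itself are illustrative assumptions; the way the service actually retrieves neighbours is not described here.

```python
import math


def _cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def most_similar(word, vectors, k=20):
    """Return up to k (word, similarity) pairs, most similar first.

    `vectors` is a hypothetical dict mapping each word to its vector.
    """
    if word not in vectors:
        raise KeyError(f"{word!r} is not in the vocabulary")
    target = vectors[word]
    scored = [
        (other, _cosine(target, vec))
        for other, vec in vectors.items()
        if other != word  # the query word itself is excluded
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy vocabulary; values are made up for illustration.
TOY_VOCAB = {
    "gato": [0.9, 0.1, 0.3],
    "cão": [0.8, 0.2, 0.4],
    "carro": [0.1, 0.9, 0.2],
    "fruta": [0.2, 0.3, 0.9],
}

if __name__ == "__main__":
    for neighbour, similarity in most_similar("gato", TOY_VOCAB, k=2):
        print(f"{neighbour}\t{similarity:.3f}")
```

The returned similarity scores could then drive both outputs the service describes: the rows of the table, and a font size that grows with similarity in the word cloud.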
This service was developed by the NLX-Natural Language and Speech Group at the Department of Informatics of the University of Lisbon.