LX Semantic Similarity is an online service for measuring the semantic
similarity between words in Portuguese. The service uses LX-DSemVectors, a distributional semantics model
(also known as word embeddings) of the Portuguese language, which is also available from GitHub.
The model represents each word in its vocabulary by a vector of real numbers.
This vector representation makes it possible to measure the similarity
between two or more words, calculated by means of the cosine distance
between the vectors of those words.
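As a rough illustration of the cosine measure mentioned above, the following Python sketch computes it for toy vectors. The function names and the three-dimensional vectors are invented for this example and are not taken from LX-DSemVectors, whose vectors have many more dimensions. The text does not state whether the service reports the cosine itself or one minus the cosine, so the sketch shows both.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """Return 1 - cosine similarity (0 means same direction)."""
    return 1.0 - cosine_similarity(u, v)


# Toy vectors, made up for illustration only (assumption, not real model data).
gato = [0.9, 0.1, 0.3]
felino = [0.8, 0.2, 0.4]
mesa = [0.1, 0.9, 0.2]

print("gato vs felino:", cosine_similarity(gato, felino))
print("gato vs mesa:  ", cosine_similarity(gato, mesa))
```

With these toy values, the pair pointing in nearly the same direction (gato, felino) scores higher than the unrelated pair (gato, mesa).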
The online service provides two types of search:
Calculating the semantic distance between a pair of words: When two words are entered, the service displays the distance
between them and an interactive 2D plot (t-SNE embedding) with the 200
closest words to each of the input words.
Displaying the semantic cloud of a word: When a single word is entered, the service shows a word cloud
with the 20 most similar words (using a larger font size for words
closer to the input word) and a table with the 20 most similar terms
(and their similarity to the input word).
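The table of most similar terms can be approximated with a simple nearest-neighbour ranking. The Python sketch below assumes the embeddings are held in a plain dictionary; `most_similar`, `TOY_EMBEDDINGS` and the vectors themselves are hypothetical names and values chosen for illustration, and do not describe how the service is actually implemented.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, embeddings, top_n=20):
    """Rank every other word in `embeddings` by cosine similarity to `word`."""
    if word not in embeddings:
        raise KeyError(f"{word!r} is not in the vocabulary")
    target = embeddings[word]
    scored = [
        (other, cosine_similarity(target, vector))
        for other, vector in embeddings.items()
        if other != word  # skip the query word itself
    ]
    # Highest similarity first, then keep only the top_n entries.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy vocabulary, made up for illustration only (assumption, not real model data).
TOY_EMBEDDINGS = {
    "gato": [0.9, 0.1, 0.3],
    "felino": [0.8, 0.2, 0.4],
    "mesa": [0.1, 0.9, 0.2],
    "cadeira": [0.2, 0.8, 0.3],
}

for neighbour, score in most_similar("gato", TOY_EMBEDDINGS, top_n=3):
    print(f"{neighbour}\t{score:.3f}")
```

A word cloud could then map each score to a font size, so that words nearer the input word are drawn larger. Scanning the whole vocabulary as done here is fine for a toy example; a full-size model would more likely rely on vectorised or indexed search.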
No fee, attribution, all rights reserved, no redistribution, non commercial, no
warranty, no liability, no endorsement, temporary, non exclusive, share alike.