# Vector Similarity Visualization

[Run the Vector Similarity Visualization Fullscreen](https://dmccreary.github.io/Digital-Transformation-with-AI-Spring-2026/sims/vector-similarity/main.html)
## About This MicroSim
This visualization demonstrates how word embeddings capture semantic relationships. Words with similar meanings cluster together in the embedding space, and cosine similarity measures how closely related two words are.
## Iframe Embedding

```html
<iframe src="https://dmccreary.github.io/Digital-Transformation-with-AI-Spring-2026/sims/vector-similarity/main.html"
        height="652px"
        width="100%"
        scrolling="no">
</iframe>
```
## How to Use
- Explore Clusters: Notice how semantically related words cluster together
- Click Two Words: Select any two words to calculate their similarity
- Compare Metrics: View cosine similarity and Euclidean distance
- Test Hypotheses: Try words from same vs. different categories
## Understanding Word Embeddings
| Concept | Description |
|---|---|
| Embedding | Dense vector representation of a word |
| Dimension | Number of values in the vector (typically 300-1536) |
| Cosine Similarity | Cosine of the angle between two vectors (typically 0-1 for word pairs; higher means more similar) |
| Semantic Space | Geometric space where meaning is encoded |
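To make the first two rows of the table concrete, the sketch below shows what an embedding looks like as data. The words, values, and 4-dimensional size are invented for illustration; real models produce much longer vectors (typically 300-1536 values per word).

```python
import numpy as np

# Each word maps to a dense vector of floats. These 4-dimensional values
# are made up for illustration; real embedding models use 300-1536 dimensions.
embeddings = {
    "car":        np.array([0.81, 0.10, 0.05, 0.42]),
    "automobile": np.array([0.78, 0.12, 0.07, 0.45]),
    "banana":     np.array([0.02, 0.88, 0.61, 0.09]),
}

for word, vector in embeddings.items():
    print(f"{word:<12} dimensions={len(vector)} values={vector}")
```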
## Cosine Similarity

Cosine similarity is the cosine of the angle between two vectors: it captures how closely their directions align, regardless of vector length. Values near 1 mean the two words point in nearly the same direction in the embedding space:
| Value | Interpretation |
|---|---|
| 0.8 - 1.0 | Very similar (synonyms, same category) |
| 0.6 - 0.8 | Related concepts |
| 0.4 - 0.6 | Loosely related |
| 0.0 - 0.4 | Unrelated or opposite |
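A minimal sketch of how these values are computed, reusing the invented 4-dimensional vectors from above (a real system would get its vectors from an embedding model). Cosine similarity is the dot product of the two vectors divided by the product of their lengths; the Euclidean distance shown alongside it in the MicroSim is simply the straight-line distance between the two vector endpoints, which, unlike cosine similarity, is sensitive to vector length.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # cos(theta) = (a . b) / (||a|| * ||b||)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def euclidean_distance(a: np.ndarray, b: np.ndarray) -> float:
    # Straight-line distance between the two vector endpoints.
    return float(np.linalg.norm(a - b))

# Invented vectors: two same-category words and one unrelated word.
car        = np.array([0.81, 0.10, 0.05, 0.42])
automobile = np.array([0.78, 0.12, 0.07, 0.45])
banana     = np.array([0.02, 0.88, 0.61, 0.09])

print(cosine_similarity(car, automobile))   # ~0.99 -> very similar (same category)
print(cosine_similarity(car, banana))       # ~0.17 -> unrelated
print(euclidean_distance(car, automobile))  # small distance for the similar pair
```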
## Why Semantic Search Outperforms Keyword Matching
| Keyword Search | Semantic Search |
|---|---|
| Requires exact word match | Finds conceptually similar content |
| "car" won't find "automobile" | "car" finds "automobile", "vehicle" |
| Fails with synonyms | Understands synonymy |
| No context understanding | Captures meaning |
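The sketch below illustrates that contrast with two toy documents and hard-coded stand-in vectors (a real system would embed both the query and the documents with the same embedding model): an exact keyword match on "car" finds nothing, while ranking by cosine similarity still surfaces the document that only says "automobile".

```python
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

documents = {
    "doc1": "Our automobile repair shop is open on weekends.",
    "doc2": "Bananas are an excellent source of potassium.",
}

# Keyword search: an exact-word match on "car" finds nothing.
query = "car"
keyword_hits = [d for d, text in documents.items() if query in text.lower().split()]
print(keyword_hits)  # [] -- "car" never appears verbatim

# Semantic search: compare vectors instead of raw words.
# These vectors are invented placeholders for real model output.
doc_vectors = {
    "doc1": np.array([0.79, 0.11, 0.06, 0.44]),  # about automobiles
    "doc2": np.array([0.03, 0.85, 0.64, 0.10]),  # about bananas
}
query_vector = np.array([0.81, 0.10, 0.05, 0.42])  # stand-in embedding of "car"

ranked = sorted(doc_vectors,
                key=lambda d: cosine_similarity(query_vector, doc_vectors[d]),
                reverse=True)
print(ranked[0])  # doc1 -- retrieved despite sharing no keyword with the query
```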
## Learning Objectives
After using this tool, students should be able to:
- Understand (Bloom's L2): Explain how vector similarity captures semantic relationships
- Apply (Bloom's L3): Interpret cosine similarity values
- Analyze (Bloom's L4): Compare semantic search with keyword matching
## Lesson Plan

### Activity 1: Cluster Analysis (10 minutes)
- Identify the 5 semantic clusters in the visualization
- Predict which words will have highest similarity
- Test your predictions by clicking word pairs
### Activity 2: Cross-Category Comparison (15 minutes)
- Find the highest similarity between words in DIFFERENT categories
- Find the lowest similarity between words in the SAME category
- Explain the results
## Discussion Questions
- Why do words in the same category have higher similarity?
- What business problems can semantic search solve that keyword search cannot?
- How does embedding quality affect RAG system performance?
## Applications in RAG Systems
| Component | Role of Embeddings |
|---|---|
| Document Chunking | Split documents into embeddable segments |
| Vector Storage | Store embeddings in vector database |
| Query Embedding | Convert user query to same vector space |
| Retrieval | Find chunks with highest similarity to query |
| Context Assembly | Provide relevant chunks to LLM |
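A minimal end-to-end sketch of that pipeline, with invented chunk text and placeholder vectors standing in for real embedding-model output and a real vector database:

```python
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# 1-2. Chunking and vector storage: each chunk is stored with its embedding.
# The text and 4-dimensional vectors are invented placeholders; in practice the
# vectors come from an embedding model and live in a vector database.
chunk_store = [
    {"text": "Refunds are processed within 14 days.",             "vector": np.array([0.90, 0.10, 0.20, 0.10])},
    {"text": "Our headquarters are located in Minneapolis.",      "vector": np.array([0.10, 0.80, 0.30, 0.20])},
    {"text": "Returned items must include the original receipt.", "vector": np.array([0.80, 0.20, 0.30, 0.10])},
]

# 3. Query embedding: the user question is mapped into the same vector space.
query = "How long does it take to get my money back?"
query_vector = np.array([0.85, 0.15, 0.25, 0.10])  # placeholder for the embedded query

# 4. Retrieval: rank chunks by cosine similarity to the query and keep the top k.
top_k = sorted(chunk_store,
               key=lambda chunk: cosine_similarity(query_vector, chunk["vector"]),
               reverse=True)[:2]

# 5. Context assembly: the retrieved chunks become the context given to the LLM.
context = "\n".join(chunk["text"] for chunk in top_k)
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)
```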
## Related Concepts
- Chapter 5: Custom GPTs, Agents, and RAG Systems
- Vector Database
- Retrieval Augmented Generation
- Embedding Models
## References
- Mikolov, T., et al. (2013). Efficient Estimation of Word Representations in Vector Space. ICLR.
- Pennington, J., Socher, R., & Manning, C. (2014). GloVe: Global Vectors for Word Representation. EMNLP.
- Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. EMNLP.
## Self-Assessment Quiz
Test your understanding of vector similarity and word embeddings.
Question 1: What is a word embedding?
- A) A physical object embedded in text
- B) A dense numerical vector that represents the meaning of a word
- C) A type of font style
- D) A grammar checking tool
Answer
B) A dense numerical vector that represents the meaning of a word - Word embeddings convert words into multi-dimensional vectors where semantic relationships are preserved as geometric relationships.
Question 2: What does cosine similarity measure?
- A) The physical distance between two objects
- B) The angle between two vectors, indicating how similar their directions are
- C) The size of two vectors
- D) The color difference between vectors
Answer
B) The angle between two vectors, indicating how similar their directions are - Cosine similarity measures the cosine of the angle between vectors, with values closer to 1 indicating more similar meanings.
Question 3: In a well-trained embedding space, what happens to words with similar meanings?
- A) They are placed far apart
- B) They cluster together in the vector space
- C) They are deleted
- D) They become identical
Answer
B) They cluster together in the vector space - Words with similar meanings (like "car" and "automobile") are positioned near each other in the embedding space.
Question 4: Why does semantic search outperform keyword matching?
- A) Semantic search is always faster
- B) Semantic search finds conceptually similar content even without exact word matches
- C) Keyword matching is illegal
- D) Semantic search uses less computing power
Answer
B) Semantic search finds conceptually similar content even without exact word matches - Semantic search using embeddings can find documents about "automobiles" when searching for "cars" because it understands meaning, not just word presence.
Question 5: How are vector embeddings used in RAG (Retrieval Augmented Generation) systems?
- A) They are not used in RAG
- B) They enable finding relevant document chunks based on semantic similarity to user queries
- C) They replace the language model
- D) They generate random content
Answer
B) They enable finding relevant document chunks based on semantic similarity to user queries - RAG systems embed both documents and queries into the same vector space, then retrieve chunks with high similarity to provide relevant context to the LLM.