Memories in our brains are not stable; they are reconsolidated and shift in importance each time they are retrieved.
What you had for lunch 4 days ago might be hard to recall right now.
But if you just had a discussion with a friend about that lunch, you are much more likely to remember it. It might even pop into your head while you are thinking of something related.
Current methods of RAG, especially in the AI agent space, try to mimic the process of retrieving related information through algorithms such as cosine similarity & Euclidean distance (good explanation here).
However, once encoded by an embedding model, these embeddings are static and sit in some sort of vector database (I personally like Pinecone). This is not what happens in our brains; memories shift in importance based on strength & recency.
What if we correct this through a simple change?
Proposal
Let’s add a value, ε, to each embedding vector that represents the strength of that memory. We will incorporate this value into a function such as cosine similarity to boost the priority of certain memories.
To keep ε under control, we calculate it using the mean and standard deviation of vector A. Here is how to generate this value in Python:
from numpy import mean, std

def generate_strength(a, alpha=0.5, beta=0.5):
    # strength: ε = α * (mean(A) + β * std(A))
    epsilon = alpha * (mean(a) + (beta * std(a)))
    return epsilon
Here, alpha and beta are just dials to control how strong the bias should be.
Alpha scales the whole strength value, while beta controls how much the standard deviation contributes to it.
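To get a feel for the dials, here is a tiny sanity check using the generate_strength function above on a made-up 3-number vector (not a real embedding; the values in the comments are approximate):
toy = [0.1, 0.2, 0.3]                     # made-up stand-in for an embedding
print(generate_strength(toy))             # defaults alpha=0.5, beta=0.5 -> ~0.12
print(generate_strength(toy, alpha=1.0))  # larger alpha scales the whole value -> ~0.24
print(generate_strength(toy, beta=0.0))   # beta=0 drops the std term -> 0.5 * mean = 0.10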
Normal Cosine Similarity
Now, we will use this in a cosine similarity function. This is what the standard cosine similarity function looks like.
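For reference, this is the standard definition the code below implements (A and B are the two embedding vectors):
cos_sim(A, B) = (A · B) / (‖A‖ · ‖B‖)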
Implementation in Python:
from numpy import dot
from numpy.linalg import norm

def get_cosine_normal(a, b):
    cos_sim = dot(a, b) / (norm(a) * norm(b))
    return cos_sim
Enhanced Cosine Similarity
To this, we can then add the strength value that we calculated earlier:
Implementation in Python:
def get_cosine(a: tuple, b: tuple):
    # each memory is a tuple: (embedding vector, strength value ε)
    a_values = a[0]
    b_values = b[0]
    a_bias = a[1]
    b_bias = b[1]
    # the scalar strength is added to every component before the dot product,
    # while the denominator keeps the norms of the original vectors
    cos_sim = dot(a_values + a_bias, b_values + b_bias) / (norm(a_values) * norm(b_values))
    return cos_sim
Notice how we now represent memories as tuples of (embedding, strength).
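Written out, the score the enhanced function computes is (the scalar ε is added to every component of its vector):
enhanced_cos(A, B) = ((A + ε_A) · (B + ε_B)) / (‖A‖ · ‖B‖)
Only the numerator changes; the denominator keeps the norms of the original vectors, which is why the enhanced score is no longer capped at 1.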
Proposal Conclusion
Mathematically, higher values of ε should increase the cosine similarity score of vector A against any other vector, meaning it will be prioritized in embedding retrieval. This is more in accordance with how our brains record and retrieve memories; we are more likely to remember certain things than others. That’s why there are things like traumas and obsessions.
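A rough sketch of why (treating ε as a scalar added to each of the n components of its vector):
(A + ε_A) · (B + ε_B) = A · B + ε_A · sum(B) + ε_B · sum(A) + n · ε_A · ε_B
The denominator ‖A‖ · ‖B‖ stays fixed, so whenever the extra terms are positive (positive ε values and positive component sums), a larger ε_A pushes A’s score up against every query it is compared to.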
This method of enhancing retrieval techniques can also be useful in the real world, outside of trying to play Frankenstein. E.g. Perplexity could strengthen certain reliable websites so they get retrieved more often than others which might lack in quality (cough Reddit posts cough).
Testing
Here is a test that I did with these two strings:
text1 = "This is a foo bar sentence ."
text2 = "This is a foo bar close to text 1"
Here are the results:
You can try it yourself (use your own key!):
import openai
import numpy as np
from numpy import dot, mean, std
from numpy.linalg import norm

client = openai.OpenAI(api_key='key-here')

def get_cosine_normal(a, b):
    cos_sim = dot(a, b) / (norm(a) * norm(b))
    return cos_sim

def get_cosine_enhanced(a: tuple, b: tuple):
    a_values = a[0]
    b_values = b[0]
    a_bias = a[1]
    b_bias = b[1]
    cos_sim = dot(a_values + a_bias, b_values + b_bias) / (norm(a_values) * norm(b_values))
    return cos_sim

def generate_strength(a, alpha=0.5, beta=0.5):
    # ε = α * (mean(A) + β * std(A))
    epsilon = alpha * (mean(a) + (beta * std(a)))
    return epsilon

def text_to_vector(text):
    response = client.embeddings.create(
        input=text,
        model="text-embedding-3-small"
    )
    # return a numpy array so the scalar strength broadcasts across all components
    return np.array(response.data[0].embedding)

text1 = "This is a foo bar sentence ."
text2 = "This is a foo bar close to text 1"

vector1 = text_to_vector(text1)
vector2 = text_to_vector(text2)

epsilon_a = generate_strength(vector1, 0.5, 0.5)
epsilon_b = generate_strength(vector2, 0.5, 0.5)

a = (vector1, epsilon_a)
b = (vector2, epsilon_b)

cosine = get_cosine_normal(vector1, vector2)
enhanced_cosine = get_cosine_enhanced(a, b)

print("Normal Cosine:", cosine)
print("Enhanced Cosine:", enhanced_cosine)