Sean Pixel

Enhanced Retrieval

Memories in our brains are not stable; they are reconsolidated and shift in importance as they are retrieved.

What you had for lunch 4 days ago might be hard to recall right now.

But, if you just had a discussion with a friend about that lunch, you are much more likely to remember it. It might even pop up in your head while you are thinking of something related.

Current RAG methods, especially in the AI agent space, try to mimic this process of retrieving related information through algorithms such as cosine similarity and Euclidean distance (good explanation here).

However, once encoded by an embedding system, these embeddings are static and stored in some sort of vector database (I personally like Pinecone). This is not what happens in our brains; memories shift in importance based on strength and recency.

What if we correct this through a simple change?

Proposal

Let’s add some value ε to each embedding vector that represents the strength of the memory. We will incorporate this value into a function such as cosine similarity to boost certain memories in priority.

In order to control ε, we will calculate it using the mean and standard deviation of vector A. Here is how to generate this value in Python:

from numpy import mean, std

def generate_strength(a, alpha=0.5, beta=0.5):
    # ε = α * (mean(A) + β * std(A))
    epsilon = alpha * (mean(a) + (beta * std(a)))

    return epsilon

Here, the alpha and beta values are just dials to control how strong the bias should be: alpha scales the whole strength value, and beta controls the influence of the standard deviation on it.
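As a quick sanity check, here is how the two dials behave on a toy vector (the numbers below are my own, chosen for illustration, not from any real embedding):

```python
from numpy import mean, std

def generate_strength(a, alpha=0.5, beta=0.5):
    # ε = α * (mean(A) + β * std(A))
    return alpha * (mean(a) + (beta * std(a)))

v = [0.1, 0.3, 0.5]  # toy "embedding"; mean(v) = 0.3

# alpha scales the whole value: doubling alpha doubles ε
low = generate_strength(v, alpha=0.5, beta=0.0)   # 0.5 * 0.3 = 0.15
high = generate_strength(v, alpha=1.0, beta=0.0)  # 1.0 * 0.3 = 0.30

# beta mixes in a fraction of the standard deviation on top
with_std = generate_strength(v, alpha=1.0, beta=1.0)  # 0.3 + std(v)

print(low, high, with_std)
```

With beta at zero, ε tracks the mean alone; raising beta lets spikier vectors (higher standard deviation) earn a larger strength value.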

Normal Cosine Similarity

Now, we will use this in a cosine similarity function. This is what the cosine similarity function looks like:

$$\text{cosine similarity} = \frac{\mathbf{A} \cdot \mathbf{B}}{\|\mathbf{A}\| \|\mathbf{B}\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2}\,\sqrt{\sum_{i=1}^{n} B_i^2}}$$

Implementation in Python:

from numpy import dot
from numpy.linalg import norm

def get_cosine_normal(a, b):
    cos_sim = dot(a, b)/(norm(a)*norm(b))
    return cos_sim
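To confirm it behaves as expected (toy vectors of my own, not from the post), identical vectors should score 1 and orthogonal vectors should score 0:

```python
from numpy import dot
from numpy.linalg import norm

def get_cosine_normal(a, b):
    return dot(a, b) / (norm(a) * norm(b))

same = get_cosine_normal([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])  # identical → 1.0
ortho = get_cosine_normal([1.0, 0.0], [0.0, 1.0])           # orthogonal → 0.0
print(same, ortho)
```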

Enhanced Cosine Similarity

To this, we can then add the strength value ε that we calculated earlier:

$$\text{enhanced cosine similarity} = \frac{(\mathbf{A}+\epsilon_a) \cdot (\mathbf{B}+\epsilon_b)}{\|\mathbf{A}\| \|\mathbf{B}\|}$$

Implementation in Python:

def get_cosine(a: tuple, b: tuple):
    # each memory is a tuple: (embedding as a numpy array, strength ε)
    a_values = a[0]
    b_values = b[0]

    a_bias = a[1]
    b_bias = b[1]

    # bias both vectors by their strengths before the dot product;
    # the denominator still uses the unbiased norms
    cos_sim = dot(a_values + a_bias, b_values + b_bias)/(norm(a_values)*norm(b_values))
    return cos_sim

Notice how we now represent memories in tuples (embedding, strength).
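On toy vectors (made up here for illustration), the (embedding, strength) tuple looks like this, and a nonzero strength nudges the score upward:

```python
import numpy as np
from numpy import dot
from numpy.linalg import norm

def get_cosine(a: tuple, b: tuple):
    # each memory is a tuple: (embedding as a numpy array, strength ε)
    a_values, a_bias = a
    b_values, b_bias = b
    return dot(a_values + a_bias, b_values + b_bias) / (norm(a_values) * norm(b_values))

vec1 = np.array([0.2, 0.4, 0.4])
vec2 = np.array([0.1, 0.5, 0.3])

plain = get_cosine((vec1, 0.0), (vec2, 0.0))    # zero strength: ordinary cosine
boosted = get_cosine((vec1, 0.1), (vec2, 0.1))  # both memories strengthened
print(plain, boosted)
```

Note that because only the unbiased norms appear in the denominator, the boosted score is no longer bounded by 1; it is a retrieval-ranking score rather than a true cosine.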

Proposal Conclusion

Mathematically, higher values of ε_a should increase the cosine similarity value of vector A against any other vector (strictly speaking, whenever the entries of the other biased vector sum to a positive value), meaning it will be prioritized in embedding retrieval. This is more in accordance with how our brains record and retrieve memories; we are more likely to remember certain things than others. That’s why there are things like traumas and obsessions.
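A quick numerical check of this claim (toy vectors of my own): holding B fixed and raising ε_a strictly increases the enhanced score, since each increment adds ε_a times the sum of B's biased entries to the numerator, and that sum is positive here:

```python
import numpy as np
from numpy import dot
from numpy.linalg import norm

def get_cosine(a: tuple, b: tuple):
    a_values, a_bias = a
    b_values, b_bias = b
    return dot(a_values + a_bias, b_values + b_bias) / (norm(a_values) * norm(b_values))

A = np.array([0.3, 0.1, 0.4])
B = np.array([0.2, 0.5, 0.1])  # entries sum to a positive value

# sweep the strength of A while B stays fixed
scores = [get_cosine((A, eps), (B, 0.0)) for eps in (0.0, 0.1, 0.2, 0.3)]
print(scores)  # strictly increasing
```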

This method of enhancing retrieval techniques is also useful in the real world, outside of trying to play Frankenstein. E.g., Perplexity could strengthen certain reliable websites so they are retrieved more often than others that might lack quality (cough Reddit posts cough).

Testing

Here is a test that I did with these two strings:

text1 = "This is a foo bar sentence ."
text2 = "This is a foo bar close to text 1"

Here are the results:

[image: normal vs. enhanced cosine similarity scores for the two strings]

You can try it yourself (use your own key!):

import openai
import numpy as np
from numpy import dot, mean, std
from numpy.linalg import norm

client = openai.OpenAI(api_key='key-here')

def get_cosine_normal(a, b):
    cos_sim = dot(a, b)/(norm(a)*norm(b))
    return cos_sim

def get_cosine_enhanced(a: tuple, b: tuple):
    a_values = a[0]
    b_values = b[0]

    a_bias = a[1]
    b_bias = b[1]

    cos_sim = dot(a_values + a_bias, b_values + b_bias)/(norm(a_values)*norm(b_values))
    return cos_sim

def generate_strength(a, alpha=0.5, beta=0.5):
    # ε = α * (mean(A) + std(A))
    epsilon = alpha * (mean(a) + (beta * std(a)))

    return epsilon

def text_to_vector(text):
    response = client.embeddings.create(
        input=text,
        model="text-embedding-3-small"
    )

    return response.data[0].embedding

text1 = "This is a foo bar sentence ."
text2 = "This is a foo bar close to text 1"

# convert to numpy arrays so the scalar strength can be added element-wise
vector1 = np.array(text_to_vector(text1))
vector2 = np.array(text_to_vector(text2))

epsilon_a = generate_strength(vector1, 0.5, 0.5)
epsilon_b = generate_strength(vector2, 0.5, 0.5)

a = (vector1, epsilon_a)
b = (vector2, epsilon_b)

cosine = get_cosine_normal(vector1, vector2)
enhanced_cosine = get_cosine_enhanced(a, b)

print("Normal Cosine:", cosine)
print("Enhanced Cosine:", enhanced_cosine)