Getting, storing, and using LLM embeddings in a .NET App using sqlite

Oh no! Not yet another breathlessly gushing post about AI and LLMs … That’s right, this is not another post like that.

In this post I’m going to share how I generated embeddings for text, store those embeddings in sqlite, and then search them directly in plain simple .NET.

I’m going to show a complete working example. If you just want to see the code it is below.

This is the result … first I generate and store embeddings for a series of strings, and then I search those for a term:

 Generating embedding for Gold Discovered in California!
1 Generating embedding for The Great Fire of London Bridge
2 Generating embedding for First Steam-Powered Train Debuts
3 Generating embedding for The Telegraph Revolutionizes Communication
4 Generating embedding for The First Telephone Call by Alexander Graham Bell
5 Generating embedding for The Irish Potato Famine Strikes
6 Generating embedding for Darwin Publishes 'On the Origin of Species'
7 Generating embedding for The American Civil War Begins
8 Generating embedding for Eiffel Tower Construction Announced
9 Generating embedding for The Light Bulb Invented by Edison
Searching for "disaster" ...
Found: The Irish Potato Famine Strikes
Found: The Great Fire of London Bridge
Found: The Light Bulb Invented by Edison

I’m not claiming perfection - the first two results look good, but I’d rank “The American Civil War Begins” before “The Light Bulb Invented by Edison” … but maybe the machine knows something I don’t…

If you don’t know what an embedding is, then there are better people than me that can explain it … but what it comes down to is a mapping from text to an array of floats. You can store the embeddings in a sqlite vector database, and then search them by generating a vector for the search text, and finding embeddings that are close (cosine similarity: there’s a term you can use to impress people at parties).

First I’ll install the LLM that can be used to generate the embeddings, next I’ll set up sqlite to add a vector database extension, then I’ll generate and store a set of embeddings, and finally I’ll search.

But first, this is the high level code - we’ll fill in the functions below:

// Newspaper headlines from the 1800s
var headlines = new List<string>
{
    "Gold Discovered in California!",
    "The Great Fire of London Bridge",
    "First Steam-Powered Train Debuts",
    "The Telegraph Revolutionizes Communication",
    "The First Telephone Call by Alexander Graham Bell",
    "The Irish Potato Famine Strikes",
    "Darwin Publishes 'On the Origin of Species'",
    "The American Civil War Begins",
    "Eiffel Tower Construction Announced",
    "The Light Bulb Invented by Edison"
};

// Create and insert models
for (int i = 0; i < headlines.Count; i++)
{
  var text = headlines[i];
  Console.WriteLine($"{i} Generating embedding for {text}");
  var model = new Model { Title = text };
  await connection.InsertAsync(model);
  var embedding = await GenerateEmbedding(text);
  await AddEmbedding(connection, model, embedding);
}

// Now search
var searchTerm = "disaster";
Console.WriteLine($"Searching for \"{searchTerm}\" ...");
var searchEmbedding = await GenerateEmbedding(searchTerm);
var results = await SearchForEmbedding(connection, searchEmbedding, k: 3);
foreach (var result in results)
{
  Console.WriteLine($"Found: {result.Title}");
}

Installing the LLM

I like to use ollama - its available for both macOS and Windows. Once installed, you can then install models and access them from the command line, or via a local REST http service. Because they are installed and run locally, its a handy thing to have in general, since they work even when there is no network.

If you browse the models that ollama has available at https://ollama.com/search you’ll see there are three kinds: Embedding, Vision and Tools.

Depending on your GPU and memory I recommend installing one of the tools models, which you can use just like ChatGPT, except that it runs locally:

$ ollama run llama3.2 "Why is the sky blue"
The sky appears blue because of a phenomenon called Rayleigh scattering, named after the British physicist Lord...

However for this example we need an Embedding model. I used nomic-embed-text which I installed (once I’d first installed ollama) using:

ollama pull nomic-embed-text

Check that it is installed using their command line example

curl http://localhost:11434/api/embeddings -d '{
  "model": "nomic-embed-text",
  "prompt": "The sky is blue because of Rayleigh scattering"
}'

This will return an array of 768 floats:

{"embedding":[0.5898658037185669,0.4017767012119293,-3.305577278137207,...-0.1482105553150177,-0.5522851347923279,0.46912315487861633]}

I know there are 768 of them because that is the nomic-bert.embedding_length in the model description. This tells us how large to make the vector database in our code later.

Setting up sqlite to add a vector database extension

sqlite does not by itself support a vector database, so I’m going to add an extension to sqlite to support vectors: sqlite-vec

I’ve already covered how to load an extension in sqlite in a previous post.

If you go to the sqlite-vec releases you can download a dynamically loadable library (a DLL on Windows, a dylib on macOS etc), which you can then load into your existing sqlite-net session:

var databasePath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData), "Demo.db");

var sqliteFlags = SQLiteOpenFlags.ReadWrite | SQLiteOpenFlags.Create;
var connection = new SQLiteAsyncConnection(databasePath, sqliteFlags);

var extension = GetExtensionName();

await LoadExtension(connection, extension);
...

[DllImport("e_sqlite3", EntryPoint = "sqlite3_load_extension", CallingConvention = CallingConvention.Cdecl)]
public static extern Result LoadExtension(SafeHandle db, [MarshalAs(UnmanagedType.LPStr)] string filename, int entry, int msg);
private async Task LoadExtension(SQLiteAsyncConnection connection, string extension)
{
  await connection.EnableLoadExtensionAsync(true);

  var connectionWithLock = connection.GetConnection();
  using var theLock = connectionWithLock.Lock();
  var handle = connectionWithLock.Handle;
  var result = LoadExtension(handle, extension, 0, 0);
  if (result != Result.OK)
  {
    throw new Exception("Failed to load extension: " + result);
  }
}

For more explanation please see this post

Setting up the tables

We are going to have a standard sqlite table called Model that has some text in it - this is what we’ll return when we search using the embeddings:

class Model
{
  [PrimaryKey, AutoIncrement]
  public int Id { get; set; }
  public string Title { get; set; } = "";
}

We’ll want to create a table for both this model table, and the vector database we’ll be storing vectors in, and searching:

const int EmbeddingLength = 768;
...

await connection.CreateTableAsync<Model>();
await connection.ExecuteAsync($"CREATE VIRTUAL TABLE IF NOT EXISTS vec_items USING vec0(embedding float[{EmbeddingLength}]);");

The EmbeddingLength is the size of the array of floats that nomic returns, as defined in nomic-bert.embedding_length above.

Generating the embeddings (array of floats) for text

We’ll need a C# method that basically does the same thing as the curl command above:

public class EmbeddingResponse
{
  [JsonPropertyName("embedding")]
  public double[] Embedding { get; set; } = [];
}

private async Task<double[]> GenerateEmbedding(string text)
{
  var client = new HttpClient();
  var requestBody = new
  {
    model = "nomic-embed-text",
    prompt = text
  };

  var json = JsonSerializer.Serialize(requestBody);
  var content = new StringContent(json, Encoding.UTF8, "application/json");

  var response = await client.PostAsync("http://localhost:11434/api/embeddings", content);
  var responseStream = await response.Content.ReadAsStreamAsync();

  var result = await JsonSerializer.DeserializeAsync<EmbeddingResponse>(responseStream);
  return result?.Embedding ?? [];
}

Storing the embeddings (array of floats)

What’s a bit confusing about the sqlite-vec extension is that you need to store the array of floats as a json array:

 private async Task AddEmbedding(SQLiteAsyncConnection connection,  Model model, double[] embedding)
 {
   var sql = "INSERT INTO vec_items (rowid, embedding) VALUES (?, ?)";
   var json = JsonSerializer.Serialize(embedding);
   var parameters = new object[] { model.Id, json };
   await connection.ExecuteAsync(sql, parameters);
 }

I’m associating the vector with the id of the model object, so that I can map search results back to the model.

Searching the embeddings

To search for text, say “apple”, you first convert it to an embedding, just like you did for the text to be searched, and then try to match the embedding for “apple” with all the existing embeddings:

  private async Task<List<Model>> SearchForEmbedding(SQLiteAsyncConnection connection, double[] embedding, int k)
  {
    var json = JsonSerializer.Serialize(embedding);
    var parameters = new object[] { json, k };
    var sql = $"SELECT Model.* FROM vec_items LEFT JOIN Model ON Model.Id = vec_items.rowid WHERE embedding MATCH ? AND k=? order by distance";
    var result = await connection.QueryAsync<Model>(sql, parameters);
    return result;
  }

k is the number of results to return. I first tried using the limit clause but that didn’t work.

Putting it all together

Here is the complete program. I’ve also put it on GitHub here

using System.Runtime.InteropServices;
using System.Text;
using System.Text.Json.Serialization;
using System.Text.Json;
using SQLite;
using static SQLite.SQLite3;

namespace EmbeddingsExample.SqliteExtensions;

internal class Demo
{
  // nomic-bert.embedding_length from https://ollama.com/library/nomic-embed-text/blobs/970aa74c0a90
  const int EmbeddingLength = 768;

  const string LibraryPath = "e_sqlite3";


  internal async Task InitializeDatabase()
  {
    var databasePath = Path.Combine(Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData), "Demo.db");

    if(File.Exists(databasePath))
    {
      File.Delete(databasePath);
    }

    // open the database in read/write mode and create it if it doesn't exist
    var sqliteFlags = SQLiteOpenFlags.ReadWrite | SQLiteOpenFlags.Create;
    var connection = new SQLiteAsyncConnection(databasePath, sqliteFlags);

    var extension = GetExtensionName();

    await LoadExtension(connection, extension);

    await connection.CreateTableAsync<Model>();

    await connection.ExecuteAsync($"CREATE VIRTUAL TABLE IF NOT EXISTS vec_items USING vec0(embedding float[{EmbeddingLength}]);");


    // Newspaper headlines from the 1800s
    var headlines = new List<string>
    {
        "Gold Discovered in California!",
        "The Great Fire of London Bridge",
        "First Steam-Powered Train Debuts",
        "The Telegraph Revolutionizes Communication",
        "The First Telephone Call by Alexander Graham Bell",
        "The Irish Potato Famine Strikes",
        "Darwin Publishes 'On the Origin of Species'",
        "The American Civil War Begins",
        "Eiffel Tower Construction Announced",
        "The Light Bulb Invented by Edison"
    };


    // Create and insert models
    for (int i = 0; i < headlines.Count; i++)
    {
      var text = headlines[i];
      Console.WriteLine($"{i} Generating embedding for {text}");
      var model = new Model { Title = text };
      await connection.InsertAsync(model);
      var embedding = await GenerateEmbedding(text);
      await AddEmbedding(connection, model, embedding);
    }

    // Now search
    var searchTerm = "disaster";
    Console.WriteLine($"Searching for \"{searchTerm}\" ...");
    var searchEmbedding = await GenerateEmbedding(searchTerm);
    var results = await SearchForEmbedding(connection, searchEmbedding, k: 3);
    foreach (var result in results)
    {
      Console.WriteLine($"Found: {result.Title}");
    }
  }

  private string GetExtensionName()
  {
    // Could also look at RuntimeInformation.ProcessArchitecture;
    if (RuntimeInformation.IsOSPlatform(OSPlatform.Windows))
    {
      return "SqliteExtensions/winx86/vec0";
    }
    else if (RuntimeInformation.IsOSPlatform(OSPlatform.OSX))
    {
      return "SqliteExtensions/macosarm/vec0";
    }
    else
    {
      throw new NotSupportedException("Unsupported platform");
    }
  }


  [DllImport(LibraryPath, EntryPoint = "sqlite3_load_extension", CallingConvention = CallingConvention.Cdecl)]
  public static extern Result LoadExtension(SafeHandle db, [MarshalAs(UnmanagedType.LPStr)] string filename, int entry, int msg);
  private async Task LoadExtension(SQLiteAsyncConnection connection, string extension)
  {
    await connection.EnableLoadExtensionAsync(true);

    var connectionWithLock = connection.GetConnection();
    using var theLock = connectionWithLock.Lock();
    var handle = connectionWithLock.Handle;
    var result = LoadExtension(handle, extension, 0, 0);
    if (result != Result.OK)
    {
      throw new Exception("Failed to load extension: " + result);
    }
  }

  private async Task<double[]> GenerateEmbedding(string text)
  {
    var client = new HttpClient();
    var requestBody = new
    {
      model = "nomic-embed-text",
      prompt = text
    };

    var json = JsonSerializer.Serialize(requestBody);
    var content = new StringContent(json, Encoding.UTF8, "application/json");

    var response = await client.PostAsync("http://localhost:11434/api/embeddings", content);
    var responseStream = await response.Content.ReadAsStreamAsync();

    var result = await JsonSerializer.DeserializeAsync<EmbeddingResponse>(responseStream);
    return result?.Embedding ?? [];
  }

  private async Task AddEmbedding(SQLiteAsyncConnection connection,  Model model, double[] embedding)
  {
    var sql = "INSERT INTO vec_items (rowid, embedding) VALUES (?, ?)";
    var json = JsonSerializer.Serialize(embedding);
    var parameters = new object[] { model.Id, json };
    await connection.ExecuteAsync(sql, parameters);
  }

  private async Task<List<Model>> SearchForEmbedding(SQLiteAsyncConnection connection, double[] embedding, int k)
  {
    var json = JsonSerializer.Serialize(embedding);
    var sql = $"SELECT Model.* FROM vec_items LEFT JOIN Model ON Model.Id = vec_items.rowid  WHERE embedding MATCH ? AND k=? order by distance";
    var parameters = new object[] { json, k };
    var result = await connection.QueryAsync<Model>(sql, parameters);
    return result;
  }

  public class EmbeddingResponse
  {
    [JsonPropertyName("embedding")]
    public double[] Embedding { get; set; } = [];
  }




  class Model
  {
    [PrimaryKey, AutoIncrement]
    public int Id { get; set; }

    public string Title { get; set; } = "";

  }
}

What’s the point?

What’s the point … why not just do a normal search? Well for me the advantage is that I can search for concepts … note that “disaster” isn’t in any of the headlines being searched for, but conceptually it is related to the first two results.

There are libraries such as Microsoft’s SQLIte Vector Store connector and I recommend taking a look. I imagine that sqlite-vec will soon support .NET out of the box.

That said, in the same way that it is important that you understand the fundamentals before letting a LLM speed up your coding, sometimes its good to take a look under the hood and understand things at a lower level. LLMs are not going away, and even if we are being saturated with articles about them, there is something there - best get aboard the train!.

Kind comments and suggestions are welcome