A project from the past: vector store in the browser

March 30, 2024

Like every other story, this one starts with curiosity: I wondered whether it was possible to build a vector store in the browser. I had no idea what I was getting myself into, but I was excited to find out. The question wasn't born in isolation; it was partly inspired by Paul Kinlan's musings on the potential of browser-based vector databases for handling complex data structures, akin to services like Polymath and Pinecone but running locally, in the user's own browser.

Understanding Vector Stores

In simple terms, a vector store is like a special shelf in your digital library, designed to hold and manage vectors. Imagine a vector as an arrow pointing in a specific direction with a certain strength (its magnitude); mathematically, it's just a list of numbers. This project turns your browser into a librarian that can efficiently store, find, and organize these arrows based on their properties.
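
To make the "arrow" picture concrete, here is a minimal TypeScript sketch of a vector as a list of numbers, along with its strength (magnitude) and the cosine similarity between two vectors. It is an illustration only, not the project's actual code, and the names `Vector`, `dot`, `magnitude`, and `cosineSimilarity` are my own.

```typescript
// A vector is just an ordered list of numbers.
type Vector = number[];

// Dot product: the sum of pairwise products of two equal-length vectors.
function dot(a: Vector, b: Vector): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += a[i] * b[i];
  }
  return sum;
}

// Magnitude: the length ("strength") of the arrow.
function magnitude(v: Vector): number {
  return Math.sqrt(dot(v, v));
}

// Cosine similarity: how closely two arrows point in the same direction,
// ignoring their lengths. Returns 0 if either vector has zero length.
function cosineSimilarity(a: Vector, b: Vector): number {
  const denom = magnitude(a) * magnitude(b);
  return denom === 0 ? 0 : dot(a, b) / denom;
}
```

For example, `magnitude([3, 4])` is 5, and two perpendicular arrows such as `[1, 0]` and `[0, 1]` have a cosine similarity of 0.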

Why the Browser?

The idea to plant a vector store right in the user’s web browser intrigued me for a few reasons. It promised a hands-on learning experience, a chance to wrestle with new technologies, and, importantly, the opportunity to create something genuinely useful.

How It Works

At the heart of this project is IndexedDB, an API built into browsers for storing structured data such as our vectors. This choice keeps all data on your device: queries need no round trip to a server, which means better performance and more privacy for you.
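
As a rough idea of what storing vectors in IndexedDB can look like, here is a minimal browser-side sketch in TypeScript. It is not the project's actual schema: the database name, the store name, the `VectorRecord` shape, and the helper functions are all hypothetical, and storing a precomputed magnitude alongside each vector is an assumption made for illustration.

```typescript
// Hypothetical names: chosen for this sketch, not taken from the project.
const DB_NAME = "vector-store-demo";
const STORE_NAME = "vectors";

interface VectorRecord {
  id?: number; // assigned by IndexedDB (autoIncrement) when the record is added
  vector: number[];
  magnitude: number; // precomputed once so it can be indexed and filtered on
}

// Build the record to persist; pure function, no browser APIs involved.
function toRecord(vector: number[]): VectorRecord {
  const magnitude = Math.sqrt(vector.reduce((sum, x) => sum + x * x, 0));
  return { vector, magnitude };
}

// Open (or create) the database and its object store.
function openDatabase(): Promise<IDBDatabase> {
  return new Promise((resolve, reject) => {
    const request = indexedDB.open(DB_NAME, 1);
    request.onupgradeneeded = () => {
      const store = request.result.createObjectStore(STORE_NAME, {
        keyPath: "id",
        autoIncrement: true,
      });
      store.createIndex("magnitude", "magnitude");
    };
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

// Add one vector; resolves with the generated key once the transaction completes.
function addVector(db: IDBDatabase, vector: number[]): Promise<IDBValidKey> {
  return new Promise((resolve, reject) => {
    const tx = db.transaction(STORE_NAME, "readwrite");
    const request = tx.objectStore(STORE_NAME).add(toRecord(vector));
    tx.oncomplete = () => resolve(request.result);
    tx.onerror = () => reject(tx.error);
  });
}

// Read every stored record back.
function getAllVectors(db: IDBDatabase): Promise<VectorRecord[]> {
  return new Promise((resolve, reject) => {
    const request = db
      .transaction(STORE_NAME, "readonly")
      .objectStore(STORE_NAME)
      .getAll();
    request.onsuccess = () => resolve(request.result as VectorRecord[]);
    request.onerror = () => reject(request.error);
  });
}
```

In a browser, usage would look something like `const db = await openDatabase(); await addVector(db, [0.1, 0.2, 0.3]);`. Everything stays in the user's own IndexedDB, so no server is involved.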

I introduced two strategies for narrowing down the search: Locality-Sensitive Hashing (LSH) and magnitude filtering. LSH helps find similar vectors quickly by grouping them together, while magnitude filtering uses each vector's "strength" to keep only the vectors that meet our criteria. In both cases, the remaining candidates are ranked by cosine similarity using a max heap.
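
The sketch below shows one way these pieces can fit together in TypeScript. It rests on several assumptions of mine rather than on the project's code: the LSH variant is sign-based (random hyperplane) hashing, the magnitude criterion is "within a tolerance of the query's magnitude", and a plain sort stands in for the max heap to keep the example short. All function names are hypothetical.

```typescript
// Sign-based LSH: one bit per hyperplane, "1" when the vector lies on the
// non-negative side. Vectors pointing in similar directions tend to share
// a signature and so land in the same bucket.
function lshSignature(v: number[], hyperplanes: number[][]): string {
  return hyperplanes
    .map((h) => (h.reduce((sum, x, i) => sum + x * v[i], 0) >= 0 ? "1" : "0"))
    .join("");
}

function vectorNorm(v: number[]): number {
  return Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
}

function cosine(a: number[], b: number[]): number {
  const denom = vectorNorm(a) * vectorNorm(b);
  if (denom === 0) return 0;
  return a.reduce((sum, x, i) => sum + x * b[i], 0) / denom;
}

interface Hit {
  vector: number[];
  similarity: number;
}

// Strategy 1: keep only vectors that share the query's LSH bucket.
function lshCandidates(
  query: number[],
  vectors: number[][],
  hyperplanes: number[][],
): number[][] {
  const target = lshSignature(query, hyperplanes);
  return vectors.filter((v) => lshSignature(v, hyperplanes) === target);
}

// Strategy 2 (assumed criterion): keep only vectors whose magnitude is
// within `tolerance` of the query's magnitude.
function magnitudeCandidates(
  query: number[],
  vectors: number[][],
  tolerance: number,
): number[][] {
  const target = vectorNorm(query);
  return vectors.filter((v) => Math.abs(vectorNorm(v) - target) <= tolerance);
}

// Rank candidates by cosine similarity, best first, and keep the top k.
// A sort is used here in place of a max heap; the resulting order is the same.
function topK(query: number[], candidates: number[][], k: number): Hit[] {
  return candidates
    .map((vector) => ({ vector, similarity: cosine(query, vector) }))
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, k);
}
```

Either filter shrinks the set of vectors that need a full cosine comparison, which is where the speedup comes from. The trade-off is that a similar vector that falls just outside the bucket or the tolerance is missed, so the result is approximate rather than exact.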

Looking Ahead

To make this more accessible, I've also developed a React hook, use-vector-store. It simplifies interactions with the vector store, hiding the complexities of IndexedDB behind a straightforward API. This way, developers can easily store, retrieve, and manage vectors in their projects.

To see this vector store in action, check out a demo here. For those interested in the nuts and bolts, the source code is up for grabs on GitHub.