Vector search and AI rush

It’s all over the internet. With AI and Generative AI taking over most of the tech news, “vector search” is hard to ignore.

tl;dr - Vector search and the AI connection

Much of the noise around AI and Generative AI applications comes from the demand these apps are creating: users need to search through huge volumes of data. To put it in simple terms, Generative AI models use neural networks to identify the patterns and structures within existing data to generate new and original content.

So more and more people are using Generative AI, which is more or less “searching through tons of content and getting appropriate results back, and fast.” If you have used ChatGPT, Google Bard, or GitHub’s Copilot, all of these give you answers/generate responses based on your questions/requests.

All of this boils down to “search within content and get results back”, aka “search”.

This post covers vector search, how it works (briefly), and some platforms that support it.

What is vector search?

Vector search leverages machine learning (ML) to capture the meaning and context of unstructured data, including text and images, transforming it into a numeric representation. Frequently used for semantic search, vector search finds similar data using approximate nearest neighbor (ANN) algorithms. Compared to traditional keyword search, vector search yields more relevant results and executes faster - Source Elasticsearch.

..wait, that’s a lot. Vectors? Semantic search? Nearest neighbor algorithms?

Here’s a simpler version - vector search is a type of search that uses machine learning to find objects that are similar to a given query. It works by converting the query and the objects into vectors, which are then compared to each other. The objects most similar to the query are returned as results.
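That flow - vectorize, compare, return the closest matches - can be sketched as a brute-force search. The tiny 2-D “embeddings” below are hand-made for illustration (real systems learn them with ML models):

```python
import math

# Toy 2-D "embeddings" (hand-made for illustration; real ones are learned).
objects = {
    "puppy": [0.9, 0.8],
    "dog":   [1.0, 0.9],
    "car":   [0.1, 0.2],
}
query_vector = [0.85, 0.75]  # pretend the same model vectorized the query "kitten"

# Compare the query vector to every object vector and rank by distance
# (smaller Euclidean distance = more similar).
ranked = sorted(objects, key=lambda name: math.dist(objects[name], query_vector))
print(ranked[0])  # the most similar object
```

Real engines avoid comparing against every object like this; they use approximate nearest neighbor (ANN) indexes to keep the search fast at scale.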

What is a vector?

In computer science, a vector is a data structure that stores a sequence of elements of the same data type. Vectors are similar to arrays, but with some differences:

  • Random access - the elements in a vector can be accessed by their index, making them efficient for operations like searching and sorting.

  • Dynamic size - while arrays have a fixed size, vectors can grow and shrink as needed, making them more flexible and efficient.

  • Vector operations - vectors support many operations that are optimized for their data structure, like addition, subtraction, multiplication, and division.
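A minimal sketch of the three properties above, using Python’s built-in list as the “vector” (an assumption for demonstration; in other languages this would be C++’s std::vector or Java’s ArrayList):

```python
v = [10, 20, 30]

# Random access: elements are addressed by index in O(1).
assert v[1] == 20

# Dynamic size: the vector grows and shrinks as needed.
v.append(40)  # [10, 20, 30, 40]
v.pop(0)      # [20, 30, 40]
assert len(v) == 3

# Element-wise operations: e.g. adding two vectors position by position.
w = [1, 2, 3]
total = [a + b for a, b in zip(v, w)]
print(total)
```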

Why search?

A common use case in almost every software system, search enables users to find information within the system or data that the system can access.

But what is the problem? Human language is often ambiguous and fuzzy. There are synonyms (two words that mean the same thing) and polysemes (one word with multiple meanings). In English, for example, “accurate” and “precise” can sometimes be synonymous, but “accurate” can also shade into other senses like correct, true, or proper. With these ambiguities, developing efficient search is difficult.

How does a vector search engine work?

Traditional search leans on keywords, similar words, and how many times a word is repeated; vector search engines use distances in the embedding space to represent similarity.

To address this, different machine learning techniques such as spelling correction, language processing, category matching, and more are used to structure and make sense of language. Words are converted into vectors (numbers), allowing their meaning to be encoded and processed mathematically. This conversion process is called vectorization.
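A very crude form of vectorization is a bag-of-words count. Real systems use learned embeddings instead; this sketch only shows the core idea of turning text into numbers:

```python
def bag_of_words(text, vocabulary):
    """Turn a sentence into a vector of word counts over a fixed vocabulary."""
    words = text.lower().split()
    return [words.count(term) for term in vocabulary]

vocab = ["vector", "search", "fast"]
vec = bag_of_words("Vector search is fast and vector math makes it fast", vocab)
print(vec)
```

Unlike learned embeddings, a bag-of-words vector captures no meaning - “car” and “automobile” would land in completely different positions - which is exactly why ML-based vectorization matters.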

Usually vectors are used for clustering documents, identifying meaning and intent in queries, ranking results, and adding synonyms. There are also vector embeddings, which represent richer objects than single words - whole documents, audio, video, and image files.

Image by Elasticsearch

Embeddings - aka vectors for business use cases

A major challenge is creating vectors that represent different entities in a way that is meaningful and useful for business use cases. By applying pre-trained deep learning models to raw data, you can extract "embeddings" - vectors that map each row of data into a space of "meanings".

Challenge 1 - Successfully extracting useful vectors (embeddings) from your business data is the key challenge. Once you have done that, what remains is searching for similar vectors - which is itself another complex part of the process.

Challenge 2 - Building a fast and scalable vector search engine. Some of the most widely used metrics for calculating the similarity between vectors are L2 distance (Euclidean distance), cosine similarity, and inner product (dot product). These can get quite mathematical, but you can usually avoid implementing the inner workings yourself - search engines typically ship with these metrics pre-defined/implemented.
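For the curious, here is a pure-Python sketch of those three metrics. Production engines implement these for you (often heavily optimized), so this is only to demystify the math:

```python
import math

def l2_distance(a, b):
    # Euclidean (L2) distance: smaller means more similar.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dot_product(a, b):
    # Inner product: larger means more similar (for vectors of comparable magnitude).
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Cosine of the angle between the vectors: 1.0 means they point the same way.
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)

a, b = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
print(l2_distance(a, b), dot_product(a, b), cosine_similarity(a, b))
```

Note how b points in exactly the same direction as a (it is just scaled by 2), so its cosine similarity is 1.0 even though the L2 distance between them is not zero - different metrics capture different notions of "similar".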

To learn more about creating embeddings, this Google course will help you understand the fundamentals.

Use cases

Vector search is a specialized kind of search, and like many software systems, it is not recommended for everything, nor can it solve every search problem.

Few use cases:

  1. Semantic search - wait, what the heck is this? If you asked that above, please stop there. Read about Semantic Search here.

  2. Recommendations - Vector search can be used to recommend products to users based on their past purchases, interests, and browsing history.

  3. Question answering - Vector search can retrieve the passages most similar to a question, which can then be used to produce an answer.

  4. Image search - Vector search can be used to find images that are similar to a given image. This is useful for finding similar products, finding similar artwork, or finding similar landmarks.

  5. Natural language search - Vector search can be used to find documents that are similar to a given document. This is useful for finding similar news articles, finding similar research papers, or finding similar blog posts.

A selection of software products with vector search

Onward

Generative AI is exploding every single day, and as more enterprises and startups jump into the race and more applications are built, the demand for “search” will only grow. This directly relates to how fast users will be able to see “search results” from the terabytes of data these organizations hold, especially as more models like LLaMA are released.
