Introduction
Artificial intelligence has evolved at a rate few could have imagined. AI is transforming many fields and changing how things work. To support this, efficient data storage and processing have become crucial for reducing the runtime of applications that rely on semantic search and other computationally expensive operations.
Vector Databases
Vector databases were introduced in the late 2010s. Also known as vectorized databases, they are a novel type of database system that leverages vectorized data storage and processing techniques. Unlike traditional relational databases, which store data in rows and columns, vector databases store data as high-dimensional vectors. These vectors represent data points in a multi-dimensional space, where each dimension corresponds to a particular attribute or feature of the data. Depending on the complexity and granularity of the data, the number of dimensions can be tuned to minimize feature loss while still representing the data efficiently. The vectors are typically generated by applying some kind of transformation, such as embeddings produced by ML models or other algorithms.
The main advantage of a vector database is that it allows fast and accurate similarity search and retrieval of data based on vector distance or similarity. Instead of querying the database with exact matches or predefined criteria, you can use a vector database to find the most similar or relevant data based on semantic or contextual meaning.
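To make the idea of similarity search concrete, here is a minimal sketch in TypeScript. It assumes cosine similarity as the measure and uses a brute-force scan over three made-up, three-dimensional vectors; the names `StoredItem` and `mostSimilar` are hypothetical. Real vector databases typically rely on approximate nearest-neighbor indexes rather than comparing the query against every stored vector.

```typescript
// Cosine similarity between two equal-length, non-zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error('Cosine similarity is undefined for a zero vector');
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical shape of a stored object: an id plus its embedding vector.
interface StoredItem {
  id: string;
  vector: number[];
}

// Brute-force search: rank every item by similarity to the query, keep the top k.
function mostSimilar(query: number[], items: StoredItem[], k: number): StoredItem[] {
  return [...items]
    .sort((x, y) => cosineSimilarity(query, y.vector) - cosineSimilarity(query, x.vector))
    .slice(0, k);
}

// Made-up example data, for illustration only.
const items: StoredItem[] = [
  { id: 'cat', vector: [0.9, 0.1, 0.0] },
  { id: 'dog', vector: [0.8, 0.2, 0.1] },
  { id: 'car', vector: [0.0, 0.1, 0.9] },
];

const nearest = mostSimilar([1, 0, 0], items, 2);
console.log(nearest.map((item) => item.id)); // [ 'cat', 'dog' ]
```

The query vector points in almost the same direction as 'cat' and 'dog', so those two come back first, even though no stored vector matches the query exactly.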
Weaviate
Weaviate is an open-source vector database. It stores data objects together with their vector representations, which enables lightning-fast search for an object or for semantically similar objects.
Some features that Weaviate boasts:
- Weaviate allows you to store and retrieve data objects based on their semantic properties by indexing them with vectors.
- Weaviate can be used stand-alone (that is, you bring your own vectors) or with a variety of modules that do the vectorization for you and extend the core capabilities.
- Weaviate has a GraphQL API for accessing your data easily. This keeps retrieval fast and efficient, as you query only what you want.
- Weaviate is fast (check out their open-source benchmarks).
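As a rough illustration of the GraphQL point above, the sketch below builds a Weaviate `Get` query that asks for the objects nearest to a query vector and requests only selected properties. The helper `buildNearVectorQuery`, the class name `Article`, and the properties `title` and `summary` are assumptions made for this example, not part of any real schema; the resulting string would typically be sent to the instance's GraphQL endpoint.

```typescript
// Builds a Weaviate GraphQL "Get" query string using the nearVector operator.
// The class name and property names passed in are hypothetical examples.
function buildNearVectorQuery(
  className: string,
  properties: string[],
  vector: number[],
  limit: number,
): string {
  return `{
  Get {
    ${className}(nearVector: { vector: ${JSON.stringify(vector)} }, limit: ${limit}) {
      ${properties.join('\n      ')}
      _additional { distance }
    }
  }
}`;
}

// Ask for the three closest "Article" objects, returning only two properties
// plus the distance between each result and the query vector.
const query = buildNearVectorQuery('Article', ['title', 'summary'], [0.1, 0.2, 0.3], 3);
console.log(query);
```

Because the query names the properties it needs, the response carries only those fields, which is what keeps retrieval lean.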
Weaviate is very easy to use, thanks to its comprehensive documentation and ready-to-use functions.
For example:
// TypeScript/JavaScript client library for Weaviate
import weaviate from 'weaviate-ts-client';

// Connect to a Weaviate instance running locally
const client = weaviate.client({
  scheme: 'http',
  host: 'localhost:8080',
});

// Fetch the full schema
const response = await client.schema
  .getter()
  .do();
This piece of code fetches the schemas of all the classes in the database. To fetch the schema of just one particular class, only a few changes are needed:
// TypeScript/JavaScript client library for Weaviate
import weaviate from 'weaviate-ts-client';

// Name of the class whose schema we want
const classname = 'your-classname';

const client = weaviate.client({
  scheme: 'http',
  host: 'localhost:8080',
});

// Fetch the schema of that class only
const response = await client.schema
  .classGetter()
  .withClassName(classname)
  .do();
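Once a schema has been fetched, it usually helps to condense it into something readable. The sketch below is an assumption-heavy illustration: it assumes the full-schema response has a `classes` array whose entries carry a `class` name and a `properties` list with `name` and `dataType` fields, and it runs on made-up sample data rather than a live response. The interfaces and the `summarizeSchema` helper are hypothetical and not part of the client library.

```typescript
// Assumed (simplified) shape of a schema response; not the client's own types.
interface PropertySketch {
  name: string;
  dataType: string[];
}
interface ClassSketch {
  class: string;
  properties?: PropertySketch[];
}
interface SchemaSketch {
  classes?: ClassSketch[];
}

// Produce one line per class: "ClassName (prop: type, ...)".
function summarizeSchema(schema: SchemaSketch): string[] {
  return (schema.classes ?? []).map((c) => {
    const props = (c.properties ?? []).map((p) => `${p.name}: ${p.dataType.join('|')}`);
    return `${c.class} (${props.join(', ')})`;
  });
}

// Made-up sample standing in for a real response.
const sampleSchema: SchemaSketch = {
  classes: [
    {
      class: 'Article',
      properties: [
        { name: 'title', dataType: ['text'] },
        { name: 'wordCount', dataType: ['int'] },
      ],
    },
    { class: 'Author' },
  ],
};

const summary = summarizeSchema(sampleSchema);
console.log(summary); // [ 'Article (title: text, wordCount: int)', 'Author ()' ]
```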
Weaviate also offers its cloud service to new users for free on a 14-day trial. You can use the built-in sandbox query tool, or get an API key and use it locally in third-party applications. As Weaviate is an open-source tool, its maintainers welcome anyone interested in contributing to the codebase. Be sure to check out this wonderful tool in your projects!