Featured Vector Tech Topic: choosing a database for generative AI applications
The article “Key considerations when choosing a database for your generative AI applications” lists the following criteria for choosing a database for generative AI applications:
Familiarity: Choosing a database that your team is already familiar with can significantly reduce the learning curve and speed up the development process. This familiarity ensures that you can leverage existing skills and knowledge, leading to more efficient and effective implementation.
Ease of Implementation: The database should integrate seamlessly with your existing infrastructure, focusing on aspects such as vectorization, management, access control, compliance, and user interface. This ease of implementation helps in minimizing disruptions and facilitates a smoother transition and deployment process.
Scalability: It’s essential to select a database that can handle high-dimensional vectors and support substantial data growth. Scalability ensures that your system can expand and accommodate increasing data loads without compromising performance, which is crucial for the evolving needs of generative AI applications.
Performance: Evaluate the following metrics that cover Machine Learning performance and database performance:
- Throughput: Number of queries processed per second.
- Recall: Relevance and completeness of the retrieved vectors, i.e., the share of truly relevant vectors that a query actually returns; this determines how accurate the responses are.
- Index build time: Duration required to build the vector index.
- Scale/cost: Ability to efficiently scale to billions of vectors while remaining cost-effective.
- P99 latency: Maximum latency for 99% of requests, meeting response time expectations.
- Storage utilized: How efficiently storage is used, which is particularly important for high-dimensional vectors.
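To make three of these metrics concrete, here is a minimal Python sketch of how throughput, recall, and P99 latency could be computed from your own benchmark measurements. The function names and the nearest-rank percentile method are my assumptions for illustration; they are not taken from the article or from any specific benchmark tool.

```python
def recall_at_k(retrieved_ids, true_neighbor_ids, k):
    """Share of the true top-k neighbors that appear in the first k results."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(true_neighbor_ids[:k])
    if not truth:
        return 0.0
    found = set(retrieved_ids[:k]) & truth
    return len(found) / len(truth)


def p99_latency(latencies_ms):
    """Nearest-rank 99th percentile: 99% of requests are at or below this value."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    n = len(ordered)
    rank = (99 * n + 99) // 100  # integer ceil(0.99 * n)
    return ordered[rank - 1]


def throughput_qps(num_queries, elapsed_seconds):
    """Queries processed per second over a measured time window."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_queries / elapsed_seconds
```

The ground truth for recall (the true nearest neighbors) usually has to come from an exact, brute-force search over the same data, which is why recall is more expensive to measure than latency or throughput.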
The article covers several vector stores that are available within AWS. Familiar relational databases are, e.g., Amazon Aurora PostgreSQL or Amazon Relational Database Service (RDS) for PostgreSQL. Both databases can use the PostgreSQL pgvector extension. NoSQL databases have also been enriched with vector functionality, such as Amazon Neptune Analytics (graph model), Amazon DocumentDB (document model), Amazon MemoryDB (key-value model), or Amazon OpenSearch (search model).
My take
Selecting familiar technologies is always preferable: it extends existing know-how, integrates into existing infrastructure (monitoring, logging, backup, etc.), and fosters standardization and especially professionalization. Many well-known RDBMS and NoSQL databases have already added vector functionality (PostgreSQL, Oracle, etc.). More vendors will probably jump on the bandwagon soon.
The article also mentions performance, which has many facets. It ranges from machine learning performance, such as recall, to database performance, such as throughput or latency. KPIs such as recall and latency influence each other, so it is necessary to understand their interplay. It is therefore not sufficient to improve query response time by creating indexes, as in classical database development. Those (B-tree) indexes can improve performance but don’t change the result. It’s totally different with vector indexes: queries using vector indexes (IVF, HNSW, etc.) are not exact but approximate, so those vector indexes can change the result. That is why it is necessary to understand the business, too: it determines how much approximation is acceptable.
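The difference between an exact search and an approximate, index-based search can be shown with a toy example. The following is a deliberately simplified IVF-style sketch in plain Python: the class name `ToyIVF`, the hand-picked centroids, and the sample data are all assumptions for illustration. Real IVF implementations train the centroids (typically with k-means) and are far more elaborate.

```python
import math

# Hypothetical sample data: four 2-D points, two hand-picked cluster centroids.
POINTS = [(1.0, 0.0), (3.0, 0.0), (6.0, 0.0), (9.0, 0.0)]
CENTROIDS = [(0.0, 0.0), (10.0, 0.0)]
QUERY = (4.9, 0.0)


def exact_search(points, query, k):
    """Brute force: compare the query with every point (always exact)."""
    order = sorted(range(len(points)), key=lambda i: math.dist(points[i], query))
    return order[:k]


class ToyIVF:
    """Inverted-file sketch: each point is stored in the list of its nearest centroid."""

    def __init__(self, centroids, points):
        self.centroids = centroids
        self.points = points
        self.lists = [[] for _ in centroids]
        for i, p in enumerate(points):
            nearest = min(range(len(centroids)),
                          key=lambda j: math.dist(p, centroids[j]))
            self.lists[nearest].append(i)

    def search(self, query, k, nprobe=1):
        """Scan only the nprobe clusters closest to the query (approximate)."""
        probed = sorted(range(len(self.centroids)),
                        key=lambda j: math.dist(query, self.centroids[j]))[:nprobe]
        candidates = [i for j in probed for i in self.lists[j]]
        candidates.sort(key=lambda i: math.dist(self.points[i], query))
        return candidates[:k]
```

With this data, the query lies just inside the first cluster while its true nearest neighbor (point 2) sits in the second one. `ToyIVF(CENTROIDS, POINTS).search(QUERY, 2, nprobe=1)` therefore returns points 1 and 0, whereas the exact search returns points 2 and 1. Raising `nprobe` to 2 restores the exact result at the cost of scanning more data, which is the recall/latency interplay in miniature.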
Additional Vector Tech 07/2024 resources
Summary of some articles that caught my attention.
BM42: New Baseline for Hybrid Search
BM25 is the standard algorithm for search engines. Vector database Qdrant now offers BM42 as an experimental approach for shorter texts.
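For readers who have not seen BM25 written out, here is a minimal sketch of the classic Okapi BM25 scoring function in Python. This is the well-known baseline only; it is neither BM42 nor Qdrant's implementation. The defaults `k1=1.2` and `b=0.75` are common textbook values, the IDF variant with `+ 1` inside the logarithm is one of several in use, and the function name is my own.

```python
import math
from collections import Counter


def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document against the query terms with Okapi BM25."""
    if not docs:
        raise ValueError("need at least one document")
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Rare terms get a higher weight than frequent ones.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            # Term frequency saturates (k1) and is normalized by document length (b).
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores
```

The length normalization in the denominator is one reason BM25 behaves differently on very short texts, which is the kind of input the BM42 post is concerned with.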
Source: BM42: New Baseline for Hybrid Search by Andrey Vasnetsov, 01-JUL-2024
Getting started with vectors in 23ai
First steps with Oracle AI Vector Search: creating tables, loading embedding models from ONNX (Open Neural Network Exchange) files, and querying vector data. An interesting feature is the possibility to load embedding models from ONNX files into the database and use the model to create the embeddings. The embeddings are calculated directly in the database, which avoids network traffic but requires sufficient CPU capacity on the database server.
Source: Getting started with vectors in 23ai by Ulrike Schwinn and Stephane Duprat, 16-JUL-2024
and Now Available! Pre-built Embedding Generation model for Oracle Database 23ai by Sherry LaMonica, 17-JUL-2024
Advanced RAG techniques
Advanced techniques for improving RAG are illustrated in the Weaviate blog post:
- Indexing: Methods such as semantic and language model-based chunking improve data storage and retrieval.
- Retrieval: Techniques like hybrid search and query rewriting enhance the recall and relevance of retrieved information.
- Generation: Methods such as autocut, reranking, and fine-tuning LLMs improve the quality and relevance of generated responses by ensuring optimal context is provided to the language models.
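Hybrid search has to merge a keyword-based ranking and a vector-based ranking into one result list. One common way to do this is reciprocal rank fusion (RRF), sketched below in Python. This is my own illustration and an assumption about how the fusion step could look; it is not necessarily the method used in the Weaviate post. The constant `k=60` is a frequently used default, and the function name is hypothetical.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document receives 1 / (k + rank) from every list it appears in,
    with rank starting at 1; documents are sorted by their summed score.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by score (descending); ties are broken by document ID for stability.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

For example, fusing a keyword ranking `["a", "b", "c"]` with a vector ranking `["b", "c", "a"]` puts `"b"` first, because it ranks high in both lists, followed by `"a"` and `"c"`.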
Source: Advanced RAG Techniques by Zain Hasan, 25-JUL-2024
Gartner Hype Cycle for Data Management 2024
Gartner released its Hype Cycle for Data Management in July 2024, with vector databases on their way to the peak of inflated expectations. Gartner expects 5-10 years until the plateau is reached.
BTW, Data Mesh is also mentioned and, as already covered in the 2023 hype cycle, is expected to become obsolete before reaching the plateau. Gartner prefers Data Fabric instead.
Source: Hype Cycle for Data Management, 2024 and compare to Hype Cycle for Data Management, 2023
Stack Overflow 2024 developer survey
Stack Overflow published its 2024 developer survey with several categories. While vector tech such as vector databases or RAG is not directly mentioned, there is still interesting data around databases, AI, and GenAI, e.g.,
- Most admired and desired database: PostgreSQL
- Most developers agree that AI tools will become more integrated, mostly in how they document code (81%), test code (80%), and write code (76%)
- Circulating misinformation and disinformation is the top ethical concern
Source: 2024 Developer survey by Stack Overflow
Looking Ahead: Vector tech conferences or events
A selection of conferences or events containing vector tech sessions:
- Big Data Conference Europe: AI, Cloud and Data Conference, 19-NOV-2024 until 22-NOV-2024, Vilnius and online
- DOAG K&A, 19-NOV-2024 until 22-NOV-2024, Nuremberg
- KI Navigator, 20-NOV-2024 until 21-NOV-2024, Nuremberg
For more articles on Vector Tech, see my blog.