Channel: Data Management & Data Architecture

Vector Vanguard: Tracking the Pulse of Vector Tech 07/2024

Welcome to “Vector Vanguard: Tracking the Pulse of Vector Tech 07/2024” – a source for the latest developments in vector databases, vector indexes, RAG (Retrieval-Augmented Generation), similarity search, and related technologies that caught my attention in the last month.

Featured Vector Tech Topic: choosing a database for generative AI applications

The article “Key considerations when choosing a database for your generative AI applications” names the following considerations:

Familiarity: Choosing a database that your team is already familiar with can significantly reduce the learning curve and speed up the development process. This familiarity ensures that you can leverage existing skills and knowledge, leading to more efficient and effective implementation.

Ease of Implementation: The database should integrate seamlessly with your existing infrastructure, focusing on aspects such as vectorization, management, access control, compliance, and user interface. This ease of implementation helps in minimizing disruptions and facilitates a smoother transition and deployment process.

Scalability: It’s essential to select a database that can handle high-dimensional vectors and support substantial data growth. Scalability ensures that your system can expand and accommodate increasing data loads without compromising performance, which is crucial for the evolving needs of generative AI applications.

Performance: Evaluate the following metrics that cover Machine Learning performance and database performance:

  • Throughput: Number of queries processed per second.
  • Recall: Share of the truly relevant (nearest) vectors that a query actually returns, which determines how accurate the responses can be.
  • Index build time: Duration required to build the vector index.
  • Scale/cost: Ability to efficiently scale to billions of vectors while remaining cost-effective.
  • P99 latency: Latency within which 99% of requests complete, meeting response time expectations.
  • Storage utilized: How efficiently storage is used, which is particularly important for high-dimensional vectors.
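
Three of the metrics above can be made concrete with a few lines of Python. This is a minimal sketch, not taken from the AWS article: the result IDs, the latencies, and the 1536-dimension figure are made-up sample values, the percentile uses the nearest-rank definition (one of several in use), and the storage figure covers only the raw float32 vectors without any index overhead.

```python
import math

def recall_at_k(retrieved, ground_truth, k):
    """Fraction of the true top-k neighbors that appear in the retrieved top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(ground_truth[:k])
    if not truth:
        return 0.0
    return len(truth.intersection(retrieved[:k])) / len(truth)

def percentile(values, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

def raw_vector_bytes(num_vectors, dimensions, bytes_per_component=4):
    """Size of the vectors alone (float32 by default), excluding index overhead."""
    return num_vectors * dimensions * bytes_per_component

# Hypothetical measurements, for illustration only.
approx_ids = [7, 3, 9, 1, 4]   # IDs returned by an approximate index
exact_ids = [7, 3, 1, 8, 2]    # IDs returned by an exact scan
latencies_ms = [12, 15, 11, 14, 250, 13, 12, 16, 11, 13]

print("recall@5:", recall_at_k(approx_ids, exact_ids, k=5))
print("p99 latency (ms):", percentile(latencies_ms, 99))
print("raw bytes for 1M x 1536 float32:", raw_vector_bytes(1_000_000, 1536))
```

Note how a single slow request dominates the P99 value in such a small sample; in practice the percentile only becomes meaningful over many requests.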

The article covers several vector stores that are available within AWS. Familiar relational options include Amazon Aurora PostgreSQL and Amazon Relational Database Service (RDS) for PostgreSQL; both can use the PostgreSQL pgvector extension. NoSQL databases have also been enriched with vector functionality, such as Amazon Neptune Analytics (graph model), Amazon DocumentDB (document model), Amazon MemoryDB (key-value model), and Amazon OpenSearch (search model).

My take

Selecting familiar technologies is always preferable: it extends existing know-how, integrates into the existing infrastructure (monitoring, logging, backup, etc.), and fosters standardization and, especially, professionalization. Many well-known RDBMS and NoSQL databases have already added vector functionality (PostgreSQL, Oracle, etc.). More vendors will jump on the bandwagon soon.

The article also covers performance, which has many facets. It ranges from machine learning performance, such as recall, to database performance, such as throughput or latency. KPIs such as recall and latency influence each other, so it is necessary to understand their interplay. It is therefore not sufficient to improve query response time by creating indexes, as in classical database development. Classical (B-tree) indexes can improve performance but don’t change the result. Vector indexes are different: queries using vector indexes (IVF, HNSW, etc.) are approximate rather than exact, so the index itself can change the result. That is why it is necessary to understand the business requirements, too.
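
To illustrate why a vector index can change the result, here is a toy sketch in plain Python. It is not how any particular database implements IVF: the centroids are simply random data points (no k-means training), the data is random, and names such as `build_ivf`, `ivf_search`, and `nprobe` are chosen for this example only. The idea is that the index scans only the lists closest to the query, so true neighbors stored in other lists can be missed.

```python
import math
import random

def l2(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def exact_search(vectors, query, k):
    """Brute-force scan: always returns the true k nearest neighbors."""
    ranked = sorted(range(len(vectors)), key=lambda i: l2(vectors[i], query))
    return ranked[:k]

def build_ivf(vectors, num_lists, rng):
    """Toy inverted-file index: random data points act as centroids,
    and every vector is assigned to the list of its closest centroid."""
    centroids = rng.sample(vectors, num_lists)
    lists = [[] for _ in centroids]
    for i, v in enumerate(vectors):
        nearest = min(range(num_lists), key=lambda c: l2(v, centroids[c]))
        lists[nearest].append(i)
    return centroids, lists

def ivf_search(vectors, index, query, k, nprobe):
    """Scan only the nprobe lists whose centroids are closest to the query."""
    centroids, lists = index
    order = sorted(range(len(centroids)), key=lambda c: l2(centroids[c], query))
    candidates = [i for c in order[:nprobe] for i in lists[c]]
    candidates.sort(key=lambda i: l2(vectors[i], query))
    return candidates[:k]

rng = random.Random(42)  # fixed seed so runs are repeatable
vectors = [[rng.random() for _ in range(8)] for _ in range(500)]
query = [rng.random() for _ in range(8)]
index = build_ivf(vectors, num_lists=20, rng=rng)

exact = exact_search(vectors, query, k=10)
for nprobe in (1, 5, 20):
    approx = ivf_search(vectors, index, query, k=10, nprobe=nprobe)
    recall = len(set(exact) & set(approx)) / len(exact)
    print(f"nprobe={nprobe:2d} recall@10={recall:.2f}")
```

Probing all 20 lists is equivalent to the exact scan, while probing fewer lists touches less data at the risk of lower recall. This is the recall/latency interplay in miniature.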

Additional Vector Tech 07/2024 resources

Summary of some articles that caught my attention.

BM42: New Baseline for Hybrid Search

BM25 is the standard algorithm for search engines. Vector database Qdrant now offers BM42 as an experimental approach for shorter texts.

Source: BM42: New Baseline for Hybrid Search by Andrey Vasnetsov, 01-JUL-2024
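
As background for the BM25 baseline mentioned above, the following is a minimal sketch of the classic Okapi BM25 score in plain Python. It illustrates BM25 only, not Qdrant’s BM42. The defaults `k1=1.2` and `b=0.75` are commonly used values and an assumption here, the IDF uses the smoothed "+1" variant, and the tokenized sample documents are made up.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.2, b=0.75):
    """Score each tokenized document in docs against query_terms with Okapi BM25."""
    if not docs:
        raise ValueError("docs must not be empty")
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue  # term absent from this document
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            # Term frequency saturates via k1; b controls length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical mini corpus, already tokenized.
docs = [
    ["vector", "database"],
    ["graph", "database"],
    ["vector", "vector", "index"],
]
print(bm25_scores(["vector"], docs))
```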

Getting started with vectors in 23ai

First steps with Oracle AI Vector Search: creating tables, loading embedding models from ONNX (Open Neural Network Exchange) files, and querying vector data. An interesting feature is the possibility to load embedding models from ONNX files into the database and use them to create the embeddings. The embeddings are calculated directly in the database, which avoids network traffic but requires sufficient CPU capacity on the database server.

Source: Getting started with vectors in 23ai by Ulrike Schwinn and Stephane Duprat, 16-JUL-2024
and Now Available! Pre-built Embedding Generation model for Oracle Database 23ai by Sherry LaMonica, 17-JUL-2024

Advanced RAG techniques

Advanced techniques for improving RAG are illustrated in the Weaviate blog post:

  • Indexing: Methods such as semantic and language model-based chunking improve data storage and retrieval.
  • Retrieval: Techniques like hybrid search and query rewriting enhance the recall and relevance of retrieved information.
  • Generation: Methods such as autocut, reranking, and fine-tuning LLMs improve the quality and relevance of generated responses by ensuring optimal context is provided to the language models.

Source: Advanced RAG Techniques by Zain Hasan, 25-JUL-2024
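
Hybrid search, listed under retrieval above, needs a way to merge a keyword ranking with a vector ranking. One common option is reciprocal rank fusion (RRF). The sketch below is an assumption for illustration and not taken from the Weaviate post: the function name, the document IDs, and the constant `k=60` (a frequently used default) are all chosen for this example.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked ID lists; each list contributes 1 / (k + rank) per item."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], str(d)))

# Hypothetical result lists from a keyword search and a vector search.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, without having to normalize the incomparable raw scores of the two search methods.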

Gartner Hype Cycle for Data Management 2024

The hype cycle mentioned in the heading was released in July 2024 with vector databases on their way to the peak of inflated expectations. Gartner expects 5-10 years until the plateau is reached.

BTW, Data Mesh is also mentioned and is expected to become obsolete before reaching the plateau, as already covered in the 2023 hype cycle. Gartner prefers Data Fabric instead.

Source: Hype Cycle for Data Management, 2024 and compare to Hype Cycle for Data Management, 2023

Stack Overflow 2024 developer survey

Stack Overflow published its 2024 developer survey with several categories. While vector tech such as vector databases or RAG is not directly mentioned, there is still interesting data around databases, AI, and GenAI, e.g.:

  • Most admired and desired database: PostgreSQL
  • Most developers agree that AI tools will become more integrated into how they document code (81%), test code (80%), and write code (76%)
  • Circulating misinformation and disinformation is the top ethical concern

Source: 2024 Developer survey by Stack Overflow

Looking Ahead: Vector tech conferences or events

A selection of conferences or events containing vector tech sessions:

For more articles on vector tech, see my blog.

