
Building AI Applications - What Will Your Development Process Be?


Generative AI systems depend on data, and that will frequently involve vector databases. Vector databases are specifically designed to handle data for Generative AI systems and make it easier to support the kinds of search patterns that your application will need. So what are they, and how can you pick the right approach?

Why LLMs need vector data

Large language models (LLMs) translate natural language requests into data that machines can understand, and then translate those results back into natural language responses for the user. The LLM uses the data that it was trained on to complete these responses. However, the LLM does not update itself with any new information over time. This leads to Generative AI systems quoting references that do not exist, creating fake information, or misstating otherwise accurate information. These are commonly termed ‘hallucinations’.

To deal with this, we can bring in more data for the LLM to use. For optimal performance, this data is commonly stored in a vector database, where each item is encoded as a set of numerical values, or embedding, and made available for similarity search. These vectors capture the semantic meaning of terms and the contexts in which they appear together, giving far more precise results than traditional keyword-based text search tools.
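To make semantic similarity concrete, here is a minimal sketch of cosine similarity, the comparison most vector searches rely on. The four-dimensional vectors are invented purely for illustration; real embeddings typically run to hundreds or thousands of dimensions.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: close to 1.0 for similar meaning, near 0.0 for unrelated."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical 4-dimensional embeddings, for illustration only
cat   = np.array([0.8, 0.1, 0.6, 0.2])
puma  = np.array([0.7, 0.2, 0.5, 0.3])
lorry = np.array([0.1, 0.9, 0.2, 0.8])

print(cosine_similarity(cat, puma))   # high: semantically related terms
print(cosine_similarity(cat, lorry))  # low: semantically distant terms
```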

To get started, you will have to turn your existing data into a set of vectors. Depending on your data format, there are multiple tools and libraries that can cover this translation step. Projects like Word2vec and BERT (Bidirectional Encoder Representations from Transformers) by Google Research can translate words into vector values using a neural network approach.
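As a sketch of that translation step, a BERT-derived model from the open source sentence-transformers library can vectorise text in a few lines. The library and model name here are illustrative choices, not ones the article prescribes:

```python
# Sketch: turning text into embeddings with sentence-transformers
# (a BERT-derived model; library and model name are illustrative choices).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # produces 384-dimensional vectors

documents = [
    "Vector databases store embeddings for similarity search.",
    "Apache Cassandra is an open source distributed database.",
]

# encode() returns one vector per input string
embeddings = model.encode(documents)
print(embeddings.shape)  # (2, 384)
```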

Implementing a vector database

Once you have vectorised your data, you then have to store it in a vector database. Building your own vector database makes the search and data return process much more efficient, as it can include more data on topics and areas that are specific to your organisation.

Once you have your data as vectors, you can use mathematical techniques to search very large datasets for similar results. The algorithms commonly used here deliver Approximate Nearest Neighbour (ANN) results, which should be similar to your original request. The most widely used is Hierarchical Navigable Small World (HNSW), a graph-based search that traverses connections between data values to find similar results. For vector data, an HNSW search surfaces other items that share the same or similar semantic properties. These results are then fed back into the LLM and provided back to the user.
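As a sketch of what ANN search looks like in code, the open source hnswlib library implements HNSW directly. The parameters below are illustrative assumptions rather than recommendations:

```python
# Sketch: approximate nearest neighbour search over an HNSW index,
# using the hnswlib library (one common open source HNSW implementation).
import hnswlib
import numpy as np

dim = 384
num_elements = 10_000
data = np.random.rand(num_elements, dim).astype(np.float32)  # stand-in embeddings

index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=num_elements, ef_construction=200, M=16)
index.add_items(data, np.arange(num_elements))

index.set_ef(50)  # trade-off: higher ef means better recall, slower queries
labels, distances = index.knn_query(data[0], k=5)
print(labels, distances)  # the 5 approximate nearest neighbours
```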


This can be very useful when you want to find multiple items that are similar to the original request. Using vector similarity searches, you can focus on the similarity that exists between different items based on their semantic characteristics, rather than relying solely on keywords. Using a vector database allows you to bring in your own data for searches, which will be faster and more specific than querying an LLM alone. These results do have to be provided to the LLM to be synthesised into a user response, which has to be factored into response timing, but the responses should be much more accurate.
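Put together, that retrieval-then-synthesis flow looks roughly like the sketch below. The vector_search and llm_complete functions are hypothetical placeholders for your own vector database client and LLM API:

```python
# Sketch of the retrieval-augmented flow described above.
# vector_search and llm_complete are hypothetical stand-ins for your
# vector database client and LLM API respectively.
def answer(question: str, vector_search, llm_complete, k: int = 3) -> str:
    # 1. Retrieve the k most semantically similar documents
    passages = vector_search(question, k=k)

    # 2. Assemble a prompt that grounds the LLM in the retrieved data
    context = "\n".join(f"- {p}" for p in passages)
    prompt = (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

    # 3. The LLM synthesises the retrieved results into a response
    return llm_complete(prompt)
```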

Implementing a vector database involves looking at the database that you will use, where you will run that instance, and how it will connect to the LLM. There are multiple vector database options on the market, so your choice will depend on how much vector data you have to support and how far you expect to scale. This is hard to estimate up front, so consider a serverless database approach where you only pay for what you actually consume rather than overestimating your needs.

From a scale perspective, the open source database Apache Cassandra provides scalability and performance that can match the most demanding applications. The likes of Uber use Cassandra in their machine learning and predictive AI deployments, so it can support very large ML and AI workloads. Cassandra 5.0 introduces support for vector data workloads as well, which will suit developers who want to reduce the number of database platforms that they have in place.
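As a sketch of what this looks like with Cassandra 5.0 and the Python cassandra-driver package, the CQL below creates a table with a vector column and runs an ANN query over it. The keyspace, table and index names are illustrative, and the exact CQL may vary with your Cassandra version:

```python
# Sketch: vector search in Cassandra 5.0 via the cassandra-driver package.
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("demo_keyspace")  # illustrative keyspace

# A table with a 384-dimensional vector column alongside the source text
session.execute("""
    CREATE TABLE IF NOT EXISTS docs (
        id int PRIMARY KEY,
        body text,
        embedding vector<float, 384>
    )
""")

# Vector search in Cassandra 5.0 is served by a storage-attached index
session.execute("""
    CREATE CUSTOM INDEX IF NOT EXISTS docs_ann ON docs (embedding)
    USING 'StorageAttachedIndex'
""")

# Stand-in query vector; in practice this comes from your embedding model
query_embedding = [0.1] * 384

# ANN OF returns the rows whose embeddings are closest to the query vector
rows = session.execute(
    "SELECT id, body FROM docs ORDER BY embedding ANN OF %s LIMIT 5",
    [query_embedding],
)
for row in rows:
    print(row.id, row.body)
```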

Integrating everything together

Alongside choosing your LLM and preparing your data as vectors, consider how your users will interact with your service and how you will return the results to them. Will you provide a chatbot service, or will you support other methods like image search? Will you support other formats for data, such as sending PDF reports or other kinds of documents?

Considering the entire enterprise, you will have to integrate multiple tools together that can support those processes and work across multiple data sources and even LLMs. This integration work will connect up your Generative AI system so that you can fulfil your application functionality, but it leads to more maintenance and management overheads for all those connections over time. As components are updated or get more functionality added, you may end up looking at more and more integration work to keep things running.

To support this more effectively, you can use an abstraction layer for these integrations, employing frameworks and gateways that will manage these areas on your behalf. For example, services like LangChain fill the gaps that exist between LLMs and data sources, as well as managing integration with different output tools like chat services, PDF creation tools and other software components. Rather than integrating these services together yourself, LangChain can plug those components in as needed.
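As an indicative sketch, a handful of LangChain calls can wire an LLM to a Cassandra-backed vector store. LangChain's import paths and class names change between releases, so treat this as the shape of the pattern rather than exact current API:

```python
# Sketch: LangChain gluing an LLM, an embedding model and a vector store.
# Import paths reflect LangChain releases around this article's publication
# and may differ in newer versions.
from cassandra.cluster import Cluster
from langchain.embeddings import OpenAIEmbeddings
from langchain.llms import OpenAI
from langchain.vectorstores import Cassandra
from langchain.chains import RetrievalQA

session = Cluster(["127.0.0.1"]).connect()

# LangChain manages the embedding and retrieval plumbing on your behalf
store = Cassandra(
    embedding=OpenAIEmbeddings(),
    session=session,
    keyspace="demo_keyspace",  # illustrative names
    table_name="docs",
)

# A retrieval chain: search the vector store, then have the LLM synthesise
qa = RetrievalQA.from_chain_type(llm=OpenAI(), retriever=store.as_retriever())
print(qa.run("What does our documentation say about scaling?"))
```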

Alongside services like LangChain, there are other elements that you will have to connect together. For Apache Cassandra, the open source project CassIO makes Generative AI deployment and integration easier, as it provides a set of standardised facilities for interacting with Cassandra through the patterns typically needed by Machine Learning (ML) and LLM applications. This includes support for injecting data from a feature store like the open source Feast, data pipelines for batch and stream processing, and tools for Extract-Transform-Load (ETL) operations on data. All of these tools can play specific roles within your Generative AI service, but integrating and managing them all yourself is a potential headache that you can avoid.
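As a brief sketch, CassIO is initialised once with a Cassandra session, after which its table abstractions handle the vector-oriented schema and queries for you. The table name and dimension below are illustrative, and class names may vary between CassIO versions:

```python
# Sketch: initialising CassIO over an existing Cassandra session.
import cassio
from cassandra.cluster import Cluster
from cassio.table import MetadataVectorCassandraTable

session = Cluster(["127.0.0.1"]).connect()

# cassio.init wires up a session and keyspace that CassIO's abstractions reuse
cassio.init(session=session, keyspace="demo_keyspace")

# A vector-plus-metadata table following typical ML/LLM access patterns;
# it exposes insert and ANN-search style methods so you avoid hand-written CQL
table = MetadataVectorCassandraTable(table="ml_docs", vector_dimension=384)
```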

Developing Generative AI services in practice

Whatever your Generative AI requirements, you will have to integrate multiple different services and components together, and that integration carries ongoing overheads over time. The open source community is putting together tools that can make this easier to manage, so investigate what is available and where you can take advantage of these projects. If you can, be sure to contribute to those projects and support them where possible.

Vector databases are at the heart of Generative AI. Without them, it is expensive and difficult to add the data that your users want to see as part of their responses. Using your data as vectors, you can improve the results that your application delivers, and retain control over that data as well.

Dom Couldwell

Dom Couldwell is Head of Field Engineering EMEA at DataStax, a real-time data and AI company. Dom helps companies to implement real-time applications based on an open source stack that just works. His previous work includes more than two decades of experience across a variety of verticals including Financial Services, Healthcare and Retail. Prior to DataStax, he worked for the likes of Google, Apigee and Deutsche Bank.

