Do dedicated generative AI software developers exist? Data platform company Redis thinks so. The company this month came forward with the difficult-to-pronounce LangCache, a managed semantic caching service for AI apps and agents, alongside vector sets, a new native data type for Redis that allows developers to access and work with vectors in more composable and scalable ways. This is complex technology – and it is positioned as a comprehensive data architecture for building generative AI applications and agents – so what does it all mean?
Let’s take the components here one by one. A managed semantic caching service intelligently stores and retrieves information based on the meaning or context of a query, rather than on exact matches. As database purists will know, vector sets are a data type designed to store collections of vectors with optional attributes; they are similar to Redis’ “sorted sets”, but use string representations (a method for storing characters or numbers in computer memory) of vectors instead of numeric scores.
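The core idea of semantic caching can be sketched in a few lines of Python. This is a minimal, illustrative in-memory cache, not LangCache itself: the `SemanticCache` class, its methods and the 0.9 similarity threshold are assumptions made for the example, and a real service would generate embeddings via a model provider rather than accept them as ready-made lists.

```python
import math


def cosine_similarity(a, b):
    """Measure how closely two embedding vectors point in the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Return a cached response when a new query *means* roughly the same thing."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []  # list of (embedding, cached_response) pairs

    def get(self, query_embedding):
        """Return the best-matching cached response, or None on a cache miss."""
        best_response, best_score = None, 0.0
        for embedding, response in self.entries:
            score = cosine_similarity(query_embedding, embedding)
            if score > best_score:
                best_response, best_score = response, score
        return best_response if best_score >= self.threshold else None

    def put(self, query_embedding, response):
        """Store a (query embedding, LLM response) pair for later reuse."""
        self.entries.append((query_embedding, response))
```

Here, two queries “match” when the cosine similarity of their embeddings clears the threshold, so a paraphrased question can hit the cache – and skip the LLM call entirely – even though its text differs from the original prompt.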
Cutting costly calls
The company says that LangCache allows developers to integrate Redis-based LLM response caching into applications. This data architecture is said to reduce costly calls to LLMs by storing and reusing prompts and responses, minimising cost, improving prompt accuracy and delivering faster AI.
Redis also introduced vector sets, a new native data type in Redis. Vector sets allow developers to access and work with vectors and use them in more composable and scalable ways. Vector sets complement Redis’ existing vector similarity search, offering a lower-level way to work with vectors.
LangCache lets developers improve the accuracy of LLM cache retrieval using a custom fine-tuned model and configurable search criteria, including search algorithm and threshold distance. Software engineers can generate embeddings through their preferred model provider, eliminating the need to separately manage models, API keys and model-specific variables.
“Vector sets take inspiration from sorted sets, one of Redis’s fundamental data types known for its efficiency in handling ordered collections. The new data type extends this concept by allowing the storage and querying of high-dimensional vector embeddings, which are crucial for various AI and machine learning applications,” noted Redis, in a technical briefing statement.
Quantization quotient
Vector sets also implement some additional capabilities, including quantization, the process of converting continuous values into a smaller set of discrete values. Fundamentally, this process works by “rounding” real-world values to digital representations so that a data flow can approximate a continuous range of values with a finite number of discrete levels. In a vector set, the vectors are quantized by default to 8-bit values. However, this can be modified to no quantization or binary quantization when adding the first element.
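The “rounding” described above can be illustrated with a toy 8-bit quantizer. This is a sketch of the general technique, not Redis’ exact internal scheme, and the helper names are invented for the example: each float is mapped to one of 255 signed integer levels relative to the vector’s largest magnitude, then scaled back on read.

```python
def quantize_int8(vector):
    """Map floats to discrete int8 levels in [-127, 127]."""
    scale = max(abs(v) for v in vector) or 1.0  # avoid division by zero
    return [round(v / scale * 127) for v in vector], scale


def dequantize_int8(levels, scale):
    """Approximate the original floats from the discrete levels."""
    return [level / 127 * scale for level in levels]
```

The pay-off is memory: each component shrinks from a 4-byte float to a single byte, at the cost of a small, bounded rounding error in similarity calculations.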
Dimensionality reduction is also in the mix here i.e. the number of dimensions in a vector can be reduced by random projection by specifying the relevant option and the target number of dimensions. In filtering, each element of the vector set can be associated with a set of attributes specified as a JSON blob via the VADD or VSETATTR command. This makes it possible to restrict a VSIM similarity search to the subset of elements that satisfy a filter expression.
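Random projection itself can be sketched as follows. The function names are invented for illustration and Redis’ actual implementation details are not shown; the point is simply that every vector in a set is multiplied by the same fixed random matrix, producing a shorter vector that roughly preserves distances between vectors.

```python
import random


def make_projection(in_dim, out_dim, seed=42):
    """Build one fixed random matrix, shared by every vector in the set."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1 / out_dim ** 0.5) for _ in range(in_dim)]
            for _ in range(out_dim)]


def project(vector, matrix):
    """Reduce len(vector) dimensions down to len(matrix) dimensions."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]
```

Because the matrix is fixed, two vectors that were close in the original high-dimensional space remain close after projection, so similarity search still works on the smaller – and cheaper to store – vectors.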
Other new tools and features for AI developers include Redis Agent Memory Server, an open source service that provides memory management for AI apps and agents. Users can manage short-term and long-term memory for AI conversations, with features like automatic topic extraction, entity recognition and context summarisation.
Agent architectures & agentic apps
A portfolio of native integrations exists for LangGraph, specifically designed for agent architectures and agentic apps. Developers can use Redis to build a LangGraph agent’s short-term memory via checkpointers and its long-term memory via Store, as well as using Redis as a vector database, an LLM cache and a rate limiter.
“Generative AI requires a wide array of data types, so developers need a platform that can handle it all fast, at scale, multi-cloud or hybrid. New features in Redis Cloud ensure devs can easily build and deliver real-time generative AI apps faster while optimising the total cost of ownership,” noted Redis.
Developers can now view, update, query and search the data in Redis directly from their browser. Redis Insight gives access to the Redis developer environment, including the “workbench”, tutorials and new query autocompletion, which pulls in and suggests schema, index and key names from Redis data in real time so developers can write queries faster and more easily.