LLM Applications: Delving into LangChain's Modules and Use Cases
May 22, 2023
The release of AI chatbots like ChatGPT has introduced large language models (LLMs) as a revolutionary technology, enabling developers to construct applications leveraging pre-trained LLMs. However, utilizing these LLMs in isolation often falls short of creating an app with true prowess. The real potential lies in combining them with other computational resources and knowledge.
This is where LangChain comes in – a framework designed to facilitate the development of LLM-powered applications by establishing connections between AI models and diverse data sources, enabling customization of natural language processing (NLP) solutions.
In this article, we will discuss how LangChain works and delve into its modules and use cases.
What is LangChain?
LangChain is a software development framework that AI companies can use to build applications powered by LLMs. The framework is built around two core principles:
- Building data-aware applications that can establish connections between a language model and various data sources.
- Employing agentic abilities in applications that empower a language model to actively engage and interact with its surroundings.
The LangChain framework offers two key value propositions to help businesses build their LLM-powered applications for a variety of use cases.
1. Modular Components:
LangChain offers modular abstractions for essential language model components along with comprehensive implementations. These user-friendly components can be utilized independently from the rest of the LangChain framework.
2. Use-Case Specific Chains:
Chains in LangChain are configurations that optimize the assembly of components for specific use cases. They serve as a user-friendly interface to facilitate the easy initiation of specific use cases. AI companies can customize these chains for personalized adaptations.
7 Major LangChain Use Cases for Businesses
LangChain is data-aware (it connects language models with diverse data sources) and agentic (its models can interact with their environment). Companies can leverage its modules in a variety of ways to accomplish different use cases for LLM applications. Let’s explore some of them below.
1. Question Answering Over Specific Documents
Companies can use the LangChain framework to build LLM-powered question-answering applications that can even process documents the underlying model hasn't been trained on.
LangChain provides a “retrieval augmented generation” process: given an input question, it finds relevant documents and converts them into a queryable format, then passes the documents, along with the original question, to the language model to generate an answer.
Businesses can benefit from this use case in the following manner:
- Generate answers to questions based on the content of multiple documents.
- Cite document sources for the generated responses.
- Retrieve answers from a vector database such as Chroma.
- Semantically search and ask questions over a group chat.
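The retrieval-augmented-generation flow described above can be sketched in plain Python. This is a framework-agnostic illustration of the shape of the pipeline, not LangChain's actual API: the word-overlap `score` function stands in for real embedding-based vector search, and the final prompt would be sent to an LLM rather than printed.

```python
# Minimal retrieval-augmented-generation sketch (framework-agnostic).
# A real app would use embeddings plus a vector store (e.g. Chroma) and
# an LLM; word overlap and a printed prompt stand in for both here.

def score(question, doc):
    """Crude relevance score: number of shared lowercase words."""
    q_words = set(question.lower().replace("?", "").split())
    return len(q_words & set(doc.lower().split()))

def retrieve(question, docs, k=2):
    """Return the k documents most relevant to the question."""
    return sorted(docs, key=lambda d: score(question, d), reverse=True)[:k]

def answer(question, docs):
    """Stuff the retrieved docs and the original question into one prompt."""
    context = "\n".join(retrieve(question, docs))
    return (f"Answer using only this context:\n{context}\n\n"
            f"Question: {question}")   # a real app sends this to an LLM

docs = [
    "LangChain connects language models to external data sources.",
    "Chroma is an open-source vector database.",
    "Paris is the capital of France.",
]
print(answer("What is Chroma?", docs))
```

The key point is that the model answers from retrieved context it was never trained on, which is what makes question answering over private documents possible.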
2. Building Chatbots
Remember ChatGPT, the powerful OpenAI chatbot that became a tech phenomenon overnight last year? LangChain can help companies build similar chatbots by leveraging language models' ability to generate high-quality conversational text.
Using LangChain, companies can develop chatbots using three main components:
- A language model to generate responses
- A PromptTemplate to give the chatbot some character, i.e., make it helpful, funny, or informative.
- And a memory to retain previous conversations
When integrated with additional data sources, chatbots can become more powerful and generate knowledgeable responses.
By leveraging LangChain, businesses can build chatbots for different use cases, such as creating:
1. A ChatGPT clone: Users can utilize the following LangChain libraries to create a ChatGPT chain: OpenAI, ConversationChain, LLMChain, and PromptTemplate.
2. A voice-enabled ChatGPT variant that leverages the speech_recognition and pyttsx3 libraries for speech-to-text and text-to-speech conversion, respectively.
3. Conversational agents that can engage in chat-like interactions with the users. LangChain offers a specific agent named conversational-react-description that is optimized for conversation. It must be used in combination with a LangChain memory component.
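The three components listed above (model, prompt template, memory) can be wired together in a few lines. The sketch below is purely illustrative, not LangChain's actual classes: `stub_llm` is a placeholder for a real model call.

```python
# Sketch of the three chatbot components: a model, a prompt template,
# and memory. stub_llm is a placeholder for a real LLM call.

def stub_llm(prompt):
    return f"[reply to: {prompt.splitlines()[-1]}]"

# The template gives the bot its character and slots in the memory.
TEMPLATE = "You are a helpful, friendly assistant.\n{history}\nHuman: {message}\nAI:"

class ChatBot:
    def __init__(self, llm):
        self.llm = llm
        self.history = []          # memory: retained across turns

    def chat(self, message):
        prompt = TEMPLATE.format(history="\n".join(self.history),
                                 message=message)
        reply = self.llm(prompt)
        self.history.append(f"Human: {message}")   # remember the exchange
        self.history.append(f"AI: {reply}")
        return reply

bot = ChatBot(stub_llm)
bot.chat("Hi there!")
bot.chat("What did I just say?")
print(bot.history)
```

Because the history is replayed into every prompt, the model can answer the second question in the context of the first, which is exactly the role the memory component plays.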
3. Text Summarization
Another LangChain use case is text summarization. It involves splitting long-form documents into smaller chunks and summarizing them recursively, i.e., summarizing each chunk and then summarizing the summaries until only one summary remains. Users can choose between three chain types to achieve text summarization with LangChain: stuff, map_reduce, and refine.
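The recursive map-reduce idea can be sketched as follows. This is a stand-alone illustration of the algorithm, not LangChain's implementation: `stub_summarize` simply truncates text where a real pipeline would call an LLM.

```python
# Map-reduce summarization sketch: summarize chunks (map), join the
# summaries, and repeat (reduce) until one summary remains.
# stub_summarize stands in for an LLM summarization call.

def split_into_chunks(text, size=100):
    return [text[i:i + size] for i in range(0, len(text), size)]

def stub_summarize(text):
    # Placeholder "summary": the first 40 characters.
    return text[:40]

def map_reduce_summary(text, chunk_size=100):
    chunks = split_into_chunks(text, chunk_size)
    while len(chunks) > 1:                               # reduce loop
        summaries = [stub_summarize(c) for c in chunks]  # map step
        combined = " ".join(summaries)
        chunks = split_into_chunks(combined, chunk_size) # re-chunk, repeat
    return stub_summarize(chunks[0])

long_text = "LangChain supports document summarization. " * 20
print(map_reduce_summary(long_text))
```

Because every pass shrinks the text, the loop always terminates, and no single LLM call ever has to fit the whole document in its context window, which is the point of the map_reduce chain type.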
Businesses looking to summarize long documents for different purposes can benefit from text summarization in the following ways:
1. Gather summarized information from a large volume of documents, internal and external.
2. Summarize lengthy reports, research papers, newsletters, financial documents, technical documents, or emails to improve productivity.
3. Analyze tons of customer feedback, reviews, and surveys to make data-driven business decisions.
4. Extracting Structured Information
Companies can use LangChain to extract structured information from unstructured text. This is vital since most companies use APIs and databases that only work with structured data.
Using LangChain, they can use extraction for the following use cases:
1. Converting a sentence into a structured row suitable for database insertion.
2. Transforming a lengthy document into multiple rows for database insertion.
3. Identifying and extracting the accurate API parameters from a user query.
In LangChain, output parsers play a crucial role in defining the desired response format and converting the raw-text output of a language model into structured information.
Companies can use different types of output parsers to generate desired outputs that are suitable for different use cases.
Some of these LangChain output parsers include:
1. CommaSeparatedListOutputParser - returns the model's response as a list of comma-separated items.
2. OutputFixingParser - fixes any mistakes or adjusts the output of another output parser to produce an error-free output.
3. PydanticOutputParser - enables users to define an arbitrary JSON schema and effectively query LLMs for producing JSON outputs that align with the specified schema.
4. RetryOutputParser - Designed to handle cases where the initial output fails and generate a better response by taking in the prompt and the original output.
5. StructuredOutputParser - enables the extraction and organization of specific information from the model's response based on a predefined schema.
To extract complicated schemas, businesses can use the Kor library. It uses LLM-backed LangChain chains and output parsers to offer deeper extraction of structured information from text.
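To illustrate what an output parser does, here is a minimal comma-separated-list parser in the spirit of CommaSeparatedListOutputParser. It is not LangChain's implementation, only a sketch of the contract: supply format instructions for the prompt, then convert the model's raw text into structured data.

```python
# Minimal output parser in the spirit of CommaSeparatedListOutputParser
# (illustrative only, not LangChain's actual implementation).

class CommaListParser:
    def get_format_instructions(self):
        # Appended to the prompt so the model knows the expected format.
        return "Answer as a comma-separated list, e.g. `a, b, c`."

    def parse(self, raw_output):
        # Turn the model's raw text into structured data: a list of strings.
        return [item.strip() for item in raw_output.split(",") if item.strip()]

parser = CommaListParser()
raw = "red, green,  blue "            # pretend this came back from an LLM
print(parser.parse(raw))              # → ['red', 'green', 'blue']
```

The two-sided contract (format instructions into the prompt, parsing out of the response) is what lets raw LLM text feed APIs and databases that only accept structured data.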
5. Building Agents to Interact With Different Environments
Agents combine the cognitive abilities of LLMs with specialized LangChain tools to establish an autonomous framework that can interact with the outside world and proficiently execute solutions.
To use LangChain agents, companies can leverage native tools or build custom ones that give agents access to the outside world. They can also use built-in LangChain agents, or customize them to improve performance and control how they behave.
Businesses can use agents for diverse applications, such as:
1. Building custom AI plugins that can retrieve information from many tools based on the user’s input query.
2. Integrating the Plug and PLai library with LLMs, which offers numerous AI plugins across categories like shopping, productivity, travel, marketing, and more.
3. Building context-aware AI agents that can understand conversations on a deeper level, such as an AI sales agent that can talk with prospects naturally and automate the activities of a sales representative.
4. Creating multi-modal agents that can use models like OpenAI's DALL-E to generate images.
6. Evaluating LLM Output
Evaluating generative models using conventional metrics can be challenging for AI companies due to a lack of data and metrics. It's difficult to evaluate if the generated LLM output is of high quality. However, LangChain offers the following solutions that companies can benefit from:
- LangChainDatasets, a community space intended to provide open-source datasets for evaluating common LangChain agents and chains.
- Tracing, a UI-based visualization tool for tracking and observing the execution of your chain and agent processes.
- Language models to assess outputs utilizing various purpose-built chains and prompts that LangChain offers.
Using LangChain, businesses can evaluate:
- API chains, such as the OpenAPI chain, to check whether the chain successfully accessed the endpoint and produced the correct result.
- Question-answering tasks over documents, vector databases, and SQL databases.
7. Querying Tabular Data
Since lots of information is stored in the form of tabular data, businesses can use LangChain to query this data. The process involves using language models to query structured data, including SQL tables, CSV, and Excel sheets.
LangChain offers the following solutions:
- A document loader, such as CSVLoader, to load data into a document, then index and query it.
- Chains (predetermined LangChain steps) for getting started with querying simple/small tabular data.
- Agents for more complex and larger databases and schemas since they are more powerful and involve multiple queries to the LLM.
LangChain Modules
LangChain provides a variety of abstract modules that businesses can use to build various applications.
Models
LangChain provides two primary model types to work with: language models and embedding models. A language model is designed to understand input text and generate relevant text in response. Businesses can use these models to build applications with a natural language interface, such as language translation tools, virtual assistants, writing assistants, and text-to-image generators.
An embedding model maps input text onto float vectors. These float vectors, called text embeddings, are a way for computers to understand the text. Embeddings can be used for niche tasks like text clustering and building custom NLP applications for business use cases.
Large Language Models
LLMs are the heart of modern language AI; they are responsible for understanding input text and generating responses. LangChain's `langchain.llms` module provides a simple interface for interacting with pre-trained LLMs from providers like OpenAI, Cohere, and Hugging Face.
The `llm` object can be used in complex ways, such as passing multiple prompts and getting multiple corresponding responses. It also exposes provider-specific information, such as the number of tokens a piece of text amounts to under a given model.
LangChain provides various functionalities for its LLM module, such as:
- Async Support: Process multiple prompts in parallel for faster querying
- Human input LLMs: Mock out calls to the LLM in a test environment to simulate how a human would respond to a given input query.
- Stream Responses: LangChain allows streaming responses for OpenAI, ChatOpenAI, and ChatAnthropic APIs.
These capabilities allow businesses to build high-performing, real-time applications for an enhanced user experience. The functionalities mentioned are generic across the available LLM integrations and can be used in the same way for each.
LLMs can also be provided with system messages which help define their roles. For example, you can tell the LLM that it is a finance expert, and it will answer all your queries in that context.
Text Embeddings
Every piece of text is converted into a numerical vector before it is passed to a language model. These vectors are called embeddings, and there are several ways to calculate them.
Several LLM providers have methods for calculating embeddings, and the LangChain Embeddings module provides a standard interface for all of them. The base Embeddings class has two essential methods: `embed_documents` and `embed_query`. The former embeds multiple documents, while the latter embeds a single query.
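The shape of that interface can be illustrated with a toy embedding model. The bag-of-words vectors below carry no real semantics and the class is hypothetical; the point is only the `embed_documents` / `embed_query` contract: many texts to many vectors, one text to one vector.

```python
# Toy illustration of the Embeddings interface contract. The bag-of-words
# vectors are for demonstration only; real embeddings come from a model.

VOCAB = ["cat", "dog", "car", "road"]

class ToyEmbeddings:
    def embed_query(self, text):
        # One text in, one float vector out.
        words = text.lower().split()
        return [float(words.count(w)) for w in VOCAB]

    def embed_documents(self, texts):
        # Many texts in, a list of float vectors out.
        return [self.embed_query(t) for t in texts]

emb = ToyEmbeddings()
print(emb.embed_query("the cat chased the dog"))   # → [1.0, 1.0, 0.0, 0.0]
print(emb.embed_documents(["cat cat", "car on road"]))
```

Because every text lands in the same vector space, similarity between a query vector and document vectors is what powers clustering and retrieval.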
Prompts
The pieces of text input to a language model are known as prompts because they prompt the model to generate a relevant response. LangChain's prompts module allows users to construct useful text prompts to use as inputs for LLMs.
The sub-module of interest here is `PromptTemplate`, which allows users to produce a reusable text template for building custom LLM applications. An example prompt is:
“Tell me a joke about {subject}”
This generic prompt will generate jokes about whatever subject the user specifies, making the template reusable across generic applications.
By default, input variables use Python f-string formatting. However, LangChain also supports the Jinja2 template format.
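As a sketch, the reusable-template idea reduces to Python string formatting with named slots; LangChain's PromptTemplate wraps this pattern with input-variable validation. The `format_prompt` helper below is a hypothetical stand-in, not LangChain's API.

```python
# A prompt template is essentially a reusable string with named slots.
# format_prompt is a hypothetical helper illustrating the idea.

template = "Tell me a joke about {subject}"

def format_prompt(template, **kwargs):
    # Fill the named slots to produce a concrete prompt for the LLM.
    return template.format(**kwargs)

print(format_prompt(template, subject="chickens"))
# → Tell me a joke about chickens
```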
The prompts module also provides functionality to pass multiple prompt examples to help the model output better responses. These are called Few Shot Learning Examples. To utilize these, we import the `FewShotPromptTemplate` sub-module. With this, businesses can specify certain inputs and outputs to help the LLM perform better and output relevant results.
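The few-shot idea can be sketched by prepending worked examples to the prompt so the model can infer the task pattern. The antonym task and all the helper names below are illustrative, not LangChain's FewShotPromptTemplate API.

```python
# Few-shot prompting sketch: prepend worked input/output examples so the
# model can infer the task pattern. The antonym task is illustrative.

examples = [
    {"word": "happy", "antonym": "sad"},
    {"word": "tall", "antonym": "short"},
]
example_template = "Word: {word}\nAntonym: {antonym}"

def few_shot_prompt(examples, query):
    # Render each example, then append the unanswered query.
    shots = "\n\n".join(example_template.format(**e) for e in examples)
    return (f"Give the antonym of each word.\n\n{shots}\n\n"
            f"Word: {query}\nAntonym:")

print(few_shot_prompt(examples, "big"))
```

The model completes the text after the final "Antonym:", so the examples steer both the format and the content of its answer.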
The prompts module can be used to tailor language models to niche tasks, such as writing poetry on a given topic.
For example, a customer service chatbot can be prompted to answer queries in a professional format specific to your business domain. Similarly, an educational chatbot can be prompted to answer questions technically, with the level of technicality adjusted to the education level of the user (school, college, university, etc.). In this way, prompts can provide users with a personalized and enhanced experience of your tools.
Chains
The modules provided by LangChain each offer different functionality but are of limited use on their own. The appropriate way to create an LLM application is to use different modules to process the text and then prompt the result to the model.
The Chains module helps create a sequential chain of different modules, in which the output of one module becomes the input of the next. An example chain would be:
1. User input:
The user inputs text containing the relevant keywords.
2. Text Prompt:
The user input is used as input for the text prompt module to configure a template for the model.
3. LLM:
The template is then used as input to the LLM, which generates a relevant response.
The chain creates a pipeline for pre-processing the text and configuring the LLM. LangChain provides several ways to create and interface with an LLMChain, the details of which can be found in the documentation.
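The output-feeds-next-input idea is, at its core, function composition. The sketch below is framework-agnostic: `template_step` and `llm_step` are stubs standing in for a PromptTemplate and a model call, and `run_chain` is a hypothetical helper, not LangChain's LLMChain.

```python
# A chain pipes each step's output into the next step's input.
# Both steps are stubs standing in for a PromptTemplate and an LLM.

def template_step(user_input):
    return f"Write a product name for: {user_input}"   # builds the prompt

def llm_step(prompt):
    return f"[LLM response to '{prompt}']"             # stands in for the model

def run_chain(user_input, steps):
    value = user_input
    for step in steps:             # output of one step becomes the next input
        value = step(value)
    return value

print(run_chain("colorful socks", [template_step, llm_step]))
```

Real chains add features like multiple input variables and intermediate-output tracking, but the data flow is exactly this pipeline.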
Indexes
In a database context, indexes are used to locate elements within a table. Their use remains the same for LLMs, except that they maintain text documents instead of structured data points. The Indexes module works in a few steps:
1. Document Loading:
It first requires a text document to index. LangChain provides functions to load data from various sources in structured and unstructured formats, such as CSV, docx, and images.
2. Text Split:
The loaded text is divided into small, semantically meaningful chunks, which are then merged until a predefined length is reached; each merged chunk is treated as an independent piece of text. LangChain supports multiple text splitters, which can be accessed through the relevant modules.
3. Vector Storage:
Once the chunks are created, they are indexed and stored in a VectorStore, a collection of indexed text from which documents can be retrieved. LangChain supports multiple VectorStores, such as Chroma and FAISS.
Overall, LangChain provides an abstract interface to create an indexed vector store without having to go through each step manually. This can be achieved using the `TextLoader` and `VectorstoreIndexCreator` modules.
VectorStores contain vector representations of text data and can be used to build business applications like text classifiers, information retrieval, and chatbots.
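The split-store-retrieve steps above can be sketched end to end. Everything here is a toy: `ToyVectorStore` is hypothetical, and word overlap stands in for real embedding similarity.

```python
# End-to-end index sketch: split a document into chunks, store them, and
# retrieve the chunk most similar to a query. Word overlap stands in for
# embedding similarity; ToyVectorStore is a hypothetical class.

def split_text(text, chunk_size=50):
    """Greedily merge words into chunks of roughly chunk_size characters."""
    words, chunks, current = text.split(), [], []
    for w in words:
        current.append(w)
        if len(" ".join(current)) >= chunk_size:
            chunks.append(" ".join(current))
            current = []
    if current:
        chunks.append(" ".join(current))
    return chunks

class ToyVectorStore:
    def __init__(self, chunks):
        self.chunks = chunks

    def similarity_search(self, query, k=1):
        # Rank chunks by shared words with the query; return the top k.
        q = set(query.lower().split())
        ranked = sorted(self.chunks,
                        key=lambda c: len(q & set(c.lower().split())),
                        reverse=True)
        return ranked[:k]

doc = ("LangChain loads documents from many sources. "
       "Text splitters divide documents into chunks. "
       "Vector stores index chunks for retrieval.")
store = ToyVectorStore(split_text(doc))
print(store.similarity_search("how are chunks indexed for retrieval"))
```

A real pipeline swaps the overlap score for embedding distance, but the loader-splitter-store-retrieve structure is the same.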
Memory
Most modern generative language models can remember previous conversations and use that context to answer present queries. This memory element makes AI models feel natural in conversation and helps them answer complex queries.
The LangChain memory module allows the LLM to retain state from previous interactions. It provides two types of memory capabilities: short-term and long-term. Short-term memory retains state from previous messages within the same conversation, while long-term memory retains content across different conversations.
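The short-term/long-term distinction can be sketched with a conversation store keyed by conversation id. This is a hypothetical illustration, not LangChain's memory classes: each conversation's turns are its short-term memory, while the store as a whole persists across conversations.

```python
# Memory sketch (hypothetical, not LangChain's API). Turns within one
# conversation act as short-term memory; keeping the store keyed by
# conversation id lets content persist across conversations.

class MemoryStore:
    def __init__(self):
        self.conversations = {}    # conversation id -> list of turns

    def save(self, conv_id, human, ai):
        self.conversations.setdefault(conv_id, []).append((human, ai))

    def context(self, conv_id):
        # Replay one conversation's history into the next prompt.
        return "\n".join(f"Human: {h}\nAI: {a}"
                         for h, a in self.conversations.get(conv_id, []))

store = MemoryStore()
store.save("chat-1", "My name is Sam.", "Nice to meet you, Sam!")
store.save("chat-2", "I like tea.", "Noted!")
print(store.context("chat-1"))     # only chat-1's history is replayed
```

Replaying only the matching conversation's context keeps prompts focused while still retaining every conversation for later use.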
Agents
LLMs can power far more complex applications than plain chatbots. Modern LLMs have the ability to plan and execute actions depending on the user's demand, opening up new possibilities for applications such as educational tutors and programming bots.
Users can ask the AI bot complex science or mathematical questions, and the LLM will provide the relevant answers.
However, these sorts of applications require the LLM to access relevant tools, such as a mathematics engine, to perform calculations. LangChain's Agents module helps the model assemble a chain of relevant tools to perform an action. These agents are also called action agents, since they figure out which action to take and help the LLM carry it out.
Some of LangChain's built-in agents include:
1. Zero-shot-react-description agent:
This agent uses the ReAct framework to figure out the tools required from user input.
2. react-docstore agent:
This agent requires two specified tools: one to search for relevant documents and the other to look up keywords and terms within a document.
3. self-ask-with-search agent:
This agent answers queries by using a tool that can look up factual statements, similar to a Google search.
LangChain comes with several toolkits that agents can use to perform different actions. A good example is the Python agent, which can write Python programs based on user requirements.
For example, the Python agent can use the PythonREPLTool to create a plan of action: it first writes a Python program to calculate the Fibonacci series, then passes the relevant input and returns the output to answer the user's query.
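The action-agent loop (choose a tool from the query, run it, use the result as the answer) can be sketched as follows. In a real agent the LLM itself picks the tool, e.g. via ReAct-style reasoning; a keyword router and the tool functions below are hypothetical stand-ins.

```python
# Action-agent sketch: choose a tool based on the query, run it, and use
# the result as the answer. A real agent lets the LLM pick the tool
# (e.g. via ReAct); a keyword router stands in for that here.

def calculator(expression):
    # Tool 1: evaluate simple arithmetic like "2+3*4".
    return str(eval(expression, {"__builtins__": {}}))

def lookup(term):
    # Tool 2: stand-in for a document/web search tool.
    facts = {"fibonacci": "A sequence where each number is the sum "
                          "of the two numbers before it."}
    return facts.get(term.lower(), "No result found.")

TOOLS = {"calculate": calculator, "lookup": lookup}

def run_agent(query):
    # Stand-in for LLM reasoning: route on a leading keyword.
    action, _, argument = query.partition(" ")
    tool = TOOLS.get(action, lookup)
    return tool(argument)

print(run_agent("calculate 2+3*4"))      # → 14
print(run_agent("lookup fibonacci"))
```

The agent's value is exactly this delegation: the model plans, while tools like a calculator or search engine do the parts LLMs are weak at.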
Agents can be used to construct all sorts of business applications, like writing assistants and programming assistants.
Build Innovative LLM Applications With Unleashing.ai
Businesses looking to harness the transformative power of LLMs can use the LangChain framework. But they may find it challenging to build business applications due to the complexities of this technology. To overcome such challenges and maximize the potential of LLM-powered business use cases, turning to a trusted AI advisor like Unleashing.ai can prove beneficial.
With our expertise in developing LLM applications, we have boosted revenue for multiple businesses by double digits. Whether it's GPT development or implementing cutting-edge NLP applications, our experts provide the solution and support required to unlock new business opportunities.
Reach out to us to learn how we can help your business embark on a transformative journey in the realm of LLM-powered innovation.