EmbedChain and Chatbots
It’s been a while since my last post as I have been busy with my projects at Mighty Jaxx, especially on the e-commerce side of things. I recently had a good chat with Kestrel Lee, a Creative Director based in China, and he brought up the idea of using LLMs to create chatbots that serve as the first line of customer support for common customer queries.
So instead of hiring an army of customer service agents, you can get by with a small number of real-life customer support agents who answer the more complicated queries, while LLM-based chatbots handle the most mundane ones. It is a bit like how no one works out complex formulas with pen and paper anymore when a calculator can do it for them.
I had been looking at LangChain as it is one of the most popular frameworks for creating LLM-based applications, including chatbots. Then I came across a message in Telegram where someone mentioned EmbedChain, a new framework that lets you create LLM-powered bots easily in Python or JS. So I decided to give it a try and see how EmbedChain stacks up against LangChain for my LLM chatbot.
The Git repo for EmbedChain can be found at https://github.com/embedchain/embedchain and the README.md file says “Embedchain is a framework to easily create LLM powered bots over any dataset. If you want a javascript version, check out embedchain-js”. That sounded pretty good for my use case.
So, I decided to give EmbedChain a try and add in a PDF of my resume for my simple chatbot. To create a simple UI, I chose Streamlit as the UI package for my chatbot. The initialization of the code took just a few simple steps.
First, I added a few simple imports
import streamlit as st #use streamlit for the UI
import os
from embedchain import App #need this for EmbedChain
Next, I added in my OpenAI API key as OpenAI's ChatGPT is the LLM that I will use
# Create a bot instance
os.environ["OPENAI_API_KEY"] = "" #Add in your OpenAI API key here
EmbedChain uses the OpenAI LLM by default, but it supports other LLMs such as Llama 2, Cohere, etc. You can check https://docs.embedchain.ai/advanced/app_types for the list of supported LLMs.
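For example, if you would rather not depend on the OpenAI API at all, the app types page above describes an OpenSourceApp that runs on a local open source model instead. A rough sketch, assuming that app type is available in your installed version of EmbedChain (it may need extra open source dependencies)
# Rough sketch, assuming the OpenSourceApp type from the docs linked above,
# which uses a local open source LLM instead of OpenAI.
from embedchain import OpenSourceApp

local_bot = OpenSourceApp()  # no OPENAI_API_KEY needed for this app type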
Finally, I created an instance of the app
gibson_bot = App()
Next, I will need to load in the data sources. EmbedChain supports a wide variety of data sources such as PDF files, YouTube videos, web pages, etc. A full list of supported data sources can be found at https://docs.embedchain.ai/data-sources/csv.
For this example, I will add in a PDF downloaded from my LinkedIn profile.
gibson_bot.add("datasource/Profile.pdf", data_type="pdf_file");
Notice that I do not need to write any code to do the chunking and embedding into a vector database to store the data. This is because when I add a data source, EmbedChain does the chunking and embedding into a default vector database for me automatically.
EmbedChain uses the open source vector database ChromaDB (https://www.trychroma.com/) by default.
After a successful addition of a data source, EmbedChain will automatically create a /db folder where it stores the embedded data source.
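To give a feel for what EmbedChain is doing behind the scenes, here is a purely illustrative sketch of the same store-and-retrieve flow written directly against ChromaDB. This is not EmbedChain's actual internals, and the collection name and text chunks are made up
# Illustrative sketch only, not EmbedChain's actual internals: store a few
# made-up text chunks in a ChromaDB collection and run a similarity query.
import chromadb

client = chromadb.Client()
collection = client.create_collection("resume_chunks")

# In EmbedChain's flow, chunks like these would come from splitting the PDF text
collection.add(
    documents=["Worked on e-commerce projects at Mighty Jaxx.", "Skills: Python, LLMs."],
    ids=["chunk-1", "chunk-2"],
)

results = collection.query(query_texts=["What projects were worked on?"], n_results=1)
print(results["documents"])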
Next, I will put in the Streamlit code to take in a question and display the answer
question = st.text_input("Ask a question about my resume")  # text box for the user's question
if st.button('Submit query'):
    wait_text = st.title("Please wait ...")
    print("Submitting query '" + question + "'")
    answer = gibson_bot.query(question)
    wait_text.empty()
    print(answer)
    st.title(answer)
And that is it. My chatbot is complete. To run it, I will type
python3 -m streamlit run ui.py
in my Terminal to start my chatbot.
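If you just want to poke at the bot without the Streamlit UI, you can also call query() directly from a short script, reusing the same gibson_bot set up above. The question string below is just an example
# Quick test without the Streamlit UI; the question is just an example
answer = gibson_bot.query("What is Gibson's current job?")
print(answer)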
Here is a demo of the chatbot in action where it is able to pull my personal details from my resume and then respond to me in a Q&A format
My code is available at https://github.com/gibtang/embedchain-demo as a reference if you need it.
The way I see it, AI and LLMs have the potential to augment our work and deliver more value to customers, but they will not replace us entirely as AI/LLMs are tools to be used. And a simple chatbot is just one of the many tools that we can use to delight customers.