The goal of this project is to develop a domain-specific application that combines the strengths of a Large Language Model (LLM) with the efficiency of a vector database for data storage and retrieval. The application is written in Python, uses Retrieval-Augmented Generation (RAG) to answer queries, and uses Streamlit for the front end.
- Frontend: Streamlit for building the user interface.
- Vector Database: Pinecone for efficient data storage and retrieval.
- LLM: OpenAI model for natural language processing and query handling.
- Backend: LangChain framework utilizing the RAG method.
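To make the RAG flow behind this stack concrete, here is a minimal, dependency-free sketch of the two steps it relies on: retrieving the most relevant text chunks for a question, then building a prompt that contains them. It is not the project's implementation. The bag-of-words `embed` function is a toy stand-in for a real embedding model, the in-memory list stands in for Pinecone, and the names `embed`, `cosine`, `retrieve`, and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (assumption: stands in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query (stand-in for a vector database query)."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:k]


def build_prompt(query, context_chunks):
    """Assemble the text that would be sent to the LLM: retrieved context plus the question."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    chunks = [
        "Pinecone stores vectors for similarity search.",
        "Streamlit renders the chat interface.",
    ]
    question = "How does Pinecone search vectors?"
    print(build_prompt(question, retrieve(question, chunks, k=1)))
```

In the real application, the embedding, storage, and generation steps would presumably be handled by the OpenAI model, Pinecone, and LangChain respectively; only the overall shape of the flow carries over from this sketch.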
- src/: Contains the Python-based chatbot script and the Streamlit main script.
- src/materials/: Contains the data that the model uses to answer questions.
- report/: Stores report files.
- video/: Contains the video presentation. You can also watch the video on YouTube.
- .env: Contains API keys.
- Python 3.7+
- langchain
- pinecone-client
- python-dotenv
- streamlit
- pypdf
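As a quick sanity check of the requirements above, the following sketch reports which of the listed packages cannot be imported in the current environment. The helper name `missing_modules` is hypothetical. Note that the import names differ from the package names in two cases: pinecone-client is imported as `pinecone`, and python-dotenv as `dotenv`.

```python
import importlib.util
import sys

# Import names for the packages listed above.
REQUIRED_MODULES = ["langchain", "pinecone", "dotenv", "streamlit", "pypdf"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    if sys.version_info < (3, 7):
        print("Python 3.7 or newer is required.")
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```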
- Clone the repository:
git clone https://github.com/Faridghr/Simple-RAG-Chatbot.git
- Navigate to the project directory:
cd Simple-RAG-Chatbot
- Install dependencies:
pip install -r requirements.txt
- Set up your LLM by adding your OpenAI API key to the .env file.
- Set up your Pinecone API key in the .env file.
- Navigate to the src directory:
cd src
- Run the Streamlit application:
streamlit run streamlitMain.py
- Open your web browser and navigate to the URL provided by Streamlit (usually http://localhost:8501).
- Interact with the chatbot by typing messages and receiving responses from the LLM.
- Log in to your OpenAI account and navigate to the OpenAI Platform.
- Navigate to the API section.
- Create a new API key by pressing '+ Create new secret key'.
- Choose a name that is easy to remember and press the Create secret key button.
- Copy the secret key and add your OpenAI API key to a file called .env.
- To create a Pinecone account, sign up via this link: Pinecone
- After registering with the free tier, go to the projects page and click on Create a Project.
- Fill in the Project Name, Cloud Provider, and Environment. In this case, I have used “SimpleRAGChatbot Application” as a Project Name, GCP as Cloud Provider, and Iowa (gcp-starter) as an Environment.
- After the project is created, go into the API Keys section, and make sure you have an API key available. Do not share this API key.
- After completing the account setup, add your Pinecone API key to a file called .env.
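Once both keys are in .env, it can help to confirm they are visible to the application before starting it. The sketch below checks for two environment variables. The names `OPENAI_API_KEY` and `PINECONE_API_KEY` are assumptions, since this README does not state the variable names the scripts expect, and `missing_keys` is a hypothetical helper.

```python
import os

# Assumed variable names; adjust to match what the scripts in src/ actually read.
REQUIRED_KEYS = ("OPENAI_API_KEY", "PINECONE_API_KEY")


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required variable names that are absent or blank in env."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing keys:", ", ".join(missing))
    else:
        print("All API keys are set.")
```

To check the values stored in .env rather than the shell environment, call `load_dotenv()` from the python-dotenv package before `missing_keys()`; it reads the file and copies its entries into `os.environ`.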