bukosabino · bukosabino · Mar 22, 2024 · Mar 21, 2024 · Mar 22, 2024
diff --git a/config/qdrant_local.yaml b/config/qdrant_local.yaml
@@ -0,0 +1,3 @@
+service:
+  api_key: 823e071f67c198cc05c73f8bd4580865e6a8819a1f3fe57d2cd49b5c892a5233
+  read_only_api_key: d1aab4f05ae4fd7f4e4b8d9e5924469494ebb7897aed46cf2b0df0915410e0b0
diff --git a/doc/deployment_guide.md b/doc/deployment_guide.md
@@ -1,52 +1,35 @@
-# How to deploy the service in a remote/cloud computer
+# How to deploy the service locally
 
-## 1. Prepare your vector database
+## 1. Prepare your local vector database
 
-At this moment we are working with pinecone as vector database, so, please create an account and an index. Check [the pinecone documentation](https://docs.pinecone.io/docs/overview)
+We currently use Qdrant as the vector database.
 
-Once you have your pinecone index, please update the `config/config.yaml` :
+Official doc: https://qdrant.tech/documentation/quick-start/
 
-* vector_store: use the name of the pinecone index that you choose.
-
-Export environment variables:
+### Download the latest Qdrant image from Docker Hub:
 
 ```
-export APP_PATH="."
-export SENDGRID_API_KEY=<your_sendgrid_api_key>
-export OPENAI_API_KEY=<your_open_api_key>
-export TOKENIZERS_PARALLELISM=false
-export TAVILY_API_KEY=<your_tavily_api_key>
-export QDRANT_API_KEY="<your_qdrant_api_key>"
-export QDRANT_API_URL="<your_qdrant_api_url>"
+docker pull qdrant/qdrant
 ```
 
-Load BOE documents into your vector database (depending on the selected data, may take a few minutes)
+### Run the service:
 
 ```
-python -m src.etls.boe.load dates collection_name 2024/01/01 2024/01/31
+docker run -p 6333:6333 -p 6334:6334 -v $(pwd)/justicio/config/qdrant_local.yaml:/qdrant/config/production.yaml -v $(pwd)/qdrant_storage:/qdrant/storage:z qdrant/qdrant
 ```
 
-If you want to update the vector database on a daily basis (BOE publishes new documents every day), run this file as a scheduled job (e.g. with CRON).
+* REST API: localhost:6333
+* Web UI: localhost:6333/dashboard
 
-```
-python -m src.etls.boe.load today collection_name
-```
+## 2. Prepare Justicio
 
-If you want to update the vector database on a daily basis (BOE publishes new documents every day), run this file with schedule:
+### Clone the code:
 
 ```
-python -m src.etls.boe.schedule
+git clone git@github.com:bukosabino/justicio.git
 ```
 
-## 2. Deploy the service
-
-Clone the code:
-
-```
-git clone git@github.com:bukosabino/ia-boe.git
-```
-
-Install the requirements:
+### Install the requirements:
 
 ```
 sudo apt install python3-virtualenv
@@ -55,33 +38,36 @@ source venv3.10/bin/activate
 pip install -r requirements.txt
 ```
 
-Export environment variables:
+### Export environment variables:
+
+Note: You need an API key for OpenAI and another for SendGrid.
 
 ```
 export APP_PATH="."
 export SENDGRID_API_KEY=<your_sendgrid_api_key>
 export OPENAI_API_KEY=<your_open_api_key>
 export TOKENIZERS_PARALLELISM=false
-export TAVILY_API_KEY=<your_tavily_api_key>
-export QDRANT_API_KEY="<your_qdrant_api_key>"
-export QDRANT_API_URL="<your_qdrant_api_url>"
+export TAVILY_API_KEY=""
+export QDRANT_API_KEY="823e071f67c198cc05c73f8bd4580865e6a8819a1f3fe57d2cd49b5c892a5233"
+export QDRANT_API_URL="http://localhost:6333"
 ```
 
-Run the service
+### Add some vectors to the vector database
+
+Load BOE documents into your vector database (depending on the date range selected, this may take a few minutes).
 
 ```
-nohup uvicorn src.service.main:APP --host=0.0.0.0 --port=5001 --workers=2 --timeout-keep-alive=125 --log-level=info > logs/output.out 2>&1 &
+python -m src.etls.boe.load dates 2024/01/01 2024/01/07
 ```
 
-In the browser
+## 3. Run Justicio locally
 
 ```
-http://<your.ip>:5001/docs
+uvicorn src.service.main:APP --host=0.0.0.0 --port=5001 --workers=1 --timeout-keep-alive=125 --log-level=info
 ```
 
-Monitor the logs of the system
+In the browser
 
 ```
-tail -n 20 output.out
-tail -f output.out
-```
+http://<your.ip>:5001/docs
+```
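
The guide above exports several environment variables before loading data and starting the service. A minimal pre-flight check could look like the sketch below. The helper name `missing_env_vars` and the choice of which variables count as required are assumptions made for illustration; they are not part of this change. `TAVILY_API_KEY` and `TOKENIZERS_PARALLELISM` are left out because the guide sets the former to an empty string and the latter is a tuning flag.

```python
import os

# Names copied from the guide's export block. Treating exactly these as
# required is an assumption of this sketch, not something the PR states.
REQUIRED_VARS = (
    "APP_PATH",
    "SENDGRID_API_KEY",
    "OPENAI_API_KEY",
    "QDRANT_API_KEY",
    "QDRANT_API_URL",
)


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    `env` is any mapping of names to values; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` before running `python -m src.etls.boe.load ...` or `uvicorn` would return an empty list when everything is set, and the names of the missing variables otherwise.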
diff --git a/requirements.txt b/requirements.txt
@@ -16,7 +16,7 @@ schedule==1.2.1
 langchain==0.0.305
 # langchainplus-sdk==0.0.20
 langsmith==0.0.41
-qdrant-client==1.5.4
+qdrant-client==1.8.0
 supabase==1.0.2
 pinecone-client==2.2.2
 sentence_transformers==2.2.2

diff --git a/src/initialize.py b/src/initialize.py
@@ -156,10 +156,10 @@ def _init_openai_client():
     return client
 
 
-def _exists_collection(client, collection_name):
+def _exists_collection(qdrant_client, collection_name):
     logger = lg.getLogger(_exists_collection.__name__)
     try:
-        client.get_collection(collection_name=collection_name)
+        qdrant_client.get_collection(collection_name=collection_name)
         return True
     except:
         logger.warn("Collection [%s] doesn't exist", collection_name)
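
The hunk above keeps a bare `except:` and the deprecated `logger.warn` alias. One possible follow-up, shown here as a standalone sketch and not as part of this change, catches `Exception` instead (so `KeyboardInterrupt` and `SystemExit` still propagate) and calls `logger.warning`. The trailing `return False` is an assumption, since the hunk ends before the rest of the function; the function is renamed `exists_collection` here only to keep the sketch separate from the original.

```python
import logging as lg


def exists_collection(qdrant_client, collection_name):
    """Return True if `get_collection` succeeds for the name, else False.

    `qdrant_client` is any object exposing
    `get_collection(collection_name=...)`, as in the diff above.
    """
    logger = lg.getLogger(exists_collection.__name__)
    try:
        qdrant_client.get_collection(collection_name=collection_name)
        return True
    except Exception:
        # Narrower than a bare `except:`; interpreter-exit signals pass through.
        logger.warning("Collection [%s] doesn't exist", collection_name)
        # Assumed tail: the original hunk is cut off before this point.
        return False
```

Note that this still treats any client error (for example a refused connection or a rejected API key) as "collection missing", which matches the behavior visible in the diff.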