Dropbox rehaul (#5)
* dropbox updates

* add gunicorn

* README wip

* Update dropbox readme

* PR Comments

---------

Co-authored-by: walterbm-cohere <walter@cohere.com>
tianjing-li and walterbm-cohere authored Dec 5, 2023
1 parent 01c6181 commit 66a86c3
Showing 9 changed files with 1,104 additions and 1 deletion.
2 changes: 1 addition & 1 deletion confluence/.env-template
@@ -5,4 +5,4 @@ CONFLUENCE_SPACE_NAME=
CONFLUENCE_SEARCH_LIMIT=10

# Connector Authorization
-CONFLUENCE_CONNECTOR_API_KEY=
+CONFLUENCE_CONNECTOR_API_KEY=
6 changes: 6 additions & 0 deletions dropbox/.env-template
@@ -0,0 +1,6 @@
DROPBOX_ACCESS_TOKEN=
DROPBOX_APP_KEY=
DROPBOX_APP_SECRET=
DROPBOX_SEARCH_LIMIT=5
DROPBOX_PATH=
DROPBOX_CONNECTOR_API_KEY=
89 changes: 89 additions & 0 deletions dropbox/README.md
@@ -0,0 +1,89 @@
# Dropbox Quick Start Connector

This package is a utility for connecting Cohere to Dropbox, featuring a simple local development setup.

## Limitations

The Dropbox connector currently searches all active files within your Dropbox instance. Newly added files need a few minutes of indexing before they become searchable; Dropbox usually completes this in under 5 minutes.

Currently, file contents are not decoded: each result's `text` field holds the string representation of the raw bytes returned by Dropbox.
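
To illustrate what "not decoded" means in practice: calling `str()` on a bytes object yields its Python representation (for example `b'hello'`), not the text itself. The following is a minimal sketch of one possible decoding step, assuming the file is UTF-8 text. The helper name `decode_content` is hypothetical and is not part of this connector.

```python
def decode_content(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode raw file bytes to text, replacing undecodable bytes.

    Hypothetical helper: assumes plain-text files. Binary formats such as
    PDF would need a dedicated parser instead.
    """
    return raw.decode(encoding, errors="replace")


raw = b"charcoal sketch notes"
print(str(raw))             # the bytes repr, which is what str() produces
print(decode_content(raw))  # the decoded text
```

Using `errors="replace"` keeps the search response well-formed even when a file contains bytes that are not valid in the assumed encoding.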

## Configuration

To use the Dropbox connector, first create an app in the [Developer App Console](https://www.dropbox.com/developers/apps). Select Scoped Access, and give it the access type it needs. Note that `App folder` access will give your app access to a folder specifically created for your app, while `Full Dropbox` access will give your app access to all files and folders currently in your Dropbox instance.

Once you have created a Dropbox app, head over to the Permissions tab of your app and enable `files.metadata.read` and `files.content.read`. Then go to the Settings tab and retrieve your App key and App secret and place them into a `.env` file (see `.env-template` for reference):

```
DROPBOX_APP_KEY=xxxx
DROPBOX_APP_SECRET=xxxx
```

Optionally, you can configure the `DROPBOX_PATH` to modify the subdirectory to search in, or the `DROPBOX_SEARCH_LIMIT` to affect the max number of results returned.
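
The connector loads these settings with Flask's `config.from_prefixed_env`, using the upper-cased name of the current directory (here `DROPBOX`) as the prefix. The sketch below is a simplified, standard-library approximation of that behavior, shown only to make the mapping from variable names to config keys concrete. It ignores Flask's handling of nested `__` keys, and `load_prefixed` is a hypothetical name.

```python
import json


def load_prefixed(environ: dict[str, str], prefix: str) -> dict[str, object]:
    """Collect variables named PREFIX_KEY into {KEY: value}."""
    marker = f"{prefix}_"
    config: dict[str, object] = {}
    for name, value in environ.items():
        if not name.startswith(marker):
            continue
        try:
            parsed: object = json.loads(value)  # "5" becomes the integer 5
        except json.JSONDecodeError:
            parsed = value  # anything that is not valid JSON stays a string
        config[name[len(marker):]] = parsed
    return config


env = {"DROPBOX_SEARCH_LIMIT": "5", "DROPBOX_PATH": "/reports", "HOME": "/root"}
print(load_prefixed(env, "DROPBOX"))
```

This is why the code can read `SEARCH_LIMIT` and `PATH` from the app config without the `DROPBOX_` prefix, and why the search limit arrives as a number rather than a string.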

## Authentication

#### Testing

To test the connection, you can generate a temporary access token from your app's Settings page. Use this for the `DROPBOX_ACCESS_TOKEN` environment variable.

#### `DROPBOX_CONNECTOR_API_KEY`

The `DROPBOX_CONNECTOR_API_KEY` should contain an API key for the connector. This value must be present in the `Authorization` header for all requests to the connector.
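
A minimal sketch of a client attaching the key follows. It assumes the connector accepts the key as a `Bearer` credential (the OpenAPI spec that defines the scheme is not shown here) and uses a hypothetical helper, `build_search_request`. The code only constructs the request; it does not send it.

```python
import json
import urllib.request


def build_search_request(base_url: str, query: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST /search request carrying the API key."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/search",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the key is sent as a Bearer credential.
            "Authorization": f"Bearer {api_key}",
        },
    )


request = build_search_request("http://localhost:5000", "charcoal", "my-connector-key")
print(request.full_url, request.get_header("Authorization"))
```

Passing the resulting object to `urllib.request.urlopen` would send it to a running connector.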

#### OAuth

When using OAuth for authentication, the connector does not require any additional environment variables. Instead, the OAuth flow should occur outside of the Connector and Cohere's API will forward the user's access token to this connector through the `Authorization` header.

With OAuth the connector will be able to search any Dropbox folders and files that the user has access to.
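
The token handling can be sketched in standalone form as follows. This mirrors the logic in `provider/app.py` and `provider/client.py` from this commit, but the function names `parse_bearer` and `choose_token` are illustrative and do not exist in the connector.

```python
from typing import Optional

BEARER_PREFIX = "Bearer "


def parse_bearer(authorization_header: str) -> Optional[str]:
    """Return the token from a 'Bearer <token>' header value, if present."""
    if authorization_header.startswith(BEARER_PREFIX):
        return authorization_header[len(BEARER_PREFIX):]
    return None


def choose_token(env_token: str, oauth_token: Optional[str]) -> str:
    """Prefer a configured DROPBOX_ACCESS_TOKEN, else the forwarded OAuth token."""
    if env_token:
        return env_token
    if oauth_token is not None:
        return oauth_token
    raise ValueError("No access token or OAuth credentials provided.")


forwarded = parse_bearer("Bearer user-oauth-token")
print(choose_token("", forwarded))
```

Note the precedence: if `DROPBOX_ACCESS_TOKEN` is set, it is used even when an OAuth token is forwarded, so leave it empty when relying on OAuth.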

To configure OAuth, follow the same steps in the Configuration section to create a Dropbox App. You will also need to register a redirect URI on that app to `https://api.cohere.com/v1/connectors/oauth/token`.

You can then register the connector with Cohere's API using the following configuration.

Note: Your App key and App secret values correspond to `client_id` and `client_secret` respectively.

```bash
curl -X POST \
  'https://api.cohere.ai/v1/connectors' \
  --header 'Accept: */*' \
  --header 'Authorization: Bearer {COHERE-API-KEY}' \
  --header 'Content-Type: application/json' \
  --data-raw '{
    "name": "Dropbox with OAuth",
    "url": "{YOUR_CONNECTOR-URL}",
    "oauth": {
      "client_id": "{DROPBOX-OAUTH-CLIENT-ID}",
      "client_secret": "{DROPBOX-OAUTH-CLIENT-SECRET}",
      "authorize_url": "https://www.dropbox.com/oauth2/authorize",
      "token_url": "https://www.dropbox.com/oauth2/token"
    }
  }'
```
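
For scripted setups, the same registration payload can be assembled in Python. This is a sketch only: `build_connector_payload` is a hypothetical helper, the field names are copied from the curl example above, and the values passed in are placeholders.

```python
import json


def build_connector_payload(
    name: str, connector_url: str, app_key: str, app_secret: str
) -> dict:
    """Assemble the JSON body for registering an OAuth-enabled connector."""
    return {
        "name": name,
        "url": connector_url,
        "oauth": {
            "client_id": app_key,         # Dropbox App key
            "client_secret": app_secret,  # Dropbox App secret
            "authorize_url": "https://www.dropbox.com/oauth2/authorize",
            "token_url": "https://www.dropbox.com/oauth2/token",
        },
    }


payload = build_connector_payload(
    "Dropbox with OAuth",
    "https://connector.example.com/search",  # placeholder URL
    "app-key",                               # placeholder credentials
    "app-secret",
)
print(json.dumps(payload, indent=2))
```

Building the body as a dictionary avoids hand-escaping JSON inside a shell string and keeps the App secret out of shell history if it is read from the environment instead.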

## Development

Create a virtual environment and install dependencies with poetry. We recommend using in-project virtual environments:

```bash
poetry config virtualenvs.in-project true
poetry install --no-root
```

To run the Flask server in development mode, please run:

```bash
poetry run flask --app provider --debug run
```

The Flask API will be bound to `localhost:5000`. You can then send a test search request:

```bash
curl --request POST \
  --url http://localhost:5000/search \
  --header 'Content-Type: application/json' \
  --data '{
    "query": "charcoal"
  }'
```

Alternatively, load up the Swagger UI and try out the API from a browser: http://localhost:5000/ui/
830 changes: 830 additions & 0 deletions dropbox/poetry.lock

Large diffs are not rendered by default.

34 changes: 34 additions & 0 deletions dropbox/provider/__init__.py
@@ -0,0 +1,34 @@
import logging
import os

import connexion  # type: ignore
from dotenv import load_dotenv


load_dotenv()


API_VERSION = "api.yaml"


class UpstreamProviderError(Exception):
    def __init__(self, message) -> None:
        self.message = message

    def __str__(self) -> str:
        return self.message


def create_app() -> connexion.FlaskApp:
    app = connexion.FlaskApp(__name__, specification_dir="../../.openapi")
    app.add_api(
        API_VERSION, resolver=connexion.resolver.RelativeResolver("provider.app")
    )
    logging.basicConfig(level=logging.INFO)
    flask_app = app.app
    config_prefix = os.path.split(os.getcwd())[
        1
    ].upper()  # Current directory name, upper-cased
    flask_app.config.from_prefixed_env(config_prefix)
    flask_app.config["APP_ID"] = config_prefix
    return flask_app
35 changes: 35 additions & 0 deletions dropbox/provider/app.py
@@ -0,0 +1,35 @@
import logging

from connexion.exceptions import Unauthorized
from flask import abort, current_app as app, request

from . import UpstreamProviderError, provider

logger = logging.getLogger(__name__)

AUTHORIZATION_HEADER = "Authorization"
BEARER_PREFIX = "Bearer "


def get_oauth_token() -> str | None:
    authorization_header = request.headers.get(AUTHORIZATION_HEADER, "")
    if authorization_header.startswith(BEARER_PREFIX):
        return authorization_header.removeprefix(BEARER_PREFIX)
    return None


def search(body):
    try:
        data = provider.search(body["query"], get_oauth_token())
    except UpstreamProviderError as error:
        logger.error(f"Upstream search error: {error.message}")
        abort(502, error.message)
    return {"results": data}, 200, {"X-Connector-Id": app.config.get("APP_ID")}


def apikey_auth(token):
    api_key = app.config.get("CONNECTOR_API_KEY", "")
    if api_key != "" and token != api_key:
        raise Unauthorized()
    # successfully authenticated
    return {}
57 changes: 57 additions & 0 deletions dropbox/provider/client.py
@@ -0,0 +1,57 @@
from dropbox import Dropbox
from dropbox.exceptions import AuthError
from dropbox.files import FileStatus, SearchOptions  # type: ignore
from flask import current_app as app

from . import UpstreamProviderError


class DropboxClient:
    def __init__(self, token, search_limit, path):
        self.search_limit = search_limit
        self.path = path
        self.client = Dropbox(token)

        # Test connection
        try:
            self.client.users_get_current_account()
        except AuthError:
            raise UpstreamProviderError(
                "ERROR: Invalid access token; try re-generating an "
                "access token from the app console on the web."
            )

    def search(self, query):
        results = self.client.files_search_v2(
            query,
            SearchOptions(
                file_status=FileStatus.active,
                filename_only=False,
                max_results=self.search_limit,
                path=self.path,
            ),
            include_highlights=False,
        )

        return results

    def download_file(self, path):
        metadata, file = self.client.files_download(path)

        return metadata, file


def get_client(oauth_token=None):
    search_limit = app.config.get("SEARCH_LIMIT", 5)
    path = app.config.get("PATH", "")
    env_token = app.config.get("ACCESS_TOKEN", "")
    token = None

    # A configured access token takes precedence over a forwarded OAuth token
    if env_token != "":
        token = env_token
    elif oauth_token is not None:
        token = oauth_token
    else:
        raise AssertionError("No access token or OAuth credentials provided.")

    return DropboxClient(token, search_limit, path)
28 changes: 28 additions & 0 deletions dropbox/provider/provider.py
@@ -0,0 +1,28 @@
from typing import Any

from .client import get_client


def search(query: str, oauth_token: str | None = None) -> list[dict[str, Any]]:
    dbx_client = get_client(oauth_token)
    dbx_results = dbx_client.search(query)

    results = []
    for dbx_result in dbx_results.matches:
        # Skip matches that carry no metadata
        if not (metadata := dbx_result.metadata.get_metadata()):
            continue

        # Skip entries that cannot be downloaded (e.g. folders)
        if not getattr(metadata, "is_downloadable", False):
            continue

        metadata, f = dbx_client.download_file(metadata.path_display)

        result = {
            "type": "file",
            "title": metadata.name,
            "text": str(f.content),
        }
        # TODO: decode file contents
        results.append(result)

    return results
24 changes: 24 additions & 0 deletions dropbox/pyproject.toml
@@ -0,0 +1,24 @@
[tool.poetry]
name = "dropbox-connector"
version = "0.1.0"
description = "Search provider for connecting Cohere with Dropbox."
authors = ["Scott Mountenay <scott@lightsonsoftware.com>"]
readme = "README.md"

[tool.poetry.dependencies]
python = "^3.11"
flask = "2.2.5"
connexion = {extras = ["swagger-ui"], version = "^2.14.2"}
python-dotenv = "^1.0.0"
dropbox = "^11.36.2"
requests = "^2.31.0"
gunicorn = "^21.2.0"


[tool.poetry.group.development.dependencies]
black = "^23.7.0"
mypy = "^1.4.1"

[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"
