Dropbox rehaul #5

Merged
merged 14 commits into from
Dec 5, 2023
3 changes: 3 additions & 0 deletions .dockerignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
/**/dev.py
/**/dev/
/**/docker-compose.yml
28 changes: 28 additions & 0 deletions .github/ISSUE_TEMPLATE/bug.yaml
@@ -0,0 +1,28 @@
name: Bug in existing connector
description: Raise a bug related to an existing connector
labels:
  - bug
body:
  - type: textarea
    attributes:
      label: Which connector is affected?
      description: Name of connector.
    validations:
      required: true

  - type: textarea
    attributes:
      label: What is the issue?
      description: |
        - Give as much detail as you can to help us understand the bug.
        - Include any error messages or error codes.
        - Try to add reproduction steps if possible.
    validations:
      required: true

  - type: textarea
    attributes:
      label: Additional information
      description: Any other context, images or comments to add.
    validations:
      required: false
8 changes: 8 additions & 0 deletions .github/ISSUE_TEMPLATE/config.yaml
@@ -0,0 +1,8 @@
blank_issues_enabled: false
contact_links:
  - name: Documentation
    url: https://docs.cohere.com/docs
    about: For detailed information about Cohere's API, visit the documentation page.
  - name: General Community Help
    url: https://discord.com/invite/co-mmunity
    about: For any general question not related to connectors, please visit Cohere's community Discord.
27 changes: 27 additions & 0 deletions .github/ISSUE_TEMPLATE/improvement.yaml
@@ -0,0 +1,27 @@
name: Improve existing connector
description: Make a suggestion to improve the code in an existing connector.
labels:
  - improvement
body:
  - type: textarea
    attributes:
      label: Which connector is affected?
      description: Name of connector.
    validations:
      required: true

  - type: textarea
    attributes:
      label: What would you like to see improved?
      description: |
        - Give as much detail as you can to help us understand the change.
        - Why should these changes be made, and what is the expected change in behavior versus the current one?
    validations:
      required: true

  - type: textarea
    attributes:
      label: Additional information
      description: Any other context, images or comments to add.
    validations:
      required: false
26 changes: 26 additions & 0 deletions .github/ISSUE_TEMPLATE/suggest-new-connector.yaml
@@ -0,0 +1,26 @@
name: Suggest new connector
description: Suggest a new connector to add
labels:
  - addition
body:
  - type: textarea
    attributes:
      label: What platform/tool would you like to add as a connector?
      description: Name of the platform or tool to add.
    validations:
      required: true

  - type: textarea
    attributes:
      label: Why do you want to add this connector?
      description: Give us details as to why you think this connector would be valuable for your use-case or for the community.
    validations:
      required: true

  - type: textarea
    attributes:
      label: Additional information
      description: |
        - Any other information is much appreciated, such as API documentation, Python SDKs, etc.
    validations:
      required: false
11 changes: 11 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
@@ -0,0 +1,11 @@
<!--
Thanks for contributing to Cohere's Search Connectors. Please fill out as much information in this template as you can so we can review the changes made. If you are a new contributor, please make sure you've read our CONTRIBUTING file located in the root directory.
-->

### What's being changed:

<!-- Please link to an existing issue here, if one exists. -->

### How did you test this change (include any code snippets, API requests, screenshots, or gifs):

<!-- Please include details of your testing here. Such as screenshots of your terminal, copy/paste of a request response, etc. -->
10 changes: 10 additions & 0 deletions .github/workflows/format.yml
@@ -0,0 +1,10 @@
name: Format

on: [push, pull_request]

jobs:
  format:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: psf/black@stable
21 changes: 21 additions & 0 deletions .gitignore
@@ -0,0 +1,21 @@
.venv/
.vscode/
**/dev/data/
*.pyc
.trunk/
.DS_Store
milvus/volumes/

# ENV
.env
credentials.json
token.json
.zuliprc

# Python Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# Deployment artifacts
fly.toml
56 changes: 56 additions & 0 deletions .openapi/api.yaml
@@ -0,0 +1,56 @@
openapi: 3.0.3
info:
  title: Search Connector API
  version: 0.0.1
paths:
  /search:
    post:
      description: >-
        <p>Searches the connected data source for documents related to the query and returns a set of key-value pairs representing the found documents.</p>
      operationId: search
      summary: Perform a search
      security:
        - api_key: []
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              required:
                - query
              properties:
                query:
                  description: >-
                    A plain-text query string to be used to search for relevant documents.
                  type: string
                  minLength: 1
            example:
              query: embeddings
      responses:
        "200":
          description: Successful response
          content:
            application/json:
              schema:
                type: object
                properties:
                  results:
                    type: array
                    items:
                      type: object
                      additionalProperties:
                        type: string
        "400":
          description: Bad request
        "401":
          description: Unauthorized
        default:
          description: Error response

components:
  securitySchemes:
    api_key:
      type: http
      scheme: bearer
      x-bearerInfoFunc: provider.app.apikey_auth
31 changes: 31 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,31 @@
# Cohere Quick Start Connectors contributing guide

Thank you for your interest in contributing to Cohere's connectors. This guide explains the contribution workflow, from opening an issue and creating a PR to reviewing and merging it.

# Getting Started

Remember that there are many ways to contribute other than writing code: writing tutorials or blog posts, improving [the documentation](https://docs.cohere.com), and submitting bug reports.

## Table of Contents

- [Assumptions](#assumptions)
- [How to Contribute](#how-to-contribute)
- [Development Workflow](#development-workflow)
- [Git Guidelines](#git-guidelines)
- [Release Process (for internal team only)](#release-process-for-internal-team-only)

## Assumptions

1. **You're familiar with [GitHub](https://github.com) and the [Pull Requests (PR)](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/about-pull-requests) workflow.**
2. **You've read Cohere's [documentation](https://docs.cohere.com).**
3. **You know about the [Cohere community on Discord](https://discord.com/invite/co-mmunity).**

## How to Contribute

1. Ensure your change has an issue! Find an existing issue or open a new issue.
   - This is where you can get a feel for whether the change will be accepted.
2. Once approved, fork this repository to your own GitHub account.
3. [Create a new Git branch](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-and-deleting-branches-within-your-repository).
4. Make your changes on your branch.
5. [Submit the branch as a Pull Request](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request-from-a-fork) pointing to the `main` branch of this repository. A maintainer should comment on and/or review your Pull Request within a few days, although depending on the circumstances it may take longer.
48 changes: 48 additions & 0 deletions Dockerfile
@@ -0,0 +1,48 @@
# syntax = docker/dockerfile:1.4.0
FROM python:3.11-slim-bookworm as operating-system-deps
WORKDIR /app

# Keeps Python from generating .pyc files in the container
# Turns off buffering for easier container logging
# Force UTF8 encoding for funky character handling
# Needed so imports function properly
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV PYTHONIOENCODING=utf-8
ENV PYTHONPATH=/app
# Keep the venv name and location predictable
ENV POETRY_VIRTUALENVS_IN_PROJECT=true
# Control the number of workers Gunicorn uses
ENV WEB_CONCURRENCY=1

# "Activate" the venv manually for the context of the container
ENV VIRTUAL_ENV=/app/.venv
ENV PATH="$VIRTUAL_ENV/bin:$PATH"

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    build-essential=* \
    libpq-dev=* && \
    rm -rf /var/lib/apt/lists/*

FROM operating-system-deps
ARG app
ARG port=8000

COPY ./${app}/pyproject.toml ./${app}/poetry.lock /app/
RUN pip install --no-cache-dir poetry==1.5.1 && \
    poetry install

COPY ./.openapi /app/.openapi
COPY ${app} ./${app}

# use a RUN to create an entrypoint because the ENTRYPOINT directive does not
# support variable substitution
RUN <<EOF cat >> /app/entrypoint.sh && chmod +x /app/entrypoint.sh
#!/usr/bin/env bash
gunicorn -b 0.0.0.0:${port} -t 240 --preload "provider:create_app()"
EOF

WORKDIR /app/${app}
EXPOSE ${port}
ENTRYPOINT ["/app/entrypoint.sh"]
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2023 Cohere

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
73 changes: 72 additions & 1 deletion README.md
@@ -1 +1,72 @@
# Quick Start Connectors
**Quick Start Connectors**

[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
![](https://img.shields.io/badge/PRs-Welcome-red)

---

# Table of Contents

- [Table of Contents](#table-of-contents)
- [Overview](#overview)
- [Features](#features)
- [Getting Started](#getting-started)
- [Contributing](#contributing)

# Overview

Cohere's Build-Your-Own-Connector framework allows you to integrate Cohere's Command LLM, via the [co.chat API endpoint](https://docs.cohere.com/reference/chat), with any datastore or software that holds text information and exposes a corresponding search endpoint in its API. This allows the Command model to generate responses to user queries that are grounded in proprietary information.

Some examples of the use-cases you can enable with this framework:

* Generic question answering over broad internal company docs
* Knowledge work with a specific subset of internal knowledge
* Internal comms summary and search
* Research using external providers of information, allowing researchers and writers to explore information from third parties

This open-source repository contains code that will allow you to get started integrating with some of the most popular datastores. There is also an [empty template connector](https://github.com/cohere-ai/quick-start-connectors/tree/main/template) which you can expand to use any datasource. Note that different datastores may have different requirements or limitations that need to be addressed in order to get good quality responses. While some of the quickstart code has been enhanced to address some of these limitations, other connectors only provide the basics of the integration, and you will need to develop them further to fit your specific use-case and the underlying datastore limitations.

Please read more about our connectors framework here: LINK TO DOCS

# Getting Started

This project requires Python 3.11+ and [Poetry](https://python-poetry.org/docs/) at a minimum. Each connector uses poetry to create a virtual environment specific to that connector, and to install all the required dependencies to run a local server.

For production releases, you can optionally build and deploy using [Docker](https://www.docker.com/get-started/). When building a Docker image, you can use the `Dockerfile` in the root project directory and specify the `app` build argument.

# Development

For development, refer to a connector's README. Generally, there is an `.env` file that needs to be created in that subdirectory, based off of a `.env-template`. The environment variables here most commonly set authorization values such as API keys, credentials, and also modify the way the search for that connector behaves.

After configuring the `.env`, you will be able to use `poetry`'s CLI to start a local server.

# Integrating With Cohere

All of the connectors in this repository have been tailored to integrate with Cohere's [co.chat](https://docs.cohere.com/reference/chat) API to make creating a grounded chatbot quick and easy.

Cohere's API requires that connectors return documents as an array of JSON objects. Each document should be an object with string keys and string values containing all the relevant information about the document (e.g. `title`, `url`, etc.). For best results the largest text content should be stored in the `text` key.

For example, a connector that returns documents about company expensing policy might return the following:

```json
[
  {
    "title": "Company Travel Policy",
    "text": "Flights, Hotels and Meals can be expensed using this new tool...",
    "url": "https://drive.google.com/file/d/id1",
    "created_at": "2023-11-25T20:09:31Z"
  },
  {
    "title": "2023 Expenses Policy",
    "text": "The list of recommended hotels are",
    "url": "https://drive.google.com/file/d/id2",
    "created_at": "2022-11-22T20:09:31Z"
  }
]
```

Cohere's [co.chat](https://docs.cohere.com/reference/chat) API will query the connector and use these documents to generate answers with direct citations.
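
The document shape described above (string keys, string values, main content under `text`) can be sketched in Python as follows. This is a minimal illustration, not code from any connector in this repository: the helper name `to_connector_document`, the `text_field` parameter, and the sample `raw_hits` record are all hypothetical, and JSON-encoding non-string values is just one possible way to keep every value a string. The `results` wrapper follows the response schema in `.openapi/api.yaml`.

```python
import json
from typing import Any


def to_connector_document(record: dict[str, Any], text_field: str = "body") -> dict[str, str]:
    """Flatten one raw search hit into a string-to-string document (hypothetical helper)."""
    document: dict[str, str] = {}
    for key, value in record.items():
        if value is None:
            continue  # drop empty fields instead of emitting the string "None"
        if key == text_field:
            key = "text"  # the largest text content is stored under "text"
        # Non-string values are JSON-encoded so that every value stays a string.
        document[str(key)] = value if isinstance(value, str) else json.dumps(value)
    return document


# Made-up data source record, for illustration only.
raw_hits = [
    {
        "title": "Company Travel Policy",
        "body": "Flights, Hotels and Meals can be expensed using this new tool...",
        "size": 2048,
        "tags": ["travel"],
        "owner": None,
    },
]

results = [to_connector_document(hit) for hit in raw_hits]
print(json.dumps({"results": results}, indent=2))
```

Keeping the conversion in one small function makes it easy to adjust per datastore, since each source names its main text field differently.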

# Contributing

Contributions are what drive an open source community, and any contributions made are greatly appreciated. To get started, check out our [contributing guide](CONTRIBUTING.md).
1 change: 1 addition & 0 deletions _template_/.env-template
@@ -0,0 +1 @@
TEMPLATE_CONNECTOR_API_KEY=