feat: better docs and DX improvements (#6)
chapati23 authored Jul 30, 2024
1 parent 243a7c7 commit 6e7a988
Showing 15 changed files with 346 additions and 157 deletions.
15 changes: 11 additions & 4 deletions .env.example
@@ -1,11 +1,18 @@
# YOU DO NOT NEED TO CREATE A .env FILE MANUALLY
# Terraform will automatically create a local .env file with the required environment variables when you run `terraform apply`.
# This file is just to illustrate the required environment variables for the function to work locally.

# Required for the function to be able to look up the Discord Webhook URL in GCP Secret Manager.
# Get it via `gcloud projects list --filter="name:governance-watchdog*" --format="value(projectId)"`
# You can check it manually via `gcloud projects list --filter="name:governance-watchdog*" --format="value(projectId)"`
GCP_PROJECT_ID=

# Required for the function to be able to look up the Discord Webhook URL and Telegram Bot Token in GCP Secret Manager.
# Get it via `gcloud secrets list`
# Required for the function to be able to look up secrets in GCP Secret Manager.
# You can check it manually via `gcloud secrets list`
DISCORD_WEBHOOK_URL_SECRET_ID=
TELEGRAM_BOT_TOKEN_SECRET_ID=

# Get it via inviting @MissRose_bot to the telegram group and then using the `/id` command (please remove the bot after you're done)
# You can check it manually either via
# a) `terraform state show "google_cloudfunctions2_function.watchdog_notifications" | grep TELEGRAM_CHAT_ID | awk -F '= ' '{print $2}' | tr -d '"'`
# OR
# b) inviting @MissRose_bot to the telegram group and then using the `/id` command (please remove the bot after you're done)
TELEGRAM_CHAT_ID=
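
The four variables above can be sanity-checked before starting the function locally. The following is a minimal sketch, assuming a plain `NAME=value` file layout; `missing_env_vars` is a hypothetical helper and not part of this commit, and the sample value is made up.

```sh
#!/bin/bash
# Hypothetical helper (not part of the repo): prints every required
# variable that is missing or empty in the given .env file.
missing_env_vars() {
  local env_file="$1" name
  for name in GCP_PROJECT_ID DISCORD_WEBHOOK_URL_SECRET_ID \
              TELEGRAM_BOT_TOKEN_SECRET_ID TELEGRAM_CHAT_ID; do
    # A variable counts as set only if `NAME=` is followed by at least one character.
    grep -Eq "^${name}=.+" "$env_file" || echo "$name"
  done
}

# Example run against a throwaway file with a single made-up value filled in.
tmp_env=$(mktemp)
printf 'GCP_PROJECT_ID=example-project\nTELEGRAM_CHAT_ID=\n' > "$tmp_env"
missing_env_vars "$tmp_env"
rm -f "$tmp_env"
```

An empty result would mean all four variables have a value; the check says nothing about whether the values are correct.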
5 changes: 4 additions & 1 deletion .gitignore
@@ -11,4 +11,7 @@ errored.tfstate
.env.yaml
dist/
function-source.zip
node_modules/
node_modules/

# Local Stuff
.project_vars_cache
223 changes: 128 additions & 95 deletions README.md
@@ -8,18 +8,20 @@ A monorepo for our governance watchdog, a system that monitors Mento Governance

![Architecture Diagram](arch-diagram.png)

- [Requirements](#requirements)
- [Local Development of Cloud Function Code](#local-development-of-cloud-function-code)
- [Requirements for local development](#requirements-for-local-development)
- [Local Infra Setup (when project is deployed already)](#local-infra-setup-when-project-is-deployed-already)
- [Running and testing the Cloud Function locally](#running-and-testing-the-cloud-function-locally)
- [Testing the Deployed Cloud Function](#testing-the-deployed-cloud-function)
- [Infra Setup (when project is deployed already)](#infra-setup-when-project-is-deployed-already)
- [First Time Infra Deployment via Terraform](#first-time-infra-deployment-via-terraform)
- [Updating the Cloud Function](#updating-the-cloud-function)
- [Infra Deployment via Terraform](#infra-deployment-via-terraform)
- [Google Cloud Permission Requirements](#google-cloud-permission-requirements)
- [Deployment from scratch](#deployment-from-scratch)
- [Migrate Terraform State to Google Cloud](#migrate-terraform-state-to-google-cloud)
- [Updating the Cloud Function](#updating-the-cloud-function)
- [Debugging Problems](#debugging-problems)
- [View Logs](#view-logs)
- [Teardown](#teardown)

## Requirements
## Requirements for local development

1. Install the `gcloud` CLI

@@ -51,6 +53,15 @@ A monorepo for our governance watchdog, a system that monitors Mento Governance
# For other systems, see https://developer.hashicorp.com/terraform/install
```

1. Install `jq` (used in a few shell scripts)

```sh
# On macOS
brew install jq

# For other systems, see https://jqlang.github.io/jq/
```

1. Authenticate with Google Cloud default credentials in your local shell

```sh
@@ -70,46 +81,8 @@ A monorepo for our governance watchdog, a system that monitors Mento Governance
1. A Telegram group to send notifications to

1. A Telegram bot must be in the group to receive the notifications.
If you're doing this from scratch, here's how to create a bot

- Open a new chat with @BotFather
- Use the `/newbot` command to create a new bot
- Copy the API key printed out at the end of the prompt and store it in your `terraform.tfvars`

```hcl
telegram_bot_token = "<bot-api-key>"
```
- Get the Chat ID by inviting @MissRose_bot to the group and then using the `/id` command
- Add the Chat ID to your `terraform.tfvars`
```hcl
telegram_chat_id = "<group-chat-id>"
```
- Remove @MissRose_bot after you got the Chat ID
## Local Development of Cloud Function Code
- `npm install` (couldn't use `pnpm` because Google Cloud Build failed trying to install pnpm at the time of writing)
- `cp .env.example .env` and fill in the required values (there are comments in the `.env.example` explaining how to get them)
- `npm start` to start a local cloud function
- `npm test` to call the local cloud function with a mocked payload, this will send a real Discord message into channel belonging to the webhook in `.env`:
```sh
curl -H "Content-Type: application/json" -d @src/proposal-created.fixture.json localhost:8080
```

## Testing the Deployed Cloud Function

You can test the deployed cloud function manually by using the `proposal-created.fixture.json` which contains a similar payload to what a QuickAlert would send to the cloud function:

```sh
./test-deployed-function.sh
# or `npm run test-in-prod` if you prefer npm to call this script
```

## Infra Setup (when project is deployed already)
## Local Infra Setup (when project is deployed already)

1. Set your local `gcloud` project to the watchdog project:

@@ -124,11 +97,11 @@ You can test the deployed cloud function manually by using the `proposal-created
terraform init
```

1. Create a `terraform.tfvars` file in the `./infra` folder, this is like `.env` for Terraform:
1. While inside the `infra` folder, create a `terraform.tfvars` file. This is like `.env` for Terraform:

```sh
touch ./infra/terraform.tfvars
# This file should be `.gitignore`d to avoid accidentally leaking sensitive data
touch terraform.tfvars
# This file is `.gitignore`d to avoid accidentally leaking sensitive data
```

1. Add the following values to your `terraform.tfvars`. You can look up all values in the Google Cloud console (or ask another dev to share their local `terraform.tfvars` with you):
@@ -147,24 +120,89 @@ You can test the deployed cloud function manually by using the `proposal-created
group_billing_admins = "<our-billing-admins-group>"
```

1. Add the Discord Webhook URL from Google Cloud Secret Manager into your local `terraform.tfvars`:
1. Add the Discord Webhook URL from Google Cloud Secret Manager to your local `terraform.tfvars`:

```sh
# You will need the "Secret Manager Secret Accessor" IAM role for this command to succeed
# You need the "Secret Manager Secret Accessor" IAM role for this command to succeed
echo "discord_webhook_url = \"$(gcloud secrets versions access latest --secret discord-webhook-url)\"" >> terraform.tfvars
```

1. Add the Telegram Bot Token and Chat ID to your local `terraform.tfvars`

```sh
# Get the chat ID from cloud function's terraform state
printf '\ntelegram_chat_id = "%s"\n' "$(terraform state show "google_cloudfunctions2_function.watchdog_notifications" | grep TELEGRAM_CHAT_ID | awk -F '= ' '{print $2}' | tr -d '"')" >> terraform.tfvars

# Get the bot token from secret manager (you need the "Secret Manager Secret Accessor" IAM role for this command to succeed)
echo "telegram_bot_token = \"$(gcloud secrets versions access latest --secret telegram-bot-token)\"" >> terraform.tfvars
```

1. [Get our QuickNode API key from the QuickNode dashboard](https://dashboard.quicknode.com/api-keys) and add it to your local `terraform.tfvars`:

```sh
# ./infra/terraform.tfvars
discord_webhook_url = "<discord-webhook-url>"
quicknode_api_key = "<your-quicknode-api-key>"
```

This is necessary for Terraform to be able to create & destroy QuickAlerts as part of `terraform apply`

## First Time Infra Deployment via Terraform
1. Add the VictorOps Webhook URL to your local `terraform.tfvars`. You can get it by going to VictorOps, clicking `Integrations` > `Stackdriver`, and copying the URL. The routing key can be found under the `Settings` tab:

```sh
# ./infra/terraform.tfvars
victorops_webhook_url = "<victorops-webhook-url>/<victorops-routing-key>"
```

1. Auto-generate a local `.env` file by running `npm run generate:env`

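The chat ID lookup in the Telegram step above chains `grep`, `awk` and `tr` over the output of `terraform state show`. The sketch below runs the same pipeline against a sample line so each stage can be checked without access to the Terraform state. The sample line and its value are assumptions about how the state output looks, and `extract_chat_id` is a hypothetical wrapper name.

```sh
#!/bin/bash
# Hypothetical wrapper around the pipeline used in the step above.
# Reads `terraform state show` output on stdin and prints the chat ID.
extract_chat_id() {
  # grep keeps the line holding the variable, awk takes everything after
  # the "= " separator, and tr strips the surrounding double quotes.
  grep TELEGRAM_CHAT_ID | awk -F '= ' '{print $2}' | tr -d '"'
}

# Assumed sample of the relevant state line; the chat ID is made up.
sample='        "TELEGRAM_CHAT_ID" = "-1001234567890"'

chat_id=$(printf '%s\n' "$sample" | extract_chat_id)
printf 'telegram_chat_id = "%s"\n' "$chat_id"
```

If the real state output formats the line differently (for example without a space after `=`), the `awk` separator would need adjusting.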
## Running and testing the Cloud Function locally

- Make sure you generated a local `.env` file via `npm run generate:env` earlier
- `npm install` (couldn't use `pnpm` because Google Cloud Build failed trying to install pnpm at the time of writing)
- `npm start` to start a local cloud function
- `npm test` to call the local cloud function with a mocked payload. This sends a real Discord message to the channel belonging to the configured Discord Webhook:

```sh
curl -H "Content-Type: application/json" -d @src/proposal-created.fixture.json localhost:8080
```

## Testing the Deployed Cloud Function

You can test the deployed cloud function manually by using the `proposal-created.fixture.json` which contains a similar payload to what a QuickAlert would send to the cloud function:

```sh
./test-deployed-function.sh
# or `npm run test:prod` if you prefer npm to call this script
```

## Updating the Cloud Function

You have two options, using `terraform` or the `gcloud` cli. Both are perfectly fine to use.

1. Via `terraform` by running `npm run deploy:via:tf`
- How? The npm task will:
- Call `terraform apply` which re-deploys the function with the latest code from your local machine
- Pros
- Keeps the terraform state clean
- Same command for all changes, regardless of infra or cloud function code
- Cons
- Less familiar way of deploying cloud functions (if you're used to `gcloud functions deploy`)
- Less log output
- Slightly slower because `terraform apply` will always fetch the current state from the cloud storage bucket before deploying
2. Via `gcloud` by running `npm run deploy:via:gcloud`
- How? The npm task will:
- Look up the service account used by the cloud function
- Call `gcloud functions deploy` with the correct parameters
- Pros
- Familiar way of deploying cloud functions
- More log output making deployment failures slightly faster to debug
- Slightly faster because we're skipping the terraform state lookup
- Cons
- Will lead to inconsistent terraform state (because terraform is tracking the function source code and its version)
- Different commands to remember when updating infra components vs cloud function source code
- Will only work for updating a pre-existing cloud function's code, will fail for a first-time deploy

## Infra Deployment via Terraform

### Google Cloud Permission Requirements

@@ -176,28 +214,47 @@ In order to create this project from scratch using the [terraform-google-bootstr

### Deployment from scratch

1. Comment out the `backend` section in `main.tf` (because this bucket doesn't exist yet, it will be created by the first `terraform apply` run)

```hcl
# backend "gcs" {
# bucket = "governance-watchdog-terraform-state-<random-suffix>"
# }
```

1. Run `terraform init` to install the required providers and init a temporary local backend in a `terraform.tfstate` file

<!-- markdown-link-check-disable -->

1. [Create a Discord Webhook URL](https://support.discord.com/hc/en-us/articles/228383668-Intro-to-Webhooks) for the channel you want to receive notifications in <!-- markdown-link-check-enable -->

2. Add the Discord Webhook URL to your local `terraform.tfvars`:
1. Add the Discord Webhook URL to your local `terraform.tfvars`:

```sh
# This will be stored in Google Secret Manager upon deployment via Terraform
echo "discord_webhook_url = \"<discord-webhook-url>\"" >> terraform.tfvars
```
3. Comment out the `backend` section in `main.tf` (because this bucket doesn't exist yet, it will be created by the first `terraform apply` run)
1. Create a Telegram group and invite a new bot into it
```hcl
# backend "gcs" {
# bucket = "governance-watchdog-terraform-state-<random-suffix>"
# }
```
- Open a new telegram chat with @BotFather
- Use the `/newbot` command to create a new bot
- Copy the API key printed out at the end of the prompt and store it in your `terraform.tfvars`
```hcl
telegram_bot_token = "<bot-api-key>"
```
- Get the Chat ID by inviting @MissRose_bot to the group and then using the `/id` command
- Add the Chat ID to your `terraform.tfvars`
```hcl
telegram_chat_id = "<group-chat-id>"
```
4. Run `terraform init` to install the required providers and init a temporary local backend in a `terraform.tfstate` file
- Remove @MissRose_bot once you have the Chat ID
5. **Deploy the entire project via `terraform apply`**
1. **Deploy the entire project via `terraform apply`**
- You will see an overview of all resources to be created. Review them if you like and then type `yes` to confirm (Terraform only accepts the lowercase form).
- This command can take up to 10 minutes because it does a lot of work creating and configuring all defined Google Cloud Resources
@@ -206,21 +263,17 @@ In order to create this project from scratch using the [terraform-google-bootstr
**Often a simple retry of `terraform apply` helps**. Sometimes a dependency of a resource has simply not finished creating when terraform already tried to deploy the next one, so waiting a few minutes for things to settle can help.
6. Set your local `gcloud` project to our freshly created one:
1. Set your local `gcloud` project ID to our freshly created one:
```sh
# If that `awk` magic fails, just look up the project ID manually via `gcloud projects list`
project_id=$(terraform state show "module.bootstrap.module.seed_project.module.project-factory.google_project.main" | grep 'project_id' | awk -F '"' '{print $2}')
gcloud config set project $project_id
gcloud auth application-default set-quota-project $project_id
./set-project-id.sh
```
7. Check that everything worked as expected
1. Check that everything worked as expected
```sh
# 1. Call the deployed function via:
npm run test-in-prod # or call the script directly via ./test-deployed-function.sh
npm run test:prod # or call the script directly via ./test-deployed-function.sh
# 2. Monitor the configured Discord channel for a message to appear
open https://discord.com/channels/966739027782955068/1262714272476037212
@@ -271,34 +324,14 @@ For all team members to be able to manage the Google Cloud infrastructure, you n
rm terraform.tfstate.backup
```
## Updating the Cloud Function
## Debugging Problems
You have two options, using `terraform` or the `gcloud` cli. Both are perfectly fine to use.
### View Logs
1. Via `terraform` by running `npm run deploy:via:tf`
- How? The npm task will:
- Compile TS to JS
- Zip the `./dist` folder into `function-source.zip`
- And then call `terraform apply` which re-deploys the function with the new source code from the zip file
- Pros
- Keeps the terraform state clean
- Same command for all changes, regardless of infra or cloud function code
- Cons
- Less familiar way of deploying cloud functions (if you're used to `gcloud functions deploy`)
- Less log output
- Slightly slower because `terraform apply` will always fetch the current state from the cloud storage bucket before deploying
2. Via `gcloud` by running `npm run deploy:via:gcloud`
- How? The npm task will:
- Generate a temporary `.env.yaml` (because for some reason gcloud does not support normal `.env` files)
- Look up the service account used by the cloud function
- Call `gcloud functions deploy` with the correct parameters
- Pros
- Familiar way of deploying cloud functions
- More log output making deployment failures slightly faster to debug
- Slightly faster because we're skipping the terraform state lookup
- Cons
- Will lead to inconsistent terraform state (because terraform is tracking the function source code and its version)
- Different commands to remember when updating infra components vs cloud function source code
For most problems, you'll likely want to check the cloud function logs first.
- `npm run logs` will print the latest 50 log entries into your local terminal for quick and easy access
- `npm run logs:url` will print the URL to the function logs in the Google Cloud Console for full access
## Teardown
23 changes: 9 additions & 14 deletions deploy-via-gcloud.sh
@@ -1,19 +1,14 @@
#! /bin/bash
set -e # fail on any error
set -o pipefail # ensure non-zero exit codes are propagated in piped commands
#!/bin/bash
set -e # Fail on any error
set -o pipefail # Ensure piped commands propagate exit codes properly
set -u # Treat unset variables as an error when substituting

entry_point="watchdogNotifier"
function_name="watchdog-notifications"
region="europe-west1"
# Load the project variables
source ./set-project-vars.sh

printf "Looking up function name..."
function_name=$(gcloud functions list --format="value(name)" | grep '^watchdog-notifications')
printf ' \033[1m%s\033[0m\n' "${function_name}"

printf "Looking up project ID..."
project_name="governance-watchdog"
project_id=$(gcloud projects list --filter="name:${project_name}*" --format="value(projectId)")
printf ' \033[1m%s\033[0m\n' "${project_id}"
printf "Looking up entry point..."
entry_point=$(gcloud functions describe "${function_name}" --region="${region}" --format json | jq -r .buildConfig.entryPoint) # -r prints the raw string without JSON quotes
printf ' \033[1m%s\033[0m\n' "${entry_point}"

printf "Looking up service account for function..."
service_account_email=$(gcloud functions describe "${function_name}" --region="${region}" --format="value(serviceConfig.serviceAccountEmail)")
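
The three `set` options at the top of `deploy-via-gcloud.sh` change how the script reacts to failures. The following is a small standalone sketch, not part of the commit, that shows the effect of each option by running snippets in a child `bash`; `run` is a hypothetical helper name.

```sh
#!/bin/bash
# Hypothetical helper: runs a snippet in a child bash, hides its output,
# and prints the exit code it finished with.
run() {
  bash -c "$1" >/dev/null 2>&1
  echo $?
}

# set -e: the script stops at the first failing command.
run 'false; true'                      # prints 0 (failure of `false` is ignored)
run 'set -e; false; true'              # prints 1

# set -o pipefail: a pipeline fails if any of its commands fails.
run 'false | true'                     # prints 0 (only the last command counts)
run 'set -o pipefail; false | true'    # prints 1

# set -u: expanding an unset variable is an error instead of an empty string.
run 'echo "${undefined_var}"'          # prints 0
run 'set -u; echo "${undefined_var}"'  # prints a non-zero code
```

Together these settings make a failed `gcloud` lookup abort the deployment early instead of continuing with an empty value.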