Merge pull request #8 from converged-computing/add-global-token
add support for global token and kind example
vsoch authored Feb 15, 2024
2 parents e09b003 + abe625e commit a638662
Showing 16 changed files with 644 additions and 29 deletions.
2 changes: 2 additions & 0 deletions .dockerignore
@@ -0,0 +1,2 @@
rainbow.db
env
15 changes: 11 additions & 4 deletions cmd/server/server.go
@@ -10,34 +10,41 @@ import (
 )

 var (
-	address     string
+	host        string
 	name        = "rainbow"
 	sqliteFile  = "rainbow.db"
 	environment = "development"
 	cleanup     = false
 	secret      = "chocolate-cookies"
+	globalToken = ""
 )

 func main() {
-	flag.StringVar(&address, "address", ":50051", "Server address (host:port)")
+	flag.StringVar(&host, "host", ":50051", "Server address (host:port)")
 	flag.StringVar(&name, "name", name, "Server name (default: rainbow)")
 	flag.StringVar(&sqliteFile, "db", sqliteFile, "sqlite3 database file (default: rainbow.db)")
+	flag.StringVar(&globalToken, "global-token", globalToken, "global token for cluster access (not recommended)")
 	flag.StringVar(&secret, "secret", secret, "secret to validate registration (default: chocolate-cookies)")
 	flag.StringVar(&environment, "environment", environment, "environment (default: development)")
 	flag.BoolVar(&cleanup, "cleanup", cleanup, "cleanup previous sqlite database (default: false)")
 	flag.Parse()

 	// create server
 	log.Print("creating 🌈️ server...")
-	s, err := server.NewServer(name, types.Version, environment, sqliteFile, cleanup, secret)
+	s, err := server.NewServer(name, types.Version, environment, sqliteFile, cleanup, secret, globalToken)
 	if err != nil {
 		log.Fatalf("error while creating server: %v", err)
 	}
 	defer s.Stop()

+	// Give a warning if the globalToken is set
+	if globalToken != "" {
+		log.Printf("⚠️ WARNING: global-token is set, use with caution.")
+	}
+
 	// run server
 	log.Printf("starting scheduler server: %s", s.String())
-	if err := s.Start(context.Background(), address); err != nil {
+	if err := s.Start(context.Background(), host); err != nil {
 		log.Fatalf("error while running scheduler server: %v", err)
 	}
 	log.Printf("🌈️ done 🌈️")
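The `--global-token` flag is meant to be opt-in, with the warning printed only when it is set. The following is a minimal sketch of that pattern, not the project's code: the `config` type and `parseArgs` helper are hypothetical names, and it assumes an empty string means "no global token".

```go
package main

import (
	"flag"
	"fmt"
	"log"
)

// config holds the two settings relevant to this sketch (hypothetical type).
type config struct {
	host        string
	globalToken string
}

// parseArgs parses args with its own FlagSet so it can be called repeatedly.
func parseArgs(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("rainbow-scheduler", flag.ContinueOnError)
	fs.StringVar(&c.host, "host", ":50051", "Server address (host:port)")
	// Empty default: the global token stays disabled unless explicitly passed.
	fs.StringVar(&c.globalToken, "global-token", "", "global token for cluster access (not recommended)")
	err := fs.Parse(args)
	return c, err
}

func main() {
	c, err := parseArgs([]string{"--host", ":8080", "--global-token", "jellaytime"})
	if err != nil {
		log.Fatalf("error parsing flags: %v", err)
	}
	// Warn only when the operator opted in to a shared token.
	if c.globalToken != "" {
		log.Printf("WARNING: global-token is set, use with caution.")
	}
	fmt.Printf("host=%s\n", c.host)
}
```

Note that `flag.StringVar` writes the default value into the target variable as soon as the flag is defined, so whatever is passed as the default is what the program sees when the flag is omitted.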
2 changes: 1 addition & 1 deletion docs/examples/README.md
@@ -3,4 +3,4 @@
 Examples coming soon:

 - [docker-compose](docker-compose): Simple setup to run a scheduler and two clusters with local docker images
-- Kubernetes in Docker (kind)
+- [kind](kind): Kubernetes in Docker (kind)
2 changes: 1 addition & 1 deletion docs/examples/docker-compose/docker-compose-demo.yaml
@@ -7,7 +7,7 @@ services:
     container_name: scheduler
     image: ghcr.io/converged-computing/rainbow-scheduler:latest
     entrypoint: rainbow-scheduler
-    command: --address :8080 --name rainbow --secret peanutbuttajellay
+    command: --host :8080 --name rainbow --secret peanutbuttajellay
     volumes:
       - ./data:/data
     ports:
2 changes: 1 addition & 1 deletion docs/examples/docker-compose/docker-compose.yaml
@@ -7,7 +7,7 @@ services:
     container_name: scheduler
     image: ghcr.io/converged-computing/rainbow-scheduler:latest
     entrypoint: rainbow-scheduler
-    command: --address :8080 --name rainbow --secret peanutbuttajellay
+    command: --host :8080 --name rainbow --secret peanutbuttajellay
     volumes:
       - ./data:/data
     ports:
198 changes: 198 additions & 0 deletions docs/examples/kind/README.md
@@ -0,0 +1,198 @@
# Kubernetes in Docker (kind)

This example shows how to run the [rainbow docker images](https://github.com/orgs/converged-computing/packages?repo_name=rainbow) locally with kind (Kubernetes in Docker). The same manifests (the YAML files) will likely work in a production Kubernetes cluster as well.

## Usage

## 1. Create Cluster

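The kind config labels the control-plane node with `ingress-ready=true` and maps ports 80, 443, and 8080 to the host, so that we can create an ingress later.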

```bash
kind create cluster --config ./kind-config.yaml
```

## 2. Load Images

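Load the images into the kind cluster before creating any Kubernetes objects.
The manifests in this example set `imagePullPolicy: Never`, so the images must already be present on the node. If you change that policy, this step becomes optional (it mostly helps when you have a local image) and the images will otherwise be pulled directly from the remote registry.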

```bash
kind load docker-image ghcr.io/converged-computing/rainbow-flux:latest
kind load docker-image ghcr.io/converged-computing/rainbow-scheduler:latest
```

## 3. Create Service and Ingress

Next, let's create the service and the ingress. We don't technically need the ingress (communication happens within the pod network), but we anticipate cases where we will want to interact with the scheduler from outside the cluster, so we show how to set it up.

```bash
kubectl create -f ./service.yaml
kubectl create -f ./ingress.yaml
```

## 4. Create Rainbow Scheduler

Let's create the rainbow scheduler deployment.

```bash
kubectl create -f ./scheduler.yaml
```

Then check that it is running. Your pod name will differ; find it with `kubectl get pods`.

```bash
kubectl logs scheduler-798ddccf-pxfxx
```
```console
2024/02/14 20:53:37 creating 🌈️ server...
2024/02/14 20:53:37 ✨️ creating rainbow.db...
2024/02/14 20:53:37 rainbow.db file created
2024/02/14 20:53:37 create cluster table...
2024/02/14 20:53:37 cluster table created
2024/02/14 20:53:37 create jobs table...
2024/02/14 20:53:37 jobs table created
2024/02/14 20:53:37 ⚠️ WARNING: global-token is set, use with caution.
2024/02/14 20:53:37 starting scheduler server: rainbow v0.1.0-draft
2024/02/14 20:53:37 server listening: [::]:8080
```

Importantly, the deployment sets `hostname: scheduler` and `subdomain: rainbow`, which gives the scheduler a predictable hostname:

```bash
kubectl exec -it scheduler-798ddccf-pxfxx -- cat /etc/hosts
```
```console
# Kubernetes-managed hosts file.
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
fe00::0 ip6-mcastprefix
fe00::1 ip6-allnodes
fe00::2 ip6-allrouters
10.244.0.5 scheduler.rainbow.default.svc.cluster.local scheduler
```
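The last line of that hosts file follows the standard Kubernetes pattern for a pod that sets both `hostname` and `subdomain`: `<hostname>.<subdomain>.<namespace>.svc.<cluster-domain>`. As a minimal sketch (the `podFQDN` helper is a hypothetical name, and `cluster.local` is assumed as the cluster domain), the address a cluster would dial can be composed like this:

```go
package main

import "fmt"

// podFQDN builds the DNS name Kubernetes assigns to a pod that sets both
// spec.hostname and spec.subdomain, where the subdomain matches a headless
// service in the same namespace.
func podFQDN(hostname, subdomain, namespace, clusterDomain string) string {
	return fmt.Sprintf("%s.%s.%s.svc.%s", hostname, subdomain, namespace, clusterDomain)
}

func main() {
	// Values taken from this example: hostname "scheduler", subdomain
	// "rainbow", the default namespace, and the scheduler port 8080.
	addr := podFQDN("scheduler", "rainbow", "default", "cluster.local") + ":8080"
	fmt.Println(addr)
}
```

This prints `scheduler.rainbow.default.svc.cluster.local:8080`, matching the hosts entry above with the scheduler port appended.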

## 5. Start Clusters

We are going to cheat a bit and create multiple "clusters" (pods) via an indexed job.
The names of the pods (hostnames) will correspond to the names of the clusters. We are also
using a `--global-token` so that a shared filesystem is not needed: the scheduler will assign
the same token to all newly registered clusters. This is, of course, not intended for a production
setup.

```bash
kubectl apply -f ./clusters.yaml
```
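To make the global token behavior concrete, here is a minimal sketch of how registration could choose a token. It is an illustration under assumptions, not the scheduler's actual implementation: the `registry` type, `register` method, and `newToken` helper are hypothetical, and an in-memory map stands in for the database.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// registry is a hypothetical in-memory stand-in for the scheduler's cluster table.
type registry struct {
	globalToken string            // empty means "no global token"
	tokens      map[string]string // cluster name -> token
}

// newToken returns a random 16-byte token, hex encoded (32 characters).
func newToken() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// register records a cluster and returns its token: the shared global token
// when one is configured, otherwise a fresh random token for that cluster.
func (r *registry) register(name string) (string, error) {
	token := r.globalToken
	if token == "" {
		var err error
		if token, err = newToken(); err != nil {
			return "", err
		}
	}
	r.tokens[name] = token
	return token, nil
}

func main() {
	r := &registry{globalToken: "jellaytime", tokens: map[string]string{}}
	// Pods of an Indexed job named "clusters" get hostnames clusters-0, clusters-1, ...
	for i := 0; i < 3; i++ {
		name := fmt.Sprintf("clusters-%d", i)
		token, err := r.register(name)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%s token=%s\n", name, token)
	}
}
```

With a global token, every cluster already knows the token its peers hold, which is why no shared filesystem is needed to exchange them. It also means any cluster can act on behalf of any other, which is the reason this is only suitable for a demo.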

Four pods should now be running: the scheduler and the three clusters. You can watch the scheduler logs, or look at the logs of one of the cluster pods to see what is happening:

```console
👋️ Hello, I'm clusters-0!
📜️ Registering clusters-0...
token: "jellaytime"
secret: "e2342db8-1c0b-40dd-9b42-e49e56480916"
status: REGISTER_SUCCESS

🥳️ All 3 clusters are registered.
status: SUBMIT_SUCCESS

status: SUBMIT_SUCCESS

status: SUBMIT_SUCCESS

Status: REQUEST_JOBS_SUCCESS
Received 2 jobs for inspection!
Accepting 2 jobs...
[5, 1]
Submit job ['echo', 'hello', 'from', 'clusters-1,', 'a', 'new', 'word', 'is', 'perfume']: ƒCYGUnZd
Submit job ['echo', 'hello', 'from', 'clusters-2,', 'a', 'new', 'word', 'is', 'sn']: ƒCYraW3Z
Ran job ƒCYGUnZd hello from clusters-1, a new word is perfume

Ran job ƒCYraW3Z hello from clusters-2, a new word is sn

status: SUBMIT_SUCCESS

status: SUBMIT_SUCCESS

status: SUBMIT_SUCCESS

Status: REQUEST_JOBS_SUCCESS
Received 2 jobs for inspection!
Accepting 2 jobs...
[16, 14]
Submit job ['echo', 'hello', 'from', 'clusters-2,', 'a', 'new', 'word', 'is', 'world']: ƒH7VNJbZ
Submit job ['echo', 'hello', 'from', 'clusters-0,', 'a', 'new', 'word', 'is', 'essex']: ƒH86x1Mq
Ran job ƒH86x1Mq hello from clusters-0, a new word is essex

Ran job ƒH7VNJbZ hello from clusters-2, a new word is world

status: SUBMIT_SUCCESS

status: SUBMIT_SUCCESS

status: SUBMIT_SUCCESS

status: SUBMIT_SUCCESS

status: SUBMIT_SUCCESS

Status: REQUEST_JOBS_SUCCESS
Received 5 jobs for inspection!
Accepting 5 jobs...
[27, 23, 20, 29, 22]
Submit job ['echo', 'hello', 'from', 'clusters-0,', 'a', 'new', 'word', 'is', 'resorts']: ƒMetzhd1
Submit job ['echo', 'hello', 'from', 'clusters-1,', 'a', 'new', 'word', 'is', 'puzzles']: ƒMfY4Pfd
Submit job ['echo', 'hello', 'from', 'clusters-1,', 'a', 'new', 'word', 'is', 'mean']: ƒMg9e6Ru
Submit job ['echo', 'hello', 'from', 'clusters-0,', 'a', 'new', 'word', 'is', 'jamie']: ƒMgnhnUX
Submit job ['echo', 'hello', 'from', 'clusters-1,', 'a', 'new', 'word', 'is', 'final']: ƒMhNoVxT
Ran job ƒMhNoVxT hello from clusters-1, a new word is final

Ran job ƒMg9e6Ru hello from clusters-1, a new word is mean

Ran job ƒMgnhnUX hello from clusters-0, a new word is jamie

Ran job ƒMfY4Pfd hello from clusters-1, a new word is puzzles

Ran job ƒMetzhd1 hello from clusters-0, a new word is resorts

status: SUBMIT_SUCCESS

status: SUBMIT_SUCCESS

status: SUBMIT_SUCCESS

status: SUBMIT_SUCCESS

status: SUBMIT_SUCCESS

Status: REQUEST_JOBS_SUCCESS
Received 2 jobs for inspection!
Accepting 2 jobs...
[33, 35]
Submit job ['echo', 'hello', 'from', 'clusters-1,', 'a', 'new', 'word', 'is', 'goal']: ƒSDXnWB1
Submit job ['echo', 'hello', 'from', 'clusters-1,', 'a', 'new', 'word', 'is', 'during']: ƒSE9NCwH
Ran job ƒSE9NCwH hello from clusters-1, a new word is during

Ran job ƒSDXnWB1 hello from clusters-1, a new word is goal

status: SUBMIT_SUCCESS

status: SUBMIT_SUCCESS

Status: REQUEST_JOBS_SUCCESS
Received 1 jobs for inspection!
Accepting 1 jobs...
[41]
Submit job ['echo', 'hello', 'from', 'clusters-2,', 'a', 'new', 'word', 'is', 'manufacture']: ƒWjQTeHm
Ran job ƒWjQTeHm hello from clusters-2, a new word is manufacture

💤️ Cluster clusters-0 is finished! Shutting down.
```


And that's it! Clean up when you are done:

```bash
kind delete cluster
```
24 changes: 24 additions & 0 deletions docs/examples/kind/clusters.yaml
@@ -0,0 +1,24 @@
apiVersion: batch/v1
kind: Job
metadata:
  name: clusters
spec:
  completions: 3
  parallelism: 3
  completionMode: Indexed
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: cluster
        image: ghcr.io/converged-computing/rainbow-flux:latest
        command: ["flux"]

        # Note that --host defaults to scheduler.rainbow.default.svc.cluster.local:8080
        args: ["start", "python3",
               "/code/docs/examples/kind/scripts/run-demo.py",
               "--peer", "clusters-0",
               "--peer", "clusters-1",
               "--peer", "clusters-2",
               "--iters", "5"]
        imagePullPolicy: Never
17 changes: 17 additions & 0 deletions docs/examples/kind/ingress.yaml
@@ -0,0 +1,17 @@
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: rainbow
spec:
  rules:
  - host: localhost
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: rainbow
            port:
              # TODO look at what wfmanager is doing with mlserver
              number: 8080
20 changes: 20 additions & 0 deletions docs/examples/kind/kind-config.yaml
@@ -0,0 +1,20 @@
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  kubeadmConfigPatches:
  - |
    kind: InitConfiguration
    nodeRegistration:
      kubeletExtraArgs:
        node-labels: "ingress-ready=true"
  extraPortMappings:
  - containerPort: 8080
    hostPort: 8080
    protocol: TCP
  - containerPort: 80
    hostPort: 80
    protocol: TCP
  - containerPort: 443
    hostPort: 443
    protocol: TCP
26 changes: 26 additions & 0 deletions docs/examples/kind/scheduler.yaml
@@ -0,0 +1,26 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: scheduler
spec:
  selector:
    matchLabels:
      app: rainbow
  replicas: 1
  template:
    metadata:
      labels:
        # Matches the headless service
        app: rainbow
    spec:
      subdomain: rainbow
      hostname: scheduler
      containers:
      - name: scheduler
        image: ghcr.io/converged-computing/rainbow-scheduler:latest

        # Note that we are setting a global token (not recommended)! So that
        # we don't need a shared filesystem.
        command: ["rainbow-scheduler"]
        args: ["--host", ":8080", "--name", "rainbow", "--secret", "peanutbutta", "--global-token", "jellaytime"]
        imagePullPolicy: Never