Merge pull request #123 from skytable/0.8.3/backup-and-restore
Add docs on backup and restore, fix inconsistencies
ohsayan authored Jul 5, 2024
2 parents d00fe96 + ebd765d commit e358958
Showing 17 changed files with 140 additions and 39 deletions.
File renamed without changes.
File renamed without changes.
22 changes: 15 additions & 7 deletions docs/4.architecture.md → docs/c.architecture.md
@@ -3,16 +3,20 @@ id: architecture
title: Architecture
---

Skytable is a modern NoSQL database that prioritises performance, scalability and reliability while providing a rich and powerful querying interface. We are generally targetting an audience that wants to build high performance, large-scale, low latency applications, such as social networking services, auth services, adtech and such. Skytable is designed to work with
both **structured and semi-structured data**.
Skytable is a modern NoSQL database that prioritises performance, scalability and reliability while providing a rich and powerful querying interface.
We are generally targeting an audience that wants to build high-performance, large-scale, low-latency applications, such as social networking services,
auth services, adtech and such. Skytable is designed to work with both **structured and semi-structured data**.

Our goal is to provide you with a powerful and solid foundation for your application with no gimmicks — just a solid core. That's why, every component in Skytable has been engineered from the ground up, from scratch.
Our goal is to provide you with a powerful and solid foundation for your application with no gimmicks, just a solid core. That's why every component in
Skytable has been engineered from the ground up.

And all of that, without you having to be an expert, and with the least maintenance that you can expect.

## Fundamental differences from relational systems

BlueQL kind of looks and feels like using SQL with a relational database but that doesn't make Skytable's internals the same, with the most important distinction being the fact that Skytable has a NoSQL engine! But Skytable's evaluation and execution of queries is fundamentally different from SQL counterparts and even NoSQL engines. Here are some key differences:
BlueQL looks and feels like SQL on a relational database, but that doesn't make Skytable's internals the same: the most important
distinction is that Skytable has a NoSQL engine. Beyond that, Skytable's evaluation and execution of queries is fundamentally different from its SQL
counterparts and even from other NoSQL engines. Here are some key differences:

- All DML queries are point queries and **not** range queries:
- This means that they will either return at least one row or an error
@@ -64,7 +68,7 @@ A `model` in Skytable is like a `table` in SQL but is vastly different because o

## Query language

Skytable has it's own query language BlueQL<sup>TM</sup> which takes a lot of inspiration from SQL but makes several different (and sometimes vastly different) design choices, focused on clarity, speed, simplicity and most importantly, security.
Skytable has its own query language BlueQL<sup>TM</sup> which takes a lot of inspiration from SQL but makes several different (and sometimes vastly different) design choices, focused on clarity, speed, simplicity and most importantly, security.

For example, Skytable's BlueQL<sup>TM</sup> *only* allows the parameterization of queries. All the queries you ran previously with strings and numbers directly were only possible because the REPL client smartly does the parameterization behind the scenes. This is done for security. You'll learn more about BlueQL next.

@@ -99,12 +103,15 @@ Skytable will use at least as many threads as the number of logical CPUs present

## Networking

Skytable its own in-house Skyhash protocol that is built on top of TCP enabling any programming language that has a TCP client to use it without issues. There are three phases in the connection:
Skytable uses its own in-house Skyhash protocol for client-server communication. It is built on top of TCP, enabling any programming language that has a
TCP client to use it without issues. There are three phases in the connection:
- Handshake: All auth data, compatibility information and other data is exchanged at this step
- Connection mode selection: based on the handshake parameters a connection mode is chosen and the server responds with the chosen exchange mode
- Data exchange: This is where the real querying happens
- Termination: there is no special step; just a `TCP FIN`

You can [read more about the protocol here](protocol).

## Backwards compatibility

We make the promise to you that no matter what changes in Skytable, you will always be able to:
@@ -115,6 +122,7 @@ More technically:
- **For minor/patch releases**: A minor/patch version bump indicates that no manual data migration effort is needed. **No minor release ever needs manual data migration; any migration that is required is done automatically**
- **For major releases**: Major releases generally introduce breaking changes (just like the upgrade from `0.7.x` to `0.8.0` is a largely breaking change). **Major releases will either automatically upgrade the data files or require you to use a migration tool that is shipped with the bundle**.
- Definitions (closely following semantic versioning):
- **A major release** is something like `1.0.0` to `2.0.0` or `0.8.0` to `0.9.0` (in development versions, 0.8.0 to 0.9.0 is a major version bump)
- **A major release** is something like `1.0.0` to `2.0.0` or `0.8.0` to `0.9.0` (in development versions, 0.8.0 to 0.9.0 is considered a major version
bump)
- **A minor release** is something like `1.0.0` to `1.1.0` or `0.8.0` to `0.8.1`
- **A patch release** is something like `1.0.0` to `1.0.1` or `0.8.0` to `0.8.1` (note that in development versions there is no distinction between a minor and patch release)
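
The definitions above can be sketched as a small shell helper. This is only an illustration of the rules as stated here, not something shipped with Skytable; the function name `classify_release` is hypothetical, and it assumes plain `MAJOR.MINOR.PATCH` version strings with no suffixes.

```sh
# Classify a version bump using the definitions above.
# Hypothetical helper: assumes plain MAJOR.MINOR.PATCH strings.
classify_release() {
    old_major="${1%%.*}"; old_rest="${1#*.}"
    old_minor="${old_rest%%.*}"; old_patch="${old_rest#*.}"
    new_major="${2%%.*}"; new_rest="${2#*.}"
    new_minor="${new_rest%%.*}"; new_patch="${new_rest#*.}"

    if [ "$old_major" != "$new_major" ]; then
        echo major
    elif [ "$old_major" = 0 ] && [ "$old_minor" != "$new_minor" ]; then
        # Development versions: 0.8.0 to 0.9.0 counts as a major bump
        echo major
    elif [ "$old_minor" != "$new_minor" ]; then
        echo minor
    elif [ "$old_patch" != "$new_patch" ]; then
        if [ "$old_major" = 0 ]; then
            # Development versions make no minor/patch distinction
            echo minor/patch
        else
            echo patch
        fi
    else
        echo same
    fi
}
```

For example, `classify_release 0.8.0 0.9.0` prints `major`, which per the rules above is the case where data files are either upgraded automatically or need the bundled migration tool.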
File renamed without changes.
10 changes: 6 additions & 4 deletions docs/index.md
@@ -19,10 +19,12 @@ To develop using Skytable and maintain your deployment you will want to learn ab
- [**DCL**](blueql/dcl): Data control with BlueQL
- [**Querying**](querying): Introduces different query modes and when to choose a specific query mode
- [**System administration**](system):
- [**Configuration**](system/configuration): Information to help you configure Skytable with custom settings such as custom ports, hosts, TLS, and etc.
- [**User management**](system/user-management): Information on access control, user and other administration features
- [**Global management**](system/global-management): Global settings management
- [**Operations**](system/operations): Learn about administration operations
- [**Configuration**](system/configuration): Configuration modes (CLI, environment variables, configuration files) and options
- [**User management**](system/user-management): Account types, permissions, creating and managing multiple users
- [**Global management**](system/global-management): Learn how to check system health and manage the global state of your database instances
- [**Disk usage**](system/disk-usage): Understand disk usage and compaction
- [**Backup and restore**](system/backup-and-restore): Backing up data and restoring data from backups
- [**Data recovery**](system/recovery): Understanding data loss, mitigation and recovery options
- **Resources**:
- [**Useful links**](resources/useful-links): Links to helpful resources
- [**Migration**](resources/migration): For old or returning Skytable users who are coming from older versions
File renamed without changes.
File renamed without changes.
File renamed without changes.
@@ -44,7 +44,7 @@ To start the server with a configuration file, simply run `skyd --config <path t
Here's an explanation of all the keys:
- `system`:
- `mode`: set to either `dev` / `prod` mode. `prod` mode will generally make some things stricter (such as background services)
- `rs_window`: **This is a very important setting!** It is set to `300` by default and is called the "reliability service window" which ensures that if any changes are observed in `300` (or whatever value you set) seconds, then they reach the disk as soon as that time elapses. For example, in the default configuration the system checks for changes every 5 minutes and if there are any dataset changes, they are immediately synced. [Read more here](operations#understanding-data-loss)
- `rs_window`: **This is a very important setting!** It is set to `300` by default and is called the "reliability service window" which ensures that if any changes are observed in `300` (or whatever value you set) seconds, then they reach the disk as soon as that time elapses. For example, in the default configuration the system checks for changes every 5 minutes and if there are any dataset changes, they are immediately synced. [Read more here](recovery#understanding-data-loss)
- `auth`:
- `plugin`: this is the authentication plugin. We currently only have `pwd`, a simple password-based authentication system where the password is stored as an [`rcrypt` hash](https://github.com/ohsayan/rcrypt) on disk. More `plugin` options are set to be implemented for more advanced authentication, especially in enterprise settings
- `root_pass`: this is the root account password. **It must have at least 16 characters**
File renamed without changes.
@@ -11,17 +11,23 @@ The following query returns an `Empty` response or an error code depending on th
SYSCTL REPORT STATUS
```

If you receive an error code, we recommend you to connect to the host and check logs. If the server has crashed, you may need to [recover the database](operations#data-recovery).
If you receive an error code, we recommend that you connect to the host and check the logs. If the server has crashed, you may need to [recover the database](recovery).

## Inspecting all spaces
## Inspecting global state

The following query provides a quick overview of the global system state, including users, spaces and settings:

```sql
INSPECT GLOBAL
```

This returns a JSON string like this:

The single DDL query that lets you do a "sneak peek" into the status of the entire system is the `INSPECT GLOBAL` query. It
returns a JSON string like this:
```json
{
"spaces:"["space1", "space2"],
"users":["root", "staging_server"],
"settings:{},
"spaces": ["prodApp1", "prodApp2"],
"users": ["root", "staging_app_server", "prod_app_server"],
"settings": {}
}
```

31 changes: 31 additions & 0 deletions docs/system/d.disk-usage.md
@@ -0,0 +1,31 @@
---
id: disk-usage
title: Disk usage
---

## Directory structure

This is the general directory structure (subdirectories omitted):
```
├── data
├── gns.db-tlog
└── .sky_pid
```

- `gns.db-tlog` (file): This is a very important file that stores system tables and other data
- `data` (directory): This directory contains subdirectories with all the spaces (which in turn contain all the data for each space)
- `.sky_pid` (file): This is a temporary PID file that is created whenever the database is started. If the database crashes, then you may have to remove
it manually


## Managing disk usage

Over time, as you continue to use your database, its files will grow in size, as you would expect. However, database files may sometimes grow beyond an efficient size, resulting in high memory usage or slowdowns. To counter this, Skytable uses internal heuristics to determine when a database file is "larger than needed" and automatically compacts such files at startup.

In some cases you may wish to force a compaction regardless, in order to reduce the file size. To do this, run:

```sh
skyd compact
```

The server will then compact all files (even if a compaction wasn't triggered by internal heuristics) to their optimum size.
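
To see what a manual compaction actually saved, you could measure the installation before and after. The following is a minimal sketch, not an official tool: the function names are hypothetical, it assumes `data` and `gns.db-tlog` sit directly under the installation directory (as in the directory structure above), and it assumes `skyd compact` should be run from the installation directory, which this page does not state explicitly.

```sh
# Sum the on-disk size (in KiB) of the entries named in the directory structure.
# Assumes `data` and `gns.db-tlog` live directly under the installation directory.
installation_size_kib() {
    du -sk "$1/data" "$1/gns.db-tlog" 2>/dev/null |
        awk '{ total += $1 } END { print total + 0 }'
}

# Compact, then report the size before and after.
# Assumption: `skyd compact` operates on the current working directory.
compact_and_report() {
    install_dir="$1"
    before="$(installation_size_kib "$install_dir")"
    (cd "$install_dir" && skyd compact) || return 1
    after="$(installation_size_kib "$install_dir")"
    echo "before=${before}KiB after=${after}KiB"
}
```

A call such as `compact_and_report /var/lib/skytable` (an example path) prints a single line with both sizes, or returns a nonzero status if the compaction fails.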
62 changes: 62 additions & 0 deletions docs/system/e.backup-and-restore.md
@@ -0,0 +1,62 @@
---
id: backup-and-restore
title: Backup and restore
---

## Backing up data

To back data up, you can use the subcommand `skyd backup` as follows:

```sh
skyd backup --type=direct --to=<path to backup> [--from <directory>]
```

- `--type=direct`: This specifies the kind of backup created. The `direct` type indicates that it's a simple copy of the data files and directories
- `--to=<path to backup>`: This specifies where this backup is to be created
- `--from <directory>` *(optional)*: The path to the installation directory. When this is not provided, the `backup` subcommand assumes that the current working directory is the installation directory. If you're running it from a different directory, set this option.

**Example**:

```sh
skyd backup \
--type=direct \
--from=/var/lib/skytable \
--to=/mnt/backupnfsdrive/quick-backup-before-migration
```

:::info Backup types
Currently, the only type of backup (specified using `--type`) is `direct`, which clones the data files and directories. In the future we may add more backup types, including compressed archives or other modes. You do not need to worry about this when restoring, as the `restore` subcommand determines what kind of backup it is being pointed to.
:::

### Backup protections

The `backup` subcommand includes some protections to create consistent and valid backups. These include refusing to create a backup while the database is using the data files, along with some other checks. If you need to override any of these protections, please check the help menu with `skyd backup --help`.
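
For recurring backups, it can help to give every backup its own timestamped destination. The sketch below wraps the `skyd backup` invocation shown earlier and uses only the `--type`, `--from` and `--to` options. The function names, the `skytable-<timestamp>` naming scheme and the example paths are assumptions made for illustration.

```sh
# Build a destination such as <backup root>/skytable-20240705T120000Z.
# The naming scheme is hypothetical; pass a second argument to fix the stamp.
backup_destination() {
    backup_root="$1"
    stamp="${2:-$(date -u +%Y%m%dT%H%M%SZ)}"
    printf '%s/skytable-%s\n' "$backup_root" "$stamp"
}

# Create a direct backup of an installation directory in a fresh destination
# and print the destination path on success.
run_backup() {
    install_dir="$1"
    backup_root="$2"
    dest="$(backup_destination "$backup_root" "${3:-}")"
    skyd backup --type=direct --from="$install_dir" --to="$dest" || return 1
    printf '%s\n' "$dest"
}
```

Usage would look like `run_backup /var/lib/skytable /mnt/backupnfsdrive`. Because of the protections described above, expect the backup to be refused while the database is still using the data files.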

## Restoring data

To restore data from a backup, you can use the subcommand `skyd restore` as follows:

```sh
skyd restore --from=<path to backup> [--to <installation directory>]
```

- `--from=<path to backup>`: Specifies the path to the backup
- `--to <installation directory>` *(optional)*: By default, it is assumed that the current directory is the installation directory. If not, set this option.

**Example**:

```sh
skyd restore \
--from=/mnt/backupnfsdrive/quick-backup-before-migration \
--to=/var/lib/skytable
```

### Data restore protections

The `restore` subcommand also has some safeguards in place that prevent you from accidentally restoring incorrect data. Some of these safeguards include:

- **Backup has correct time signatures**
- **Backup is compatible**
- **Was created by the same host:** you will need to override this when recovering from a crash, and doing so is fine. This protection exists for situations where you're running a cluster with multiple backups and accidentally restore from the wrong one.

If you need to override any of these conditions in special cases, then please check the help menu with `skyd restore --help`.
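
A restore can be wrapped in the same way. This is a minimal sketch that only checks that the backup path exists before handing over to `skyd restore`, using the `--from` and `--to` options documented above. The function name is hypothetical, and the existence check is an extra convenience rather than one of the built-in safeguards.

```sh
# Restore an installation from a backup, failing early if the backup is missing.
# Hypothetical helper: the path check is not one of skyd's built-in safeguards.
restore_backup() {
    backup_path="$1"
    install_dir="$2"
    if [ ! -e "$backup_path" ]; then
        echo "backup not found: $backup_path" >&2
        return 1
    fi
    skyd restore --from="$backup_path" --to="$install_dir"
}
```

For example: `restore_backup /mnt/backupnfsdrive/quick-backup-before-migration /var/lib/skytable`. The safeguards listed above still apply, so a backup that fails them is rejected by `skyd restore` itself.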
16 changes: 2 additions & 14 deletions docs/system/operations.md → docs/system/f.recovery.md
@@ -1,20 +1,8 @@
---
title: Operations
id: recovery
title: Data recovery
---

## Managing disk usage

Over time, as you continue to use your database your database files will grow in size, as you would expect. However, sometimes database files may grow beyond an efficient size resulting in high memory usage or slowdowns. To counter this, Skytable uses internal heuristics to determine when a database file is "larger than needed" and automatically compacts them at startup.

However, in some cases you may wish to perform a compaction regardless in order to reduce the file size. In order to do this you will have to run:

```sh
skyd compact
```

The server will then compact all files (even if a compaction wasn't triggered by internal heuristics) to their optimum size.

## Data recovery

In the unforeseen event that a power failure or other catastrophic system failure causes the database to crash, the Skytable server will fail to start normally. Usually it will exit with a nonzero code and an error message such as "journal-corrupted." In such cases, you will need to recover the journal(s) and/or any other corrupted file(s).

8 changes: 5 additions & 3 deletions docs/system/index.md
@@ -8,7 +8,9 @@ In the following sections, we explore general system administration options with

Here's an overview of the different administration guides:

- [**Configuration**](configuration): Understand how Skytable can be configured using command-line arguments, environment variables or a configuration file and what all configuration options are available
- [**User management**](user-management): Learn about account types, permissions and how you can manage multiple users
- [**Configuration**](configuration): Configuration modes (CLI, environment variables, configuration files) and options
- [**User management**](user-management): Account types, permissions, creating and managing multiple users
- [**Global management**](global-management): Learn how to check system health and manage the global state of your database instances
- [**Operations**](operations): Understand administrator operations tasks such as backups, recovery and more
- [**Disk usage**](disk-usage): Understand disk usage and compaction
- [**Backup and restore**](backup-and-restore): Backing up data and restoring data from backups
- [**Data recovery**](recovery): Understanding data loss, mitigation and recovery options
4 changes: 2 additions & 2 deletions docusaurus.config.js
@@ -163,8 +163,8 @@ module.exports = {
to: '/protocol/specification'
},
{
from: '/system/recovery',
to: '/system/operations#data-recovery'
from: '/system/operations',
to: '/system',
}
]
}]
4 changes: 3 additions & 1 deletion sidebars.ts
@@ -27,7 +27,9 @@ module.exports = {
"system/configuration",
"system/user-management",
"system/global-management",
"system/operations",
"system/disk-usage",
"system/backup-and-restore",
"system/recovery",
],
link: {
type: 'doc',