diff --git a/Cargo.lock b/Cargo.lock index a05bf96..7057f43 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -1836,7 +1836,7 @@ dependencies = [ [[package]] name = "river" -version = "0.2.1" +version = "0.5.0" dependencies = [ "async-trait", "cidr", diff --git a/README.md b/README.md index 5d68c5a..1f3b51c 100644 --- a/README.md +++ b/README.md @@ -4,32 +4,10 @@ ## Current State -We reached the [initial v0.2.0 release] at the end of April (and a small [v0.2.1 release] -for crates.io availability in May), completing the work in [Kickstart Spike 1]. +River is currently v0.5.0. See the [v0.5.0 release notes] for more details on recently +added features. -As of the end of May, work towards the next features in [Kickstart Spike 2] has begun. - -The next work is focused on: - -1. Development of "multiple upstream" features, including: - * Supporting Load Balancing of upstream servers - * Supporting Health Checks of upstream servers - * Supporting Service Discovery of upstream servers -2. Developer and Operator Quality of Life features, including: - * Supporting basic static HTML file serving - * Supporting semi-dynamic observability endpoints, e.g. for Prometheus polling - * Support for hot-reloading of configuration - * CI for build and test checks on pull requests -3. Development of initial Robustness features, including: - * Rate limiting of connections and/or requests - * CIDR/API range-based filtering for rejecting connections - -Stay tuned for updates on these features! - -[initial v0.2.0 release]: https://github.com/memorysafety/river/releases/tag/v0.2.0 -[v0.2.1 release]: https://github.com/memorysafety/river/releases/tag/v0.2.1 -[Kickstart Spike 1]: https://github.com/memorysafety/river/milestone/1 -[Kickstart Spike 2]: https://github.com/memorysafety/river/milestone/3 +[v0.5.0 release notes]: https://github.com/memorysafety/river/blob/main/docs/release-notes/2024-08-30-v0.5.0.md **Until further notice, there is no expectation of stability.** @@ -39,11 +17,11 @@ At the moment, `river` can be invoked from the command line. See `--help` for all options. Configuration is currently done exclusively via configuration file. See -[`test-config.toml`] for an example configuration file. Additionally, see -[`toml-configuration.md`] for more configuration details. +[`test-config.kdl`] for an example configuration file. Additionally, see +[kdl configuration] for more configuration details. -[`test-config.toml`]: ./source/river/assets/test-config.toml -[`toml-configuration.md`]: ./docs/toml-configuration.md +[`test-config.kdl`]: ./source/river/assets/test-config.kdl +[kdl configuration]: https://onevariable.com/river-user-manual/config/kdl.html ## License diff --git a/docs/release-notes/2024-08-30-v0.5.0.md b/docs/release-notes/2024-08-30-v0.5.0.md new file mode 100644 index 0000000..b314a00 --- /dev/null +++ b/docs/release-notes/2024-08-30-v0.5.0.md @@ -0,0 +1,448 @@ +# River v0.5.0 + +River is a reverse proxy application written in Rust, supported as a [Prossimo Initiative], +built on top of the [Pingora engine] from Cloudflare. River began development in Q1 2024, +and is currently in an early developer preview state. + +[Prossimo Initiative]: https://www.memorysafety.org/initiative/reverse-proxy/ +[Pingora engine]: https://github.com/cloudflare/pingora + +This announcement covers the v0.5.0 release of River, the second development release made +public. This release was focused on initial development of core functionality necessary +for evaluation of River for early experimental users. This release was led by +[OneVariable UG], and sponsored by [Prossimo][Prossimo Initiative]. + +[OneVariable UG]: https://onevariable.com/ + +For installation instructions, see [the release details on GitHub]. This release includes +binaries for x86_64 Linux (GNU and MUSL builds), as well as a release for aarch64 MacOS +(M-series Macs). Binaries are provided, as well as an installation script. Release tooling +is provided by [`cargo-dist`]. + +[the release details on GitHub]: https://github.com/memorysafety/river/releases/tag/v0.5.0 +[`cargo-dist`]: https://github.com/axodotdev/cargo-dist + +## Note on versioning + +Initially, it was intended to ship three releases for this spike of work, according to +our [published roadmap]. However, as development of these features didn't end up being +linear, we instead skipped releases v0.3.0 and v0.4.0 in order to ship these all of the +included features in v0.5.0. + +Release v0.5.0 is being made directly after v0.2.x releases to better communicate the +level of readiness and to remain consistent with the roadmap. + +[published roadmap]: https://github.com/memorysafety/river/blob/main/docs/roadmap.md + +## Notable features + +The following are the notable features included in the v0.5.0 release. For a full list of +changes, please refer to [the release details on GitHub]. + +This second milestone was about extending core functionality of River, to move it closer +to a state where users may begin experimentation and evaluation. + +Primary objectives included: + +1. Development of "multiple upstream" features, including: + * Supporting Load Balancing of upstream servers + * Supporting Health Checks of upstream servers + * Supporting Service Discovery of upstream servers +2. Developer and Operator Quality of Life features, including: + * Supporting basic static HTML file serving + * Supporting semi-dynamic observability endpoints, e.g. for Prometheus polling + * Support for hot-reloading of configuration + * CI for build and test checks on pull requests +3. Development of initial Robustness features, including: + * Rate limiting of connections and/or requests + * CIDR/API range-based filtering for rejecting connections + +### Development of "multiple upstream" features + +The v0.2.0 release shipped with support for only a single upstream. This meant that it was +not possible to perform "load balancing" responsibilities, proxying downstream requests to +multiple upstream servers. + +The v0.5.0 release has added numerous features that now allow for the ability to support +multiple upstreams for a single service, laying the foundation necessary for properly +supporting this use case. + +Many of these features are based on capabilities from the [`pingora_load_balancing`] crate. + +[`pingora_load_balancing`]: https://docs.rs/pingora-load-balancing/latest/pingora_load_balancing/ + +#### Supporting Load Balancing of upstream servers + +River now supports a number of strategies for selecting an upstream server when multiple are +provided. These currently include: + +* **Round Robin** - the upstream servers are selected on a rotating basis, proxying an even number + of requests to each upstream server +* **Random** - An upstream server is selected at random, proxying a statistically even number of + requests to each upstream server (over a long enough operational time) +* **FNV Hashing** - FNV hashing uses the [Fowler-Noll-Vo Hash Function], a simple and fast + non-cryptographic hash on a component of the request, to select an upstream server +* **Ketama Hashing** - Ketama hashing is a consistent hash used in other services such as NGINX + and `memcached`. It is intended to reduce cache misses when upstream servers are added or + removed + +[Fowler-Noll-Vo Hash Function]: https://en.wikipedia.org/wiki/Fowler%E2%80%93Noll%E2%80%93Vo_hash_function + +For the hashing based load balancing strategies, two data sources are currently supported: + +* The URI path of the request +* The Source IP address and URI path of the request, combined + +In the future, River intends to support additional Load Balancing Strategies, and additional +data sources for hashing, allowing for greater user configuration and flexibility. Please feel +free to [open an issue] if you have requests or suggestions for strategies or data sources +you would like to see implemented! + +[open an issue]: https://github.com/memorysafety/river/issues/new/choose + +#### Supporting Service Discovery of upstream servers + +For deployments where upstream servers may come and go, for example when scaling in response +to load, it is necessary to support dynamically discovering what servers are available. + +Infrastructure from [`pingora_load_balancing`] was introduced during this spike, in order to +support this in the future. At the moment, only the "static" discovery strategy is supported, +which enabled the ability to specify multiple servers at configuration and startup time. + +In the future, it is planned to support dynamically discovered servers, using techniques such +as [DNS Service Discovery]. Please feel free to [open an issue] if you have requests or +suggestions for service discovery strategies you would like to see implemented! + +Full implementation for Service Discovery is currently planned on our [published roadmap] for +the v0.7.x release. + +[DNS Service Discovery]: http://www.dns-sd.org/ + +#### Supporting Health Checks of upstream servers + +As part of Service Discovery, it is necessary to monitor whether the current list of upstream +servers is capable of accepting proxied requests. The infrastructure for supporting these +capabilities were also introduced in this release, provided by the [`pingora_load_balancing`] +crate. + +Health Checks include active observation, ensuring that we can continue to establish connections +with potential upstream servers, as well as querying health check specific endpoints to allow +upstream servers to report their current status. + +Full implementation for Health Checks is currently planned on our [published roadmap] for +the v0.7.x release. + +### Developer and Operator Quality of Life features + +Effort was spent in the v0.5.x release to make River more pleasant to develop and use for +operators of River. These features are outside the typical proxying responsibilities of +River, but assist in the usage and operation of River as an application. + +#### Supporting basic static HTML file serving + +Many reverse proxy applications also allow for serving of static files, such as images, +HTML files, CSS files, or other non-dynamic content. Although River does not currently +consider this a "core competency", it is a common expectation of users of reverse proxies +such as NGINX. + +In v0.5.0, River introduced the ability to serve local static files, which was made possible +by integrating components from the [Pandora Web Server] project, developed by +[Wladimir Palant (@palant)]. + +Users may select a base filesystem path, where all contents of the path will be available +when requested. + +[Pandora Web Server]: https://github.com/pandora-web-server/pandora-web-server +[Wladimir Palant (@palant)]: https://github.com/palant + +#### Support for hot-reloading of configuration + +Outside of the planned support for Service Discovery, River has generally made the choice +that configuration should be static and not modifyable at runtime. However, it is often +necessary to change configuration of a reverse proxy, without observable downtime from +the perspective of downstream clients. + +To support this, the v0.5.0 release of River has added the ability to hot reload River, +meaning that a second instance of the application can be launched, which can take over +existing services and listening sockets, meaning that after the switchover begins, all +new connections will be served by the new instance, and existing connections will be +given a grace period to complete before the connection is terminated. + +For more information on how to perform a hot reload, please refer to the [Hot Reloading Section] +of the [River User Manual]. + +[Hot Reloading Section]: https://onevariable.com/river-user-manual/reloading.html +[River User Manual]: https://onevariable.com/river-user-manual/ + +#### (Unplanned) Adoption of the `KDL` language for configuration + +As this release introduced a number of new configuration options, it became apparent that +specifying deeply structured configuration became overly cumbersome in the existing TOML +configuration file format. + +This implementation experimented with, and has now committed to supporting the +[KDL Document Language] as the primary configuration language for River. Although KDL is +not currently widely used, it allows for very expressive and intuitive structuring of data, +and has a structure that is familiar to other configuration formats such as NGINX's +configuration files. + +[KDL Document Language]: https://kdl.dev/ + +##### Comparison of KDL and TOML + +As a direct comparison, here is an example of the previous TOML file format: + +```toml +[system] +threads-per-service = 8 + +[[basic-proxy]] +name = "Example1" + [[basic-proxy.listeners]] + [basic-proxy.listeners.source] + kind = "Tcp" + [basic-proxy.listeners.source.value] + addr = "0.0.0.0:8080" + + [[basic-proxy.listeners]] + [basic-proxy.listeners.source] + kind = "Tcp" + [basic-proxy.listeners.source.value] + addr = "0.0.0.0:4443" + [basic-proxy.listeners.source.value.tls] + cert_path = "./assets/test.crt" + key_path = "./assets/test.key" + + [basic-proxy.connector] + proxy_addr = "91.107.223.4:443" + tls_sni = "onevariable.com" + + [basic-proxy.path-control] + upstream-request-filters = [ + { kind = "remove-header-key-regex", pattern = ".*(secret|SECRET).*" }, + { kind = "upsert-header", key = "x-proxy-friend", value = "river" }, + ] + + upstream-response-filters = [ + { kind = "remove-header-key-regex", pattern = ".*ETag.*" }, + { kind = "upsert-header", key = "x-with-love-from", value = "river" }, + ] + +[[basic-proxy]] +name = "Example2" +listeners = [ + { source = { kind = "Tcp", value = { addr = "0.0.0.0:8000" } } } +] +connector = { proxy_addr = "91.107.223.4:80" } +``` + +Whereas in KDL, it looks like this: + +```kdl +system { + threads-per-service 8 +} + +services { + Example1 { + listeners { + "0.0.0.0:8080" + "0.0.0.0:4443" cert-path="./assets/test.crt" key-path="./assets/test.key" + "0.0.0.0:8443" cert-path="./assets/test.crt" key-path="./assets/test.key" + } + + connectors { + "91.107.223.4:443" tls-sni="onevariable.com" + } + + path-control { + upstream-request { + filter kind="remove-header-key-regex" pattern=".*SECRET.*" + filter kind="remove-header-key-regex" pattern=".*secret.*" + filter kind="upsert-header" key="x-proxy-friend" value="river" + } + upstream-response { + filter kind="remove-header-key-regex" pattern=".*ETag.*" + filter kind="upsert-header" key="x-with-love-from" value="river" + } + } + } + + Example2 { + listeners { + "0.0.0.0:8000" + } + connectors { + "91.107.223.4:80" + } + } +} +``` + +##### Diagnostics + +Additionally, work as been made to enable helpful diagnostics for the KDL file support, allowing for +helpful feedback when configuration mistakes are made. For example, if a user mis-spells `health-check` as +`health-cheque`, they will see an error that looks like this: + +``` +Error rendering config from KDL file: × Incorrect configuration contents + ╭─[83:1] + 83 │ discovery "Static" + 84 │ health-cheque "None" + · ──────────┬───────── + · ╰── incorrect + 85 │ } + ╰──── + help: Unknown setting: 'health-cheque' +``` + +##### Planned Deprecation of TOML support + +At the moment, River will continue to support both TOML and KDL configuration, however many new features +included in this release cannot be specified in the TOML format. + +It is planned to deprecate TOML configuration file support in 0.6.x releases, leaving KDL as the only +"human oriented" configuration language, though we may support other formats such as JSON in the future, +to support machine-generated configuration inputs. + +#### (Unplanned) Initial version of the River User Manual + +When sharing River development progress with people for feedback, it was realized that River was missing +documentation oriented towards users and operators of River, rather than developers of River. + +Documentation was added as part of the v0.5.0 release, published as the [River User Manual]. This +includes specification of core concepts, instructions for operational concerns such as hot reloading, +and a specification of configuration file formats. + +The [River User Manual] is temporarily hosted on the OneVariable domain, but will likely move to a +more permanent home in the future. + +#### (Delayed) Supporting semi-dynamic observability endpoints, e.g. for Prometheus polling + +It was planned to support semi-dynamic pages for performance counters and logs in this release. + +This feature was delayed until a further release. + +#### CI for build and test checks on pull requests + +Basic CI testing was added as part of this release. Formatting, build checks, unit tests, and runtime validation +of example configuration files are now all tested on every pull request and commit to the `main` branch. + +### Development of initial Robustness features + +The 0.5.0 release also added support for "robustness" features, typically around refusal of unwanted requests. + +These features are necessary when operating in adverse environments. + +#### Rate limiting of connections and/or requests + +This release introduced the concept of rate limiting based on configurable settings, as well as +selectable rules. + +River now supports configuration of rate limiting, as in the following example: + +```kdl +// Apply Rate limiting to this service +// +// Note that ALL rules are applied, and a request must receive a token from all +// applicable rules. +// +// For example: +// +// A request to URI `/index.html` from IP 1.2.3.4 will only need to get a token from +// the `source-ip` rule. +// +// A request to URI `/static/style.css` from IP 1.2.3.4 will need to get a token from +// BOTH the `source-ip` rule (from the `1.2.3.4` bucket), AND the `specific-uri` rule +// (from the `/static/style.css` bucket) +rate-limiting { + // This rate limiting rule is based on the source IP address + // + // * Up to the last 4000 IP addresses will be remembered + // * Each IP address can make a burst of 10 requests + // * The bucket for each IP will refill at a rate of 1 request per 10 milliseconds + rule kind="source-ip" \ + max-buckets=4000 tokens-per-bucket=10 refill-qty=1 refill-rate-ms=10 + + // This rate limiting is based on the specific URI path + // + // * Up to the last 2000 URI paths will be remembered + // * Each URI path can make a burst of 20 requests + // * The bucket for each URI will refill at a rate of 5 requests per 1 millisecond + rule kind="specific-uri" pattern="static/.*" \ + max-buckets=2000 tokens-per-bucket=20 refill-qty=5 refill-rate-ms=1 + + // This rate limiting is based on ANY URI paths that match the pattern + // + // * A single bucket will be used for all URIs that match the pattern + // * We allow a burst of up to 50 requests for any MP4 files + // * The bucket for all MP4 files will refill at a rate of 2 requests per 3 milliseconds + rule kind="any-matching-uri" pattern=r".*\.mp4" \ + tokens-per-bucket=50 refill-qty=2 refill-rate-ms=3 +} +``` + +For more details regarding implementation details and capabilities, refer to the [rate limiting] +section of the [River User Manual]. + +[rate limiting]: https://onevariable.com/river-user-manual/config/kdl.html#servicesnamerate-limiting + +#### CIDR range-based filtering for rejecting connections + +River now supports rejection of requests from entire CIDR ranges, specified using slash +notation, such as `192.168.0.0/16`. This has been implemented as a `path-control` filter +during the early `request-filter` stage, meaning that these requests can be rejected +extremely early in the request handling stage. + +These may be specified as in this example: + +```kdl +path-control { + request-filters { + filter kind="block-cidr-range" addrs="192.168.0.0/16, 10.0.0.0/8, 2001:0db8::0/32" + } + // ... +} +``` + +### Additional unplanned features landing in v0.5.0 + +We had one more miscellaneous feature that landed in this release that is worth mentioning: + +#### Support for HTTP2 connections + +Previously, we did not expose a way to specify whether HTTP1.0 or HTTP2.0 would be used for +either downstream or upstream connections. This meant that all requests were only handled +as HTTP1.0 connections. + +Downstream listeners can now be configured to offer HTTP2 support if clients would like to +upgrade: + +```kdl +listeners { + "0.0.0.0:8080" + "0.0.0.0:4443" cert-path="./assets/test.crt" key-path="./assets/test.key" offer-h2=true +} +``` + +Additionally, upstream connections can be set with any of the following options: + +* `h1-only`: Only HTTP1.0 will be used to connect +* `h2-only`: Only HTTP2.0 will be used to connect +* `h2-or-h1`: HTTP2.0 will be preferred, with fallback to HTTP1.0 + +For example: + +```kdl +connectors { + "91.107.223.4:443" tls-sni="onevariable.com" proto="h2-or-h1" +} +``` + +## Contributor thanks + +We'd like to thank [@branlwyd] for their first contribution to River in this release, +as well for their role in advising during these efforts. + +[@branlwyd]: https://github.com/branlwyd diff --git a/source/river/Cargo.toml b/source/river/Cargo.toml index f19e321..6d436f8 100644 --- a/source/river/Cargo.toml +++ b/source/river/Cargo.toml @@ -1,6 +1,6 @@ [package] name = "river" -version = "0.2.1" +version = "0.5.0" authors = [ "James Munns ",