diff --git a/docs/developer-docs/smart-contracts/test/reproducible-builds.mdx b/docs/developer-docs/smart-contracts/test/reproducible-builds.mdx index 5ce03ab2a3..6d09213129 100644 --- a/docs/developer-docs/smart-contracts/test/reproducible-builds.mdx +++ b/docs/developer-docs/smart-contracts/test/reproducible-builds.mdx @@ -11,79 +11,95 @@ import { GlossaryTooltip } from "/src/components/Tooltip/GlossaryTooltip"; ## Overview -Thanks to its consensus protocol, the Internet Computer always runs canister code correctly. But this doesn’t mean that it’s running the **correct** code for a canister. If you are using a canister somebody else developed, you may want to verify that the canister is indeed running the intended code before giving it control to make important decision for you, e.g., sending your ICP to another canister. Verifying this requires answering two questions: -1. Which WebAssembly (Wasm) code is being executed for a canister? -2. The canisters are normally written in a higher-level language, such as Motoko or Rust, and not directly in Wasm. The second question is then: is the Wasm that’s running really the result of compiling the purported source code? +If you are using a canister that you did not develop yourself, you may want to verify that the canister is running the code that you expect it to be before giving it control to make important decisions for you, such as accepting ICP for a payment. Verifying a canister's code requires confirming that the Wasm module is the correct result of compiling the canister source code and that the canister is in fact running that Wasm module and not another Wasm module. -The rest of the document answers these questions. For the first question, you will see how the Internet Computer provides information about canister code. To be able to answer the second question, canister authors should ensure trusted and [reproducible builds](https://reproducible-builds.org/docs/definition/) of the Wasm code from the source. Such builds allow anyone to follow the same steps as the canister authors and yield the exact same Wasm code, which can then be compared to the Wasm code executing in the canister on the Internet Computer. +It is therefore recommended that canister authors ensure trusted and [reproducible builds](https://reproducible-builds.org/docs/definition/) of the Wasm module are possible from their source code. Such builds allow anyone to follow the same steps as the canister authors and yield the exact same Wasm module, which can then be compared to the Wasm module executing on ICP. -The impatient reader who is familiar with the topic of reproducible builds can skip straight to the [conclusion](#repro-build-summary). +## Considerations for developers -## Finding out which canister code the Internet Computer is executing +When developing code meant to be reproducible, it is recommended to use [Docker or a similar technology](#build-environments-using-docker) to conveniently set up the operating system and the build tools and fix their versions for the user. If the build tools you are using don’t guarantee fully reproducible builds, Docker can also help by minimizing the differences in paths, environment variables, etc. The build tools and the base Docker image should be sourced from somewhere that the user can trust. Ideally, you want to minimize the number of dependencies, as the user may have to rebuild all of your dependencies to properly reproduce your build. -Internet Computer does not allow you to access the Wasm code of an arbitrary canister. This is a design decision, as developers might want to keep some code private. However, the Internet Computer does allow you to access the SHA-256 of the Wasm code of a canister. +However, build tools aren’t perfect, and may fail to ensure reproducible builds. If reproducibility is critical for your canister (e.g., it holds other users' funds), [test it](#testing-reproducibility). -To obtain this hash, you must first note the principal of the Internet Computer canister whose code you want to check. For example, assume we’re interested in the code of the Internet Identity canister, whose principal is `rdmx6-jaaaa-aaaaa-aaadq-cai`. Then, the easiest way to access this data is using the [IC SDK](docs/current/developer-docs/getting-started/install/) from the terminal. Open your terminal, and run: +## Obtaining the Wasm module hash -``` +ICP does not allow you to access the Wasm module of an arbitrary canister. This is a design decision, as developers might want to keep some code private. However, ICP does allow anyone to access the SHA-256 of the Wasm module. + +To obtain this hash, you must first note the principal of the canister whose code you want to check. For example, to check the code of the Internet Identity canister, the principal is `rdmx6-jaaaa-aaaaa-aaadq-cai`. Then, the easiest way to access this data is using [`dfx`](docs/current/developer-docs/getting-started/install/) from a command line using the following command: + +```bash dfx canister --network ic info rdmx6-jaaaa-aaaaa-aaadq-cai +``` + +This will return the controller(s) of the canister and the Wasm module hash: + +``` Controllers: r7inp-6aaaa-aaaaa-aaabq-cai -Module hash: 0xa9a5d0492d1455d6b6b84fdc102fb182f55975145a2f99a403eb10ea7e37aee6 +Module hash: 0x86ab08ea53e4da5bba4f27baa931e2edc5ab2a1f228c204a5340992c16389f66 ``` -If you are running an older version of the IC SDK, you will need to run this command from a directory that contains a valid `dfx.json` file. If you don’t have such a directory, you can create it using `dfx new`. +You will need to run this command from a directory that contains a valid `dfx.json` file. If you don’t have such a directory, you can create one using `dfx new`. -Here, the Internet Computer tells us that the hash of the Wasm module of the `rdmx6-jaaaa-aaaaa-aaadq-cai` canister (which happens to be the Internet Identity canister) is `0xa9a5d0492d1455d6b6b84fdc102fb182f55975145a2f99a403eb10ea7e37aee6`. +Here, ICP tells us that the hash of the Wasm module of the `rdmx6-jaaaa-aaaaa-aaadq-cai` canister is `0x86ab08ea53e4da5bba4f27baa931e2edc5ab2a1f228c204a5340992c16389f66`. -The check above provides you the **current** hash of the canister’s Wasm module, but the **controllers** of an Internet Computer canister may change the code at any time (e.g., to upgrade a canister). However, if the list of controllers is empty or the only controller is a [black-hole](https://github.com/ninegua/ic-blackhole) canister, you know that the canister is immutable since nobody has the power to change the code. +The check above provides you the **current** hash of the canister’s Wasm module, but the **controllers** of a canister may change the code at any time, such as to upgrade the canister. However, if the list of controllers is empty or the only controller is a [blackhole](https://github.com/ninegua/ic-blackhole) canister, you know that the canister is immutable since no one has the power to change the code. -Armed with this hash, you can next check whether it corresponds to some given source code. This only works if the build process for the code is reproducible. +## Verifying the build is reproducible -## Reproducible builds +Next, check whether this hash corresponds to some given source code. This only works if the build process for the code is reproducible. As a canister author, there are a few things you have to provide to your users to allow them to reproduce your build: -- The same source code that you used to create the Wasm module for the canister. - -- Instructions on how to recreate your build environment. +-   The same source code that you used to create the Wasm module for the canister. -- Instructions on how to repeat the process of building the Wasm from the source code. Crucially, the process must be deterministic, to ensure that it results in the exact same Wasm. It also has to be trusted, such that the user can be convinced that the Wasm is a faithful translation of the source code, and not an artifact of a malicious build tool. In particular, look for `.dfx`, `node_modules`, and `target` directories that could contain pre-built files. +-   Instructions on how to recreate your build environment. -Next, you will look at each of these points in more detail. +-   Instructions on how to repeat the process of building the Wasm from the source code. The process must be deterministic to ensure that it results in the exact same Wasm. It also has to be trusted, such that the user can be convinced that the Wasm is a faithful translation of the source code and not an artifact of a malicious build tool. In particular, look for `.dfx`, `node_modules`, and `target` directories that could contain prebuilt files. -### Providing source code +### Providing the source code -Typically, you will version your code in `git` or some other version control system, and your versioned code may be available in a public repository, e.g., GitHub. In this case, you should note the particular commit that you used when producing the code to be deployed to the Internet Computer, and communicate it to the users who are trying to verify your canister. Alternatively, you can also provide a package (e.g., a zip file or a tarball) containing the source code you used to build your canister. +Typically, it is recommended that developers version your code in `git` or some other version control system, and versioned code may be available in a public repository such as GitHub. In this case, as the developer, you should note the particular commit that was used when producing the code that has been deployed to ICP and communicate it to the users who may want to verify your canister. Alternatively, you can also provide a package (i.e., a zip file or a tarball) containing the source code you used to build your canister. ### Reproducing build environments -Before building your code, you should document the build environment you are using in detail. In particular, for the languages supported by the Internet Computer SDK: +Before building your code, you should document the build environment you are using in detail. In particular, for the languages supported by the IC SDK (Motoko, Rust, TypeScript, Python): -- Note down the operating system and its version you are using to build your canister. +-   Note your local machine's operating system and its version. -- If you are using the IC SDK, note the version used, as specified in `dfx.json`. You can install arbitrary versions of the IC SDK using `dfx toolchain install `, or by running the installation script with the `DFX_VERSION` environment variable set to the desired version. +-   If you are using the IC SDK, note the version used. You can install arbitrary versions of the IC SDK using [`dfxvm`](/docs/current/developer-docs/developer-tools/cli-tools/dfxvm/docs/cli-reference/dfxvm/dfxvm-install). -- If you are building Motoko code in a way other than `dfx build`, note the version of `moc` you are using. +-   If you are building Motoko code in a way other than `dfx build`, note the version of `moc` you are using. -- If you are building Rust, note the version of `cargo` you are using. +-   If you are building Rust, note the version of `cargo` you are using. -- If you are using Node.js and/or `webpack` for frontend development, note their versions. +- If you are building TypeScript, note the version of [Azle](https://demergent-labs.github.io/azle/). -- If your build process depends on any environment variables (such as the time zone or locale), note them down. +- If you are building Python, note the version of Python and [Kybra](https://demergent-labs.github.io/kybra/). -You should communicate all of these to your user in the instructions. Ideally, do so by providing an executable recipe to recreate the build environment, using tools such as Docker or Nix. It is recommended to use Docker, as this allows you to also pinpoint the operating system used for building the software. +-   If you are using a framework for frontend development, such as React, Vite, or webpack, note their versions. + +-   If your build process depends on any environment variables (such as the timezone or locale), note them. + +You should communicate all of these to your user in the reproducible build instructions. Ideally, provide an executable recipe or script to recreate the build environment using tools such as Docker or Nix. It is recommended to use Docker, as it allows you to also pinpoint the operating system used for building the software. ### Build environments using Docker -[Docker containers](https://docs.docker.com/) are a popular solution for providing build environments. For developers using OS X, it is recommended to install Docker using [lima](https://github.com/lima-vm/lima), as it proves more stable than Docker Desktop or Docker Machine, in particular, it avoids some QEMU bugs on Apple M1 machines. +[Docker containers](https://docs.docker.com/) are a popular solution for providing reproducible build environments. -After setting Docker up, you can use a `Dockerfile` such as the following to provide the user with a particular version of the operating system, as well as `dfx`, Node.js and the Rust toolchain. Make sure to stick with `x86_64` for running the Docker container, as builds are generally not reproducible across architectures. See [docs](https://github.com/lima-vm/lima/blob/master/docs/multi-arch.md) on setting up cross-platform Docker containers in case your host environment is not `x86_64`. +:::info +For developers using macOS, it is recommended to install Docker using [lima](https://github.com/lima-vm/lima), as it proves more stable than Docker Desktop or Docker Machine. In particular, it avoids some QEMU bugs on Apple M1 machines. +::: -This Dockerfile is an example of how a Rust build environment can be standardized using Docker. +After setting Docker up, you can use a `Dockerfile` to provide the user with a particular version of the operating system, as well as other necessary tooling, such as `dfx`, Node.js, and the Rust toolchain. -#### Example Dockerfile -```dockerfile +:::caution +It is advised to stick with `x86_64` for running the Docker container, as builds are generally not reproducible across architectures. See the [docs](https://github.com/lima-vm/lima/blob/master/docs/multi-arch.md) on setting up cross-platform Docker containers in case your host environment is not `x86_64`. +::: + +Below is an example Dockerfile that creates a standardized Rust build environment. + +```dockerfile title="Dockerfile" FROM ubuntu:22.04 ENV NVM_DIR=/root/.nvm @@ -94,12 +110,12 @@ ENV RUSTUP_HOME=/opt/rustup ENV CARGO_HOME=/opt/cargo ENV RUST_VERSION=1.62.0 -ENV DFX_VERSION=0.14.1 +ENV DFX_VERSION=0.23.0 -# Install a basic environment needed for our build tools +# Install the basic environment needed for our build tools. RUN apt -yq update && \ - apt -yqq install --no-install-recommends curl ca-certificates \ - build-essential pkg-config libssl-dev llvm-dev liblmdb-dev clang cmake rsync +    apt -yqq install --no-install-recommends curl ca-certificates \ +        build-essential pkg-config libssl-dev llvm-dev liblmdb-dev clang cmake rsync # Install Node.js using nvm ENV PATH="/root/.nvm/versions/node/v${NODE_VERSION}/bin:${PATH}" @@ -111,10 +127,10 @@ RUN . "${NVM_DIR}/nvm.sh" && nvm alias default v${NODE_VERSION} # Install Rust and Cargo ENV PATH=/opt/cargo/bin:${PATH} RUN curl --fail https://sh.rustup.rs -sSf \ - | sh -s -- -y --default-toolchain ${RUST_VERSION}-x86_64-unknown-linux-gnu --no-modify-path && \ - rustup default ${RUST_VERSION}-x86_64-unknown-linux-gnu && \ - rustup target add wasm32-unknown-unknown &&\ - cargo install ic-wasm +        | sh -s -- -y --default-toolchain ${RUST_VERSION}-x86_64-unknown-linux-gnu --no-modify-path && \ +    rustup default ${RUST_VERSION}-x86_64-unknown-linux-gnu && \ +    rustup target add wasm32-unknown-unknown &&\ +    cargo install ic-wasm # Install dfx RUN sh -ci "$(curl -fsSL https://internetcomputer.org/install.sh)" @@ -122,29 +138,43 @@ RUN sh -ci "$(curl -fsSL https://internetcomputer.org/install.sh)" COPY . /canister WORKDIR /canister ``` + #### What this does + There are a couple of things worth noting about this `Dockerfile`: -- It starts from an official Docker image. Furthermore, all the installed tools are standard, and come from standard sources. This provides the user with confidence that the build environment hasn’t been tampered with, and thus that the build process using Docker can be trusted. +-   It starts from an official Docker image, such that all the installed tools are standard and come from standard sources. This provides the user with confidence that the build environment hasn’t been tampered with, and thus that the build process using Docker can be trusted. -- To ensure that specific versions of the build tools are installed, it installs them directly, rather than through `apt` (the package manager of Ubuntu, the Linux distribution running inside of the container). Such package managers usually don’t provide a way of pinning the build tools to specific versions. +-   To ensure that specific versions of the build tools are installed, it installs them directly rather than through a package manager. Package managers usually don’t provide a way of pinning the build tools to specific versions. To use this `Dockerfile`, get Docker [up and running](https://docs.docker.com), place the `Dockerfile` in the project directory of your canister, and create the Docker container by running: -``` + +```bash docker build -t mycanister . ``` -This creates a Docker container image called `mycanister`, with Node.js, Rust and `dfx` installed in it, and your canister source code copied to `/canister` (recall that you should invoke `docker build` from the canister project directory). You can then enter an interactive shell inside of your container by running: -``` + +This creates a Docker container image called `mycanister`, with Node.js, Rust, and `dfx` installed in it. Your canister source code will be copied to the directory `/canister` You should invoke `docker build` from the canister project directory. You can then enter an interactive shell inside of your Docker container by running: + +```bash docker run -it --rm mycanister ``` -From here, you can experiment with the steps needed to build your canister. Once you are confident that the steps are deterministic, you can also put them in the `Dockerfile` (e.g., as `RUN ./build_script.sh` for a build script such as the example below, saved as `./build_script.sh`), to allow the user to automatically reproduce the build of your canister. You can see an example in the [Dockerfile of the Internet Identity canister](https://github.com/dfinity/internet-identity/blob/397d0087a29855564c47f0fd3323f60b5b67a8fa/Dockerfile). -#### Example script for building a Rust project -An example script to build a Rust project looks as follows: +From here, you can experiment with the steps needed to build your canister. Once you are confident that the steps are deterministic, you can also put them in the `Dockerfile`. For example, to run a build script, add the line: + +```dockerfile +RUN ./build_script.sh +``` + +You can see an example in the [Dockerfile of the Internet Identity canister](https://github.com/dfinity/internet-identity/blob/397d0087a29855564c47f0fd3323f60b5b67a8fa/Dockerfile). + +### Example build script + +Below is an example script that can be used to programmatically build a Rust project: :::info -This is an example build script for a Rust project, and does not include a fully comprehensive example of what a build script may contain, nor does it include build instructions for other languages such as Motoko. +This is an example build script for a Rust project and does not include a fully comprehensive example of what a build script may contain, nor does it include build instructions for other languages such as Motoko. ::: + ```bash #!/bin/bash # @@ -155,57 +185,74 @@ export RUSTFLAGS="--remap-path-prefix $(readlink -f $(dirname ${0}))=/build --re cargo build --locked --target wasm32-unknown-unknown --release ic-wasm target/wasm32-unknown-unknown/release/example_backend.wasm -o example_backend.wasm shrink ``` -A build script such as the one above can be specified as a custom build script in `dfx.json` as follows: -``` + +A build script such as the one above can be specified as a custom build script in `dfx.json`: + +```json title="dfx.json" "canisters": { - "example_backend": { - "candid": "src/example_backend/example_backend.did", - "package": "example_backend", - "type": "custom", - "wasm": "./example_backend.wasm", - "build": "./build_script.sh" - } +  "example_backend": { +    "candid": "src/example_backend/example_backend.did", +    "package": "example_backend", +    "type": "custom", +    "wasm": "./example_backend.wasm", +    "build": "./build_script.sh" +  } } ``` ### Ensuring the determinism of the build process -Next, investigate what is necessary to make the build deterministic. For the build process to be deterministic: -- #### Step 1: You will need to ensure that any dependencies of your canister are always resolved in the same way. Most build tools now support a way of pinning dependencies to a particular version. +Next, consider if it is necessary to make the build deterministic. For the build process to be deterministic: + +- #### Step 1: You will need to ensure that any dependencies of your canister are always resolved in the same way. Most build tools support a way of pinning dependencies to a particular version. + +    -   For `npm`, running `npm install` will create a `package-lock.json` file with some fixed versions of all (transitive) dependencies of your project that satisfy the requirements specified in your `package.json`. However, `npm install` will overwrite the `package-lock.json` file every time it is invoked. Thus, once you are ready to create the final version of your canister, run `npm install` only once. After that, commit `package-lock.json` to your version control system. Finally, when checking the build for reproducibility, use `npm ci` instead of `npm install`. + +    -   For Rust code, Cargo will automatically generate a `Cargo.lock` file with the fixed versions of your (transitive) dependencies. Like with `package-lock.json`, you should commit this file to your version control system once you are ready to produce the final version of your canister. Furthermore, Cargo by default ignores the locked versions of dependencies. Pass the `--locked` flag to the `cargo` command to ensure that the locked dependencies are used. + +- #### Step 2: Your own build scripts must not introduce non-determinism. + +Sources of non-determinism include randomness, timestamps, concurrency, or code obfuscators. Less obvious sources include locales, absolute file paths, the order of files in a directory, and remote URLs whose content can change. Furthermore, relying on third-party build plug-ins exposes you to any non-determinism introduced by these. + +- #### Step 3. Given the same dependencies and deterministic build scripts, the build tools themselves (`moc` for Motoko, `cargo` for Rust, `npx` for TypeScript, `pip` for Python, or `webpack` for frontend development) must also be deterministic. + +These tools aim to be deterministic. However, they are complicated pieces of software, and ensuring determinism is non-trivial. Thus, non-determinism bugs can and do occur. + +#### Motoko deterministic considerations - - For `npm`, running `npm install` will create a `package-lock.json` file with some fixed versions of all (transitive) dependencies of your project that satisfy the requirements specified in your `package.json`. However, `npm install` will overwrite the `package-lock.json` file every time it is invoked. Thus, once you are ready to create the final version of your canister, run `npm install` only once. After that, commit `package-lock.json` to your version control system. Finally, when checking the build for reproducibility, use `npm ci` instead of `npm install`. +The Motoko compiler aims to be deterministic and reproducible; if you find reproducibility issues, please submit a [new issue](https://github.com/dfinity/motoko/issues/new/choose), and the team will try to address them to the extent possible. - - For Rust code, Cargo will automatically generate a `Cargo.lock` file with the fixed versions of your (transitive) dependencies. Like with `package-lock.json`, you should commit this file to your version control system once you are ready to produce the final version of your canister. Furthermore, Cargo by default ignores the locked versions of dependencies. Pass the `--locked` flag to the `cargo` command to ensure that the locked dependencies are used. +#### Rust deterministic considerations - - You have to allocate canister IDs in advance, as canisters refer to each other by their IDs. +For Rust, see the [list of current potential non-determinism issues in Rust](https://github.com/rust-lang/rust/labels/A-reproducibility). If you have observed differences between Rust code compiled to Wasm under Linux and macOS, it is recommended to pin the build platform and its version. -- #### Step 2: Your own build scripts must not introduce non-determinism. -Obvious sources of non-determinism include randomness, timestamps, concurrency, or code obfuscators. Less obvious sources include locales, absolute file paths, order of files in a directory, and remote URLs whose content can change. Furthermore, relying on third-party build plug-ins exposes you to any non-determinism introduced by these. +#### Webpack deterministic considerations -- #### Step 3. Given the same dependencies and deterministic build scripts, the build tools themselves (`moc` for Motoko, `cargo` for Rust, `webpack` by default for frontend development) must also be deterministic. -The good news is that all of these tools aim to be deterministic. However, they are complicated pieces of software, and ensuring determinism is non-trivial. Thus, non-determinism bugs can and do occur. For Rust, see the [list of current potential non-determinism issues in Rust](https://github.com/rust-lang/rust/labels/A-reproducibility). Furthermore, you have observed differences between Rust code compiled to Wasm under Linux and macOS, and thus it is recommended to pin the build platform and its version. For webpack, [deterministic naming of module and chunk IDs](https://webpack.js.org/configuration/optimization/) that you should use have been introduced since version 5. The Motoko compiler aims to be deterministic and reproducible; if you find reproducibility issues, please submit a [new issue](https://github.com/dfinity/motoko/issues/new/choose), and the team will try to address them to the extent possible. +For webpack, [deterministic naming of module and chunk IDs](https://webpack.js.org/configuration/optimization/) that you should use have been introduced since version 5. ### Testing reproducibility -If reproducibility is vital for your code, you should test your builds to increase your confidence in their reproducibility. Such testing is non-trivial: there have been real-world examples where non-determinism in a canister build took a month to show up! Fortunately, the **Debian Reproducible Builds** project created a tool called [reprotest](https://salsa.debian.org/reproducible-builds/reprotest), which can help you automate reproducibility tests. It tests your build by running it in two different environments that differ in characteristics such as paths, time, file order, and others, and comparing the results. To check your build with `reprotest`, add the following line to your `Dockerfile`: +If reproducibility is vital for your code, you should test the build to increase your confidence in its reproducibility. The tool [reprotest](https://salsa.debian.org/reproducible-builds/reprotest) can help you automate reproducibility tests, as it tests your build by running it in two different environments that differ in characteristics such as paths, time, file order, and others, and comparing the results. To check your build with `reprotest`, add the following line to your `Dockerfile`: + ```dockerfile RUN apt -yqq install --no-install-recommends reprotest disorderfs faketime rsync sudo wabt ``` -When using `dfx build --network ic`, you need to prebuild your frontend dependencies (e.g., by running `npm ci` before `dfx build --network ic` or by setting the custom build type in `dfx.json` and running `npm ci` in your build script) and your project directory should contain a `canister_ids.json` file containing the IDs of your canisters on the Internet Computer. An example `canister_ids.json` file looks as follows: -```json +When using `dfx build --network ic`, you need to prebuild your frontend dependencies by running `npm ci` before `dfx build --network ic` or by setting the custom build type in `dfx.json` and running `npm ci` in your build script. Your project directory should contain a `canister_ids.json` file containing the IDs of your canisters on the mainnet. Below is an example `canister_ids.json` file: + +```json title="canister_ids.json" { - "example_backend": { - "ic": "rrkah-fqaaa-aaaaa-aaaaq-cai" - }, - "example_frontend": { - "ic": "ryjl3-tyaaa-aaaaa-aaaba-cai" - } +  "example_backend": { +    "ic": "rrkah-fqaaa-aaaaa-aaaaq-cai" +  }, +  "example_frontend": { +    "ic": "ryjl3-tyaaa-aaaaa-aaaba-cai" +  } } ``` -Now, from the root directory of your canister project, you can test the reproducibility of your IC SDK builds as follows: +Now, from the root directory of your canister project, you can test the reproducibility of your build using the commands: ``` docker build -t mycanister . @@ -216,9 +263,13 @@ docker run --rm --privileged -it mycanister #### What this does - The first command builds the Docker container using the `Dockerfile` provided earlier. -- The second one opens an interactive shell (hence the `-it` flags) in the container. You can run this in privileged mode (the `--privileged` flag), as `reprotest` uses kernel modules for some build environment variations. You can also run it in non-privileged mode by excluding some of the variations; see the [reprotest manual](https://manpages.debian.org/stretch/reprotest/reprotest.1.en.html). The `--rm` flag will destroy the container after you close its shell. -- Once inside of the container, create a directory for the build artifacts and launch `reprotest` in verbose mode (the `-vv` flags). You need to give it the build command you want to run as the first argument. Here, assume that it’s `dfx build --network ic` - adjust it if you’re using a different build process. It will then run the build in two different environments. -- Finally, you need to tell `reprotest` which paths to compare at the end of the two builds. Here, compare the Wasm code for all canisters, which is found in the `.dfx/ic` directory. This workflow omits the time variation because the Rust compiler uses `jemalloc` for dynamic memory allocation and this library is not [compatible](https://github.com/wolfcw/libfaketime/issues/130) with `faketime` used by `reprotest` to implement the time variation. Nevertheless, it is encouraged that you to compare the artifacts produced by `reprotest` while manually changing your system time. +- The second command opens an interactive shell in the container, indicated by the flag `-it`. You can run this in privileged mode using the `--privileged` flag, as `reprotest` uses kernel modules for some build environment variations. You can also run it in non-privileged mode by excluding some of the variations; see the [reprotest manual](https://manpages.debian.org/stretch/reprotest/reprotest.1.en.html). The `--rm` flag will destroy the container after you close its shell. +- Once inside the container, a directory is created for the build artifacts, and `reprotest` is launched in verbose mode using the `-vv` flags. You need to give it the build command you want to run as the first argument; in this example it is `dfx build --network ic`. It will then run the build in two different environments. +- Finally, you need to tell `reprotest` which paths to compare at the end of the two builds. In this example, it compares the Wasm code for all canisters, which is found in the `.dfx/ic` directory. It is encouraged that you compare the artifacts produced by 'reprotest` while manually changing your system time. + +:::info +This workflow omits the time variation because the Rust compiler uses `jemalloc` for dynamic memory allocation, and this library is not [compatible](https://github.com/wolfcw/libfaketime/issues/130) with `faketime` used by `reprotest` to implement the time variation. +::: If the comparison doesn’t find any differences, you will see an output similar to this one: @@ -227,55 +278,38 @@ If the comparison doesn’t find any differences, you will see an output similar Reproduction successful ======================= No differences in ./.dfx/ic/canisters/*/*.wasm -6b2a15a918219138836e88e9c95f9c5d2d7b6d465df83ae05d6fd2b0f14f8a97 ./.dfx/ic/canisters/example_backend/example_backend.wasm -a047686c1d517e21d447bcd42c9394a12cdb240e06425b830c99d3a689b5ee20 ./.dfx/ic/canisters/example_frontend/assetstorage.wasm -a047686c1d517e21d447bcd42c9394a12cdb240e06425b830c99d3a689b5ee20 ./.dfx/ic/canisters/example_frontend/example_frontend.wasm +6b2a15a918219138836e88e9c95f9c5d2d7b6d465df83ae05d6fd2b0f14f8a97  ./.dfx/ic/canisters/example_backend/example_backend.wasm +a047686c1d517e21d447bcd42c9394a12cdb240e06425b830c99d3a689b5ee20  ./.dfx/ic/canisters/example_frontend/assetstorage.wasm +a047686c1d517e21d447bcd42c9394a12cdb240e06425b830c99d3a689b5ee20  ./.dfx/ic/canisters/example_frontend/example_frontend.wasm ``` -Congratulations - this is a good indicator that your build is not affected by your environment! +This is a good indicator that your build is not affected by your environment. :::info -Note that `reprotest` can’t check that your dependencies are pinned properly; use guidelines from the previous section for that. Moreover, it is recommended that you to run the container `reprotest` builds under several host operating systems and compare the results. If the comparison does find differences between the Wasm code produced in two builds, it will output a diff. You will then likely want to use the `--store-dir` flag of `reprotest` to store the outputs and the diff somewhere where you can analyze them. If you are struggling to achieve reproducibility, consider also using [DetTrace](https://github.com/dettrace/dettrace), which is a container abstraction that tries to make arbitrary builds deterministic. +Note that `reprotest` can’t check that your dependencies are pinned properly. It is recommended that you run the container `reprotest` builds under several host operating systems and compare the results. If the comparison does find differences between the Wasm code produced in two builds, it will output a diff. You will then likely want to use the `--store-dir` flag of `reprotest` to store the outputs and the diff somewhere where you can analyze them. If you are struggling to achieve reproducibility, consider also using [DetTrace](https://github.com/dettrace/dettrace), which is a container abstraction that tries to make arbitrary builds deterministic. ::: -Even after you achieve reproducibility for your builds, there are still other things to consider for the long term. - -### Long-term considerations - -Reproducibility can be more demanding if you expect your canister code to stay around for years, and stay reproducible. The biggest challenges are to ensure that your: - -- Build toolchain is still available in the future. - -- Dependencies are available. - -- Toolchain still runs and still correctly builds your dependencies. - -Distributions and package archives may drop old versions of packages, including both your toolchain and their dependencies. Web sites may go offline and URLs might stop working. Thus, it’s prudent to back up all of your toolchain and dependencies. You should consider getting involved in projects such as [Software Heritage](https://www.softwareheritage.org/), which do this on a large scale. At some later point in time, you might have to adjust your build process (e.g., by changing URLs) to ensure that your canister still builds. Even if the build changes, if it still yields the same result, your users can be confident that your canister is running the correct code. The trust argument is easier if your dependencies come from a trustworthy source, such as the Software Heritage project. - -## Conclusion - -Summarizing our recommendations for canister authors: +Finally, if your build is reproducible, you can compare the hash of the resulting Wasm code to the hash of the code that is running in a canister, which you retrieve as follows: -- Ideally, when producing the final version of your container code, use Docker or a similar technology to conveniently set up the operating system and the build tools, and fix their versions for the user. If the build tools you are using don’t guarantee fully reproducible builds, Docker can also help by minimizing the differences in paths, environment variables etc. +``` +dfx canister --network ic info +``` -- The build tools and the base Docker image should be sourced from somewhere that the user can trust. +Beware that this hash might change if the controllers upgrade the canister code. -- Rust and Motoko compilers aim to be deterministic, and thus to support reproducible builds. If you notice non-determinism, file bug reports. +## Long-term considerations -- When using NPM, ensure that you specify the exact versions of all your dependencies (commit `package_lock.json` to your git repo!). Invoke NPM using the `ci` command rather than `install` to reproduce the build. Similarly, for Rust packages commit `Cargo.lock` to your repository, and then use `cargo build --locked` when building the package. +Even after you achieve reproducibility for your builds, there are still other things to consider for the long term. -- Webpack builds should be deterministic, but obfuscators and similar tools may compromise reproducibility. Make sure you use deterministic chunk and module IDs. +Reproducibility can be more demanding if you expect your canister code to stay around for years and remain reproducible. The biggest challenges are to ensure that your: -- Build tools aren’t perfect, and may fail to ensure reproducible builds. If reproducibility is critical for your canister (e.g., it holds other users' funds), test it. Reprotest is a useful tool for this purpose. +-  Build toolchain is available in the future. -- Ideally, you want to minimize the number of dependencies, as, in order to do a full audit, the user may have to (reproducibly) rebuild all of your dependencies too. +-  Dependencies are available. -- Achieving reproducibility is harder over longer time scales, primarily as you need to ensure that a trustworthy source of your dependencies and build tools stays available. +-  Toolchain still runs and correctly builds your dependencies. -Finally, if your build is reproducible, you can compare the hash of the resulting Wasm code to the hash of the code that is running in a canister, which you retrieve as follows: +Distributions and package archives may drop old versions of packages, including both your toolchain and their dependencies. Web sites may go offline, and URLs might stop working. It’s prudent to back up all of your toolchain and dependencies. You should consider getting involved in projects such as [Software Heritage](https://www.softwareheritage.org/), which do this on a large scale. -``` -dfx canister --network ic info -``` +You might have to adjust your build process to ensure that your canister still builds. Even if the build changes, if it still yields the same result, your users can be confident that your canister is running the correct code. The trust argument is easier if your dependencies come from a trustworthy source, such as the Software Heritage project. -Beware that this hash might change if the controllers upgrade the canister code.