Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pnpm cache race condition when using different node versions between workflows #641

Open
2 of 5 tasks
allbetter-max opened this issue Dec 13, 2022 · 12 comments
Open
2 of 5 tasks
Labels
feature request New feature or request to improve the current logic needs eyes

Comments

@allbetter-max
Copy link

Description:

Same cache us being reused, regardless of node versions. This brings prolonged time for once of the operations, and cache race conditions.

As can be seen from the images attached below, same cache key is used for both, but once is using node14, while the other is using node16.

Action version:
actions/checkout@v3

Platform:

  • Ubuntu
  • macOS
  • Windows

Runner type:

  • Hosted
  • Self-hosted

Tools version:

Repro steps:

This is my Tests yaml

name: Tests

on:
  push:
    branches:
      - master
  pull_request:

jobs:
  exec:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0

      - uses: pnpm/action-setup@v2
        with:
          version: 7

      - name: Install Node.js
        uses: actions/setup-node@v3
        with:
          node-version: 14
          cache: "pnpm"

image

This is my Version yaml

name: Version

on:
  push:
    branches:
      - master

concurrency: ${{ github.workflow }}-${{ github.ref }}

jobs:
  version:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0

      - uses: pnpm/action-setup@v2
        with:
          version: 7

      - name: Install Node.js
        uses: actions/setup-node@v3
        with:
          node-version: 16
          cache: "pnpm"

image

Expected behaviour:

Node version is taken into account on the cache generation key.

Actual behaviour:

Same cache is being used, which leads to longer pnpm install steps, and cache race conditions.

@allbetter-max allbetter-max added bug Something isn't working needs triage labels Dec 13, 2022
@allbetter-max
Copy link
Author

Possibly related #286

@dsame
Copy link
Contributor

dsame commented Dec 14, 2022

Hello @allbetter-max , thank you for your input we started to investigate this issue

@dsame dsame self-assigned this Dec 14, 2022
@maximveksler
Copy link

@dsame thank you, highly appreciated.

If you are able to provide an ETA or a workaround that would be great.

@dsame
Copy link
Contributor

dsame commented Dec 15, 2022

Hello @maximveksler, can you please give us your estimations of "prolonged time" and a sample of race condiitons.

From that we have now the time used for sharing the common cache is less then downloading 2 caches and having a common pnpm cache instead of per-project node_modules is exactly that pnpm is intended for.

If you have some issues specific use cases the https://github.com/actions/cache action should be used for fine turning.

@dsame dsame added question Further information is requested and removed bug Something isn't working labels Dec 15, 2022
maximveksler added a commit to maximveksler/actions-setup-node-641 that referenced this issue Dec 15, 2022
@maximveksler
Copy link

maximveksler commented Dec 15, 2022

@dsame please see reproduction and analay of the bug,

This repo contains

2 workflows

  • "Tests" which uses node version 14
  • "Version" which uses node version 16

1 app

Called apps/demo, which has 1 dependecy on "hummus-recipe": "^2.0.1" which in turn dependes on a pacakge called muhammara which resovled to muhammara 1.10.0. This dependecy is relevant for the bug reproductions reasons explained in the following section.

pnpm why muhammara             
Legend: production dependency, optional only, dev only

demo@1.0.0 /home/m/code/pr/CoreLess/apps/demo

dependencies:
hummus-recipe 2.0.1
└── muhammara 1.10.0

Bug Reproduction

In total 3 commits were performed into this repo:

6f80832

maximveksler/actions-setup-node-641@6f80832 is just the initial setup, and is irrelevant in this context.

9891f00

maximveksler/actions-setup-node-641@9891f00 which triggers the action's cache generation.

In this commit the 2 workflows have ran,

Tests which uses node version 14, it takes 15 sec to run.

This is bacuse muhammara 1.10.0 has precompiled build for node14 (as can be seen from the build logs

In addition it generated cahce key

Cache saved with the key: node-cache-Linux-pnpm-e586cd84c730ed875535ade63acd6a4c8d3b7de38c5670fe0c6c6b9e8636[1](https://github.com/maximveksler/actions-setup-node-641/actions/runs/3703277928/jobs/6274530259#step:9:1)b6d

Version which uses node version 16, it takes 2m 55s to run.

This is because muhammara 1.10.0 does not have a precompiled build for node16 (as can be seen from the build logs

.../node_modules/muhammara install: node-pre-gyp WARN Pre-built binaries not found for muhammara@1.10.0 and node@16.18.1 (node-v93 ABI, glibc) (falling back to source compile with node-gyp) 

So the job spends the 2:55 minutes no compliing the package.

In addition it tries to cache the generated pnpm cahce, but correctly fails (because such cahce key already exists, as it was generated by Tests)

Failed to save: Unable to reserve cache with key node-cache-Linux-pnpm-e586cd84c730ed875535ade63acd6a4c8d3b7de38c5670fe0c6c6b9e86361b6d, another job may be creating this cache. More details: Cache already exists. Scope: refs/heads/master, Key: node-cache-Linux-pnpm-e586cd84c730ed875535ade63acd6a4c8d3b7de38c5670fe0c6c6b9e86361b6d, Version: 578cf7a2de7d6b93a288818f024d6aa54ad870bb629c8ef7a6b5a4f1fb065d59

3b1dff9

maximveksler/actions-setup-node-641@3b1dff9 which attempts to use the cache key `` for both workflows.

In this commit the 2 workflows have ran again,

Tests which uses node version 14, it takes 13 sec to run.

It restore the cache key from the previous run, and pnpm uses it as expected:

Cache restored from key: node-cache-Linux-pnpm-e586cd84c730ed875535ade63acd6a4c8d3b7de38c5670fe0c6c6b9e86361b6d

Run pnpm install --aggregate-output --reporter append-only --frozen-lockfile
Scope: all 2 workspace projects
Lockfile is up to date, resolution step is skipped
Progress: resolved 1, reused 0, downloaded 0, added 0
Packages: +142
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Packages are hard linked from the content-addressable store to the virtual store.
  Content-addressable store is at: /home/runner/setup-pnpm/node_modules/.bin/store/v3
  Virtual store is at:             node_modules/.pnpm
Progress: resolved 142, reused 142, downloaded 0, added 142, done

Version which uses node version 16, it takes 3m 58s to run.

Here the cache is also restored, much like the in Tests workflow

Cache restored from key: node-cache-Linux-pnpm-e586cd84c730ed875535ade63acd6a4c8d3b7de38c5670fe0c6c6b9e86361b6d

However pnpm does not finds the complied muhammara and goes to recompile it again from scratch.

Root Cause

The bug is that it finds a cache to restore, however it's not the relevant cache because not all relevant variables are taken into account upon the cache token generation.

Therefor the pnpm install phase can't use the cached compile from the previous run, and needs to do the compilation again.

I belive that a simple solution would be to add to the hashed cahce token the version of the node version and the glibc version being used on the system. This would help prevent false positive of cache hits.

@maximveksler
Copy link

To prove that Version can be cached,

Please:

  1. Delete the "Tests" workflow
  2. Invalidate the cache
  3. Run the "Version" once for cahce generation
  4. Run the "Version" workflow a second time for cahce usage.

I won't be doing that in the repo, to not confuse the reader.

@dsame
Copy link
Contributor

dsame commented Dec 19, 2022

Hello @maximveksler , thanks a lot for your explanation, now i understand the race condition but it does not seem to me it directly relates to the node version but rather it relates to the running workflows in parallel having the same lock file.

The node version plays the role only if the same lock file resolves to the binary packages that is not built, i agree.
To add the node version key seems to be a feature request, that needs to be discussed so the ETA is up to a month and it may be rejected.

Meanwhile i advise you to utilize the actions/cache that allows to set the custom cache.

@dsame dsame added feature request New feature or request to improve the current logic and removed question Further information is requested labels Dec 19, 2022
@maximveksler
Copy link

Thank you.

Meanwhile i advise you to utilize the actions/cache that allows to set the custom cache.

Sure, I'll cache externally. Are you able to suggest the best approach for caching a pnpm store as a workaround?

@dsame
Copy link
Contributor

dsame commented Dec 20, 2022

@maximveksler

This step is proven to work. It is not perfect because of hard-coded cached folder path, but it can be temporary solution until the feature request is not implemented.

      - name: cache dependencies
        uses: actions/cache@v3
        with:
          path: '/home/runner/.local/share/pnpm'
          key: 'pnpm-deps'

@maximveksler
Copy link

This step is proven to work. It is not perfect because of hard-coded cached folder path, but it can be temporary solution until the feature request is not implemented.

      - name: cache dependencies
        uses: actions/cache@v3
        with:
          path: '/home/runner/.local/share/pnpm'
          key: 'pnpm-deps'

The following setup worked for me, as pnpm is being installed using pnpm/action-setup

      - uses: pnpm/action-setup@v2
        with:
          version: 7

      - name: Install Node.js
        uses: actions/setup-node@v3
        with:
          node-version: 16

      - name: pnpm cache
        uses: actions/cache@v3
        with:
          path: "/home/runner/setup-pnpm/node_modules/.bin/store/v3"
          key: "pnpm-deps-node16"

      - run: pnpm install --aggregate-output --reporter append-only --frozen-lockfile

@maximveksler
Copy link

Even better

    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0

      - name: Install Node.js
        id: setup-node
        uses: actions/setup-node@v3
        with:
          node-version: 16

      - name: Setup pnpm
        uses: pnpm/action-setup@v2
        with:
          version: 7

      - id: pnpm-cache-path
        run: |
          echo "STORE_PATH=$(pnpm store path)" >> $GITHUB_OUTPUT

      - name: pnpm cache
        uses: actions/cache@v3
        with:
          path: ${{ steps.pnpm-cache-path.outputs.STORE_PATH }}
          key: ${{ runner.os }}-${{ steps.setup-node.outputs.node-version }}-pnpm-store-${{ hashFiles('**/pnpm-lock.yaml') }}
          restore-keys: |
            ${{ runner.os }}-${{ steps.setup-node.outputs.node-version }}-pnpm-store-

      - run: pnpm install --aggregate-output --reporter append-only --frozen-lockfile

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature request New feature or request to improve the current logic needs eyes
Projects
None yet
Development

No branches or pull requests

5 participants
@dsame @maximveksler @allbetter-max @dusan-trickovic and others