Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite expect script in C to avoid multiple issues with shell #341

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

richiejp
Copy link
Contributor

The shell expect script tried to repeatedly capture the tmux pane and match the output until either it matched or it timed out.

This had a large number of issues between the platforms with it finally working on my machine and in CI but failing on @liquadity's setup.

This is a bit extreme, but unreliable and unpredictable testing is costing and time and I want to firmly squash it. Hopefully the C code will be easier to manage and will do what is expected.

@richiejp
Copy link
Contributor Author

@liquidaty very sorry about the issues with testing. Could you try running the tests on this branch and take a look at scripts/test-expect.sh and utils/expect.c to see if this now functions correctly and makes more sense?

I haven't moved the shell script into the Makefile because it is used in more than one Makefile, but I have commented it.

@liquidaty
Copy link
Owner

liquidaty commented Dec 12, 2024

Thank you @richiejp,

When I build and run from scratch as follows:

rm -rf build
make -C src clean install
make -C app clean test install
make -C app/ext_example clean test

I still get the below error:

test-sheet-extension-1:
Extension my registered
Exiting extension example!
../../scripts/test-expect.sh: line 29: /path/to/zsv/build/Darwin/dbg/cc/bin/test-expect: No such file or directory
test-sheet-extension-1-indexed did not match
test-sheet-extension-1: Failed!

Tried in both bash and sh, with and without DEBUG=1, same outcome each time. Any suggestions?

@liquidaty
Copy link
Owner

I have identified at least one issue: it is looking for build/Darwin/rel/cc/bin/test-expect, but that file does not exist, and actually it should be looking for build/Darwin/rel/gcc-13/bin/test-expect

So it appears the issue is that the CC variable value used when test-expect is built, is assumed to be, but in reality might not be, the same as the CC variable value used when the test is run.

Maybe we should consolidate the configure scripts so that there is only one-- but that is a separate issue.

Based on the above, as expected, if I make both config.mk's identical (./ and ./app, test-sheet-extension-1 passes. Not sure though why the next test fails with the below, but this seems like a more diagnosable issue.

test-sheet-extension-2:
no server running on /private/tmp/tmux-501/default
No tmux session to kill
Extension my registered
Exiting extension example!
test-sheet-extension-2-indexed took 0.19
test-sheet-extension-2-transformed took 0.01
Timeout expired after 5.01 seconds
Last output that failed to match:
Row #                   HA1 HA2 HA3              HB1  HB2  HB3           HC1  HC2  HC3          Column count
1                       A1                       B1                      C1                     6
2                       A2                       B2                      C2                     9
3                       A3                       B3                      C3                     12
? for help Row #

test-sheet-extension-2 did not match
test-sheet-extension-2: Failed!

@richiejp
Copy link
Contributor Author

So it appears the issue is that the CC variable value used when test-expect is built, is assumed to be, but in reality might not be, the same as the CC variable value used when the test is run.

sorry after seeing this I realised I in fact hard coded gcc in the Makefile instead of writing ${CC}.

Not sure though why the next test fails with the below, but this seems like a more diagnosable issue.

Looking at the output it performed the transformation and indexing, but then it should scroll to the bottom when tmux send-keys -5 $@ G is called. However it has stayed at the top.

This could be due to an error when seeking from the index, so I added an assert as a quick and temporary way to confirm or deny that if you get a chance to quickly try running it with DEBUG=1.

@richiejp richiejp mentioned this pull request Dec 14, 2024
14 tasks
@liquidaty
Copy link
Owner

Is there any way to remove the dependency on the timeout utility? Perhaps by replacing it with a simple bash script?

@richiejp
Copy link
Contributor Author

Somewhat simple if we sacrifice precision. Arithmetic in shell is limited, so initially I got this which contains awk, but there is also the issue that I'm using %N to get nanoseconds from date which is not in POSIX.

now_seconds() {
  printf "%.2f" "$(date +%s.%N)"
}

start_time=$(now_seconds)

while true; do
  tmux capture-pane -t "$TARGET" -p > "$CAPTURE"
  t=$(now_seconds)

  if ${CMP} -s "$CAPTURE" "$EXPECTED"; then
    matched=true
    break
  fi

  timedout=$(echo "$start_time" "$t" "$EXPECT_TIMEOUT" | awk '{ if ($2 - $1 > $3) print "true" }')
  if [ "$timedout" = "true" ] ; then
    break
  fi

  sleep 0.025
done

elapsed=$(echo "$start_time" "$t" | awk '{ printf "%.2f", $2 - $1 }')

The last remaining option is to limit the precision to whole seconds. I wanted to keep the precision for recording the timings. For the timeout it doesn't matter.

now_seconds() {
  date +%s
}

start_time=$(now_seconds)

while true; do
  tmux capture-pane -t "$TARGET" -p > "$CAPTURE"
  t=$(now_seconds)

  if ${CMP} -s "$CAPTURE" "$EXPECTED"; then
    matched=true
    break
  fi

  if [ $(( t - start_time )) -gt "$EXPECT_TIMEOUT" ]; then
    break
  fi

  sleep 0.025
done

elapsed=$(( t - start_time ))

if [ "$matched" = "true" ]; then
  echo "$TARGET$STAGE took ${elapsed}s"
  printf ", %s" "$elapsed" >> "${TIMINGS_CSV}"
fi

full changes: https://github.com/liquidaty/zsv/compare/main...richiejp:expect-sans-timeout?expand=1

@liquidaty
Copy link
Owner

Thanks @richiejp. Could you pls remind me: what was the original problem that the expect script is solving? If it is just the intermittent test failures, could we not just far more simply address by the combination of sleeping for a longer time and in the case of comparing timestamps, broadening the window of acceptable timestamp values? I'm wondering if this added complexity is worth it, compared to how the tests used to work which was very simple

@richiejp
Copy link
Contributor Author

Good question, yes it is about stopping intermittent test failures, but without increasing the time it takes to do testing. I found on some steps the sleep needed to be 1-3 seconds at least to avoid random failures. Meanwhile these screens take 0.01 to 0.3 seconds to match on average.

I guess this is partially a matter of personal preference, but I find it quite disruptive if a process takes a few minutes and it'll only get worse as more tests are added. In fact it will get exponentially worse because the longer the test suite runs, the more likely you are to hit an outlier (e.g. when the VM is interrupted in CI) and you have to increase the sleeps. I saw this a lot when working in QA.

As for comparing timestamps; yes I agree we can just lower the precision.

There are some other things which the script/utility do which increase the complexity a bit. For example it allows matching intermediate screens so that I could test multiple steps. These could be handled another way, but would still increase the complexity of any solution or we'd just have to have less comprehensive/observable testing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants