Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add valgrind tests #188

Merged
merged 7 commits into from
Feb 1, 2021
Merged

Add valgrind tests #188

merged 7 commits into from
Feb 1, 2021

Conversation

onurctirtir
Copy link
Member

This pr aims to:

  • migrate valgrind tests from tools repository to test-automation repository
  • automate weekly valgrind tests on circle ci

Replaces #130

TODOs after this pr is merged:

  • Remove tooling from tools repo (or deprecate it)
  • Valgrind test results should be pushed to a single branch #170
  • Putting branch name/sha of the HEAD and pg version to the valgrind test results
  • Research github api to understand success of last valgrind test
  • Research ways to automate custom tests (maybe pushing branches and removing them automatically ?)

@onurctirtir onurctirtir self-assigned this Jul 3, 2020
@onurctirtir onurctirtir force-pushed the valgrind-merge branch 3 times, most recently from a47fef5 to 1aa7193 Compare July 3, 2020 22:59
Copy link
Contributor

@SaitTalhaNisanci SaitTalhaNisanci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember talking about the design of this PR a couple of months ago with you. We now have an improved structure in our test automation, with hammerdb tool I had the same too long to execute in one run problem for CI and I implemented that using screen command. Cleaning up resource groups from a VM on azure was also a problem but for that I added a job to test automation and we can delete a resource group by pushing to a branch with the resource group name, there is already a utility script for that https://github.com/citusdata/test-automation/blob/master/hammerdb/delete-resource-group.sh, so calling delete-resource-group.sh ${rg} will delete the resource group.

With those improvements I am wondering if that makes sense to use something similar for valgrind tests, I know that this might mean a big change in the current structure and I am not sure if it is worth that when we already have a working structure but it will make things simpler and we will only have one job instead of two. In the past there were some uncertain parts with the usage of detached session but now I already solved them so we can leverage that. Maybe we should think about this a bit, but the current PR is also working so we can use the current approach as well.

Another thing to make sure is that other jobs should still work correctly with this changeset, we can do this test right before merging

.circleci/config.yml Outdated Show resolved Hide resolved
.circleci/config.yml Outdated Show resolved Hide resolved
README.md Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
sh $HOME/test-automation/azure/push-results.sh <branch_name_to_push>
```

Finally, delete your resource group.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we mention for the periodic run?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I couldn't quite understand

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I meant should we mention that there is a periodic run for valgrind tests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh now I understood, will apply that too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+ that destroy job isn't actually executed automatically when user follows manual steps, so I think we should not mention this here ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adressed with last commit

azure/run-all-tests.sh Show resolved Hide resolved
azure/run-all-tests.sh Outdated Show resolved Hide resolved
fabfile/run.py Show resolved Hide resolved
azure/push-results.sh Show resolved Hide resolved
azure/citus-bot.sh Outdated Show resolved Hide resolved
@onurctirtir onurctirtir force-pushed the valgrind-merge branch 3 times, most recently from 9691e67 to 96fb46e Compare January 29, 2021 12:23
# will immediately be closed just after the fabric command is run
#
# We have a seperate job to terminate the machine and push the results
if [ "$rg_name" = "citusbot_valgrind_test_resource_group" ]; then
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this doesn't seem to be the correct way of checking

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should use "==" ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, they are very same when comparing strings

@onurctirtir onurctirtir force-pushed the valgrind-merge branch 4 times, most recently from caf0fc5 to 22ac09a Compare January 29, 2021 16:09
Copy link
Contributor

@SaitTalhaNisanci SaitTalhaNisanci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also create an issue to use a new structure for valgrind test. Something like:

"Valgrind job could use detached test structure so that we wouldn't need to have 2 jobs and things would be more clean. This wasn't trivial so we decided to merge the working PR as is"

@@ -97,3 +131,30 @@ workflows:
only:
- /tpch\/.*/ # match with tpch/ prefix
- /all_performance_test\/.*/ # match with all_performance_test/ prefix

# perform weekly valgrind test on azure every monday at 00:00
# https://crontab.guru/#0_0_*_*_1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: This could be put right above "cron" field so that it is closer to where it is relevant.

README.md Outdated
./connect.sh

# run fab command in coordinator in a detachable session
sudo yum install tmux
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have tmux in our servers by default now:
https://github.com/citusdata/test-automation/blob/master/azure/init.sh#L24


if [ "$rg" == "citusbot_valgrind_test_resource_group" ]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We kind of want to do the cleanup if it fails while creating the cluster I think, but we need the VALGRIND_TEST variable set before running ./create-cluster.sh

We can possibly do the following:

if [ "$rg" == "citusbot_valgrind_test_resource_group" ]; then
    # If running valgrind tests, export VALGRIND_TEST to be 1 to ensure
    # only coordinator instance is created in create-cluster script
    export VALGRIND_TEST=1
fi

./create-cluster.sh
if <VALGRIND_TEST variable = 1> ; then
    # If running valgrind tests, do not run cleanup function
    # This is because, as valgrind tests requires too much time to run,
    # we start valgrind tests via nohup in ci. Hence ssh session
    # will immediately be closed just after the fabric command is run
    trap - EXIT
fi
 

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes a lot sense, thanks

@onurctirtir onurctirtir merged commit 83dbf70 into master Feb 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants