-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add valgrind tests #188
Add valgrind tests #188
Conversation
a47fef5
to
1aa7193
Compare
1aa7193
to
bd537a2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember talking about the design of this PR a couple of months ago with you. We now have an improved structure in our test automation, with hammerdb tool I had the same too long to execute in one run
problem for CI and I implemented that using screen command. Cleaning up resource groups from a VM on azure was also a problem but for that I added a job to test automation and we can delete a resource group by pushing to a branch with the resource group name, there is already a utility script for that https://github.com/citusdata/test-automation/blob/master/hammerdb/delete-resource-group.sh, so calling delete-resource-group.sh ${rg}
will delete the resource group.
With those improvements I am wondering if that makes sense to use something similar for valgrind tests, I know that this might mean a big change in the current structure and I am not sure if it is worth that when we already have a working structure but it will make things simpler and we will only have one job instead of two. In the past there were some uncertain parts with the usage of detached session but now I already solved them so we can leverage that. Maybe we should think about this a bit, but the current PR is also working so we can use the current approach as well.
Another thing to make sure is that other jobs should still work correctly with this changeset, we can do this test right before merging
sh $HOME/test-automation/azure/push-results.sh <branch_name_to_push> | ||
``` | ||
|
||
Finally, delete your resource group. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we mention for the periodic run?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I couldn't quite understand
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I meant should we mention that there is a periodic run for valgrind tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh now I understood, will apply that too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+ that destroy job isn't actually executed automatically when user follows manual steps, so I think we should not mention this here ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
adressed with last commit
12b0afa
to
babb29e
Compare
9691e67
to
96fb46e
Compare
# will immediately be closed just after the fabric command is run | ||
# | ||
# We have a seperate job to terminate the machine and push the results | ||
if [ "$rg_name" = "citusbot_valgrind_test_resource_group" ]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't seem to be the correct way of checking
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should use "==" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, they are very same when comparing strings
caf0fc5
to
22ac09a
Compare
649d3f4
to
c1786a3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can also create an issue to use a new structure for valgrind test. Something like:
"Valgrind job could use detached test structure so that we wouldn't need to have 2 jobs and things would be more clean. This wasn't trivial so we decided to merge the working PR as is"
.circleci/config.yml
Outdated
@@ -97,3 +131,30 @@ workflows: | |||
only: | |||
- /tpch\/.*/ # match with tpch/ prefix | |||
- /all_performance_test\/.*/ # match with all_performance_test/ prefix | |||
|
|||
# perform weekly valgrind test on azure every monday at 00:00 | |||
# https://crontab.guru/#0_0_*_*_1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: This could be put right above "cron" field so that it is closer to where it is relevant.
README.md
Outdated
./connect.sh | ||
|
||
# run fab command in coordinator in a detachable session | ||
sudo yum install tmux |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have tmux in our servers by default now:
https://github.com/citusdata/test-automation/blob/master/azure/init.sh#L24
|
||
if [ "$rg" == "citusbot_valgrind_test_resource_group" ]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We kind of want to do the cleanup if it fails while creating the cluster I think, but we need the VALGRIND_TEST variable set before running ./create-cluster.sh
We can possibly do the following:
if [ "$rg" == "citusbot_valgrind_test_resource_group" ]; then
# If running valgrind tests, export VALGRIND_TEST to be 1 to ensure
# only coordinator instance is created in create-cluster script
export VALGRIND_TEST=1
fi
./create-cluster.sh
if <VALGRIND_TEST variable = 1> ; then
# If running valgrind tests, do not run cleanup function
# This is because, as valgrind tests requires too much time to run,
# we start valgrind tests via nohup in ci. Hence ssh session
# will immediately be closed just after the fabric command is run
trap - EXIT
fi
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes a lot sense, thanks
final reviews
This pr aims to:
Replaces #130
TODOs after this pr is merged: