feat: use results dir as working directory for decorator jobs #743
Conversation
Codecov Report: all modified lines are covered by tests ✅

Additional details and impacted files:

@@            Coverage Diff             @@
##    feature/job-decorator    #743   +/- ##
============================================
  Coverage     100.00%    100.00%
  Files            129        129
  Lines           8387       8387
  Branches        1866       1866
  Hits            8387       8387
============================================

☔ View full report in Codecov by Sentry.
os.chdir(get_results_dir())

# create symlinks to input data
link_input()
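For context, here is a minimal sketch of what a `link_input` helper could look like. The function name appears in the diff, but this signature and behavior are assumptions for illustration, not the actual SDK implementation:

```python
from pathlib import Path


def link_input(input_dir: str, results_dir: str) -> list[Path]:
    """Symlink each entry in the input directory into the results directory.

    Returns the created links so they can be cleaned up after the job
    function has run. (Hypothetical sketch; the real helper may differ.)
    """
    links = []
    for entry in Path(input_dir).iterdir():
        link = Path(results_dir) / entry.name
        if not link.exists():
            # point the link at the resolved original file
            link.symlink_to(entry.resolve())
            links.append(link)
    return links
```

Returning the created links is what would make a later cleanup pass possible, since the cleanup step needs to distinguish links it created from files the user wrote.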
Can we prevent the input data from getting populated in the results? Since the "recovered()" function encapsulates the entirety of the job script, could we unlink the input data in our template after that function finishes execution? This would ensure that the function has had a chance to consume the input data, but remove the input data before the results are packaged.
Sure, we can add that. I can make sure that each link is a symlink and points to the original destination before deleting, in case it has been overwritten explicitly by the user.
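The approach described above — only deleting a link if it is still a symlink pointing at its original destination — could be sketched as follows. Note that `clean_links` is referenced (commented out) in the diff below, but this signature and logic are an assumption, not the merged implementation:

```python
import os
from pathlib import Path


def clean_links(links: dict[Path, Path]) -> None:
    """Remove input symlinks after the job function finishes.

    Only deletes a path if it is still a symlink pointing at its original
    target; anything the user has overwritten is left in place so it gets
    packaged with the results. (Hypothetical sketch.)
    """
    for link, target in links.items():
        if link.is_symlink() and Path(os.readlink(link)) == target:
            link.unlink()
```

This gives the job function a chance to consume (or deliberately replace) the input data, while keeping untouched input files out of the packaged results.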
Could we update tests to assert this behavior?
# load and run serialized entry point function
recovered = cloudpickle.loads({serialized})

def {function_name}():
    result = recovered()
    if result is not None:
        save_job_result(result, data_format=PersistedJobDataFormat.PICKLED_V4)
    # clean_links(links)
Should this line be commented out?
Ah right, that was just me testing default behavior.
Yes, I'm working on the tests currently.
)
with open(Path(job.name, "output_file.txt"), "r") as f:
    assert f.read() == "hello"
os.chdir(current_dir)
What happens if the test fails before this?
Good point. I copied this pattern from above, so I'll update it in both places. The reason for switching the cwd is that download_result downloads to the cwd, with no option to specify another location.
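One way to address the reviewer's concern is to guard the directory switch with `try`/`finally`, so the original cwd is restored even when an assertion fails mid-test. The helper below is a hypothetical sketch of that pattern, not code from this PR:

```python
import os


def run_in_dir(path, fn):
    """Call fn with the cwd set to path, restoring the previous cwd
    even if fn raises. (Illustrative helper, not from the PR.)
    """
    current_dir = os.getcwd()
    try:
        os.chdir(path)
        return fn()
    finally:
        # runs on both success and failure, so later tests
        # are not left in the wrong directory
        os.chdir(current_dir)
```

In a pytest suite, `monkeypatch.chdir(tmp_path)` achieves the same cleanup automatically via the fixture teardown.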
Issue #, if available:
Description of changes:
use results dir as working directory for decorator jobs
Testing done:
Unit tests and manual verification of functionality. Integ tests passing and updated to test new functionality
Merge Checklist

Put an x in the boxes that apply. You can also fill these out after creating the PR. If you're unsure about any of them, don't hesitate to ask. We're here to help! This is simply a reminder of what we are going to look for before merging your pull request.

General

Tests
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.