Resource requests should be at executor level. #208
Comments
Executors are meant to be run sequentially on a single machine allocated to a task. In most deployments, the TES service allocates a machine (either a VM or an HPC node) and starts up a runner; the runner moves all files into place and then invokes each executor one after the other. AWS charges you for the full VM for the full time that you use it, even if you are only using half of it for some of the processes. The only way to change the allocation size is to request a differently sized VM and move the tasks there, which would be equivalent to issuing two different tasks. The same applies to HPC systems, like SLURM.
Thanks for the context @kellrott, and I think it does make sense for the specific instances you bring up here. Additionally, I knew the part about the executors running sequentially in order from this part of the spec:
That being said, the idea that executors were intended to run on a single machine is new (at least to me, from my reading of the documentation). That might be something to clarify in the spec if it isn't already there and I just missed it. The above notwithstanding, I still think it makes sense to consider this change for situations where:
In my mind, specifying the resources at the executor level would subsume the use cases you listed (e.g., by looking across all executors before requisition and picking the maximum resource usage), whereas servicing the cases I list above isn't possible in the current state of affairs.
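To make the "pick the maximum across executors" idea concrete, here is a rough sketch of how a single-machine backend could derive one allocation from per-executor requests. The `cpu_cores`, `ram_gb`, and `disk_gb` names mirror the existing task-level resources fields; the per-executor `resources` field, the `Executor` and `Resources` classes, and the `machine_request` helper are all hypothetical and not part of the current spec.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Resources:
    cpu_cores: int = 0
    ram_gb: float = 0.0
    disk_gb: float = 0.0


@dataclass(frozen=True)
class Executor:
    image: str
    command: Sequence[str]
    # Hypothetical: the current spec only has task-level resources.
    resources: Optional[Resources] = None


def machine_request(
    executors: Sequence[Executor],
    task_default: Resources = Resources(),
) -> Resources:
    """Return one allocation large enough for every executor in the task.

    Executors without their own request fall back to the task-level default.
    """
    needs = [
        e.resources if e.resources is not None else task_default
        for e in executors
    ]
    if not needs:
        return task_default
    # Field-wise maximum: executors run one after another on the same
    # machine, so the largest single request bounds the whole task.
    return Resources(
        cpu_cores=max(r.cpu_cores for r in needs),
        ram_gb=max(r.ram_gb for r in needs),
        disk_gb=max(r.disk_gb for r in needs),
    )
```

A backend that can resize between steps could instead consume each executor's request directly. Taking the maximum for disk assumes intermediate files don't accumulate across executors; a backend might need something more conservative there.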
It seems strange that resource requests are specified at the task level instead of at the executor level, where images are actually specified. The lack of flexibility surrounding resource allocation at this level greatly inhibits one of the major potential benefits (the biggest potential benefit?) that the executors abstraction provides: you can't save on resources for commands that don't require a huge amount of CPU/RAM/disk.