Replies: 11 comments 21 replies
-
@cjao @venkatBala (In response to the setup and teardown methods) How about adding a resource class instead of adding setup and teardown methods to the Executor class? Does it make sense to keep infrastructure provisioning and electron execution separate? For example, let's consider the AWS Batch executor: maybe we can add an AWSBatchResource class to the plugin.
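For illustration, a minimal sketch of what that separation might look like; the class and method names (AWSBatchResource, provision, release, run) are hypothetical, not existing plugin API:

```python
# Hypothetical sketch only: provisioning lives in a resource object,
# while the executor only submits work against it.

class AWSBatchResource:
    """Owns the lifetime of the compute environment / job queue."""

    def __init__(self, vcpus: int = 2, memory_mb: int = 2048):
        self.vcpus = vcpus
        self.memory_mb = memory_mb
        self.job_queue = None

    def provision(self):
        # Placeholder for creating/looking up the compute environment and queue
        self.job_queue = "covalent-batch-queue"

    def release(self):
        self.job_queue = None


class AWSBatchExecutorSketch:
    """Submits tasks only; infrastructure lifetime is managed by the resource."""

    def __init__(self, resource: AWSBatchResource):
        self.resource = resource

    def run(self, fn, *args, **kwargs):
        if self.resource.job_queue is None:
            raise RuntimeError("resource not provisioned")
        return fn(*args, **kwargs)  # stand-in for a real Batch job submission
```

With this split, several executor instances could point at one provisioned resource, and the resource's release() call could be deferred until the last electron using it completes.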
-
Tagging @Emmanuel289 as well.
-
@FyzHsn, @cjao the proposed UX is given here: https://www.notion.so/Sharable-common-Executers-45634f68265d4972a81e63cb796f1e91. Quoting it, the UX could be something like:

```python
executer = ct.executers.AWSEC2Executer(ncores=2)
e1 = executer.get_shared_instance()
e2 = executer.get_shared_instance()

@ct.electron(executer=e1)
def a1(): ...

@ct.electron(executer=e1)
def a2(): ...

@ct.electron(executer=executer)
def a3(): ...

@ct.electron(executer=e2)
def a4(): ...

@ct.electron(executer=e2)
def a5(): ...

@ct.lattice
def workflow():
    r = a1()
    r = a3(r)
    r = a2(r)
    r = a3(r)
    r = a4(r)
    r = a5(r)
    return r
```

Even though a2 happens after a3, since a1 and a2 share an instance of the EC2 executor, the same EC2 instance that is created for a1 is kept alive until the node containing a2 is reached. Note that this applies to all executors; for executors where there is no concept of spin-up or shutdown (SSH executor, local executor, etc.), there is simply no notion of "needing to wait for all electrons to complete". @FyzHsn this would alleviate your concern about forcing things to run serially. We want the user to specify whether they want to run things in the same instance or in a different one, and we do this by giving them the option to pass typed instances of executors.
-
A slightly unrelated benefit of this is that we will be able to use the same Dask client for multiple different electrons. The code there will therefore simplify, and we will be able to get rid of the global client dictionary.
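For illustration, here is what reusing a single dask.distributed.Client across several task submissions looks like; this is a generic sketch, not the actual DaskExecutor code:

```python
from dask.distributed import Client, LocalCluster

# One cluster and one client, reused for every submission,
# instead of a module-level dictionary of clients keyed by scheduler address.
cluster = LocalCluster(n_workers=2)
client = Client(cluster.scheduler_address)

def square(x):
    return x * x

# Several "electrons" submitted through the same client.
futures = [client.submit(square, i) for i in range(3)]
print(client.gather(futures))  # [0, 1, 4]

client.close()
cluster.close()
```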
-
The above ideas are mocked up in this branch. Toy example:

```python
import covalent as ct
from covalent.executor import DaskExecutor
from dask.distributed import LocalCluster

lc = LocalCluster()  # a local Dask cluster; `lc` was used below but its definition was missing from the snippet
dask_exec = DaskExecutor(lc.scheduler_address)

@ct.electron(executor=dask_exec)
def task(x):
    return x

@ct.electron(executor="local")
def local_task(x):
    return x

@ct.lattice(workflow_executor=dask_exec)
def workflow(x):
    res1 = task(x)
    res2 = local_task(res1)
    res3 = task(res2)
    return 1
```

Running this from a Jupyter notebook, the logs show that
-
Some UX questions that boil down to the appropriate choice of defaults:
This tightly couples an executor instance to the computational resource it represents: one executor instance, one resource. So, for example, there would be a one-to-one correspondence between instances of the EC2 executor plugin and EC2 instances. One concern raised by @santoshkumarradha is that in the common scenario where users do want to segregate resources between electrons, they would need to construct a new executor plugin instance, which could be unwieldy when many parameters are required to initialize the plugin. I have tried to mitigate this by introducing a clone() method:

```python
e1 = ct.executers.AWSEC2Executer(ncores=2)
e2 = e1.clone()  # Same credentials and machine specifications as e1 but different instance_id
```

One can also decorate electrons using:

```python
@ct.electron(executor=e1.clone())
def task_1():
    ...

@ct.electron(executor=e1.clone())
def task_2():
    ...
```

On the other hand, the alternative UX

```python
executer = ct.executers.AWSEC2Executer(ncores=2)
e1 = executer.get_shared_instance()
e2 = executer.get_shared_instance()
```

makes it easier to allocate exclusive hardware resources to each electron. In this case, two electrons that specify the same executor instance would not share hardware unless the instance is a "shared instance" obtained through get_shared_instance().
-
As discussed in Slack, binding an executor plugin instance to a hardware instance would fail to handle the use case where the same electron is to be run multiple times in parallel in different hardware instances. Accordingly, the earlier toy example becomes:

```python
import covalent as ct
from covalent.executor import DaskExecutor
from dask.distributed import LocalCluster

lc = LocalCluster()  # as before, `lc` was referenced but not defined in the original snippet
dask_exec = DaskExecutor(lc.scheduler_address).get_shared_instance()

@ct.electron(executor=dask_exec)
def task(x):
    return x

@ct.electron(executor="local")
def local_task(x):
    return x

@ct.lattice(workflow_executor=dask_exec)
def workflow(x):
    res1 = task(x)
    res2 = local_task(res1)
    res3 = task(res2)
    return 1
```

which yields similar results.

The special value

Moreover, here is a revised conceptual framework for thinking about executor plugins, and the hardware resources they represent, from the client's point of view:
-
One potential issue is that the semantics of In the AWSLambda executor, however, The EC2 executor fares better. There,
-
Here are some more detailed design notes, by domain.

Executor -- workflow construction:

Executor has some new public attributes:
When an executor instance is passed to an electron decorator, its

Example (with fake instance_ids):

```python
ex1 = EC2Executor()              # instance_id = 1
ex2 = ex1.get_shared_instance()  # instance_id = 2

@ct.electron(executor=ex1)
def task_1():
    pass

@ct.electron(executor=ex2)
def task_2():
    pass

@ct.lattice
def workflow():
    res11 = task_1()
    res12 = task_1()
    res21 = task_2()
    res22 = task_2()

workflow.build_graph()
```

This transport graph has four nodes with the following executor assignments: the two task_1 nodes share the executor with instance_id 1, and the two task_2 nodes share the executor with instance_id 2.
In addition, collection and dunder tasks (@FyzHsn @wjcunningham7) are automatically assigned the same executor instance as the main task. Specifically, when Covalent generates either a collection or dunder task, it sets the main task's executor to

Example:

```python
from typing import List  # needed for the List annotation

ex = EC2Executor()  # instance_id = 1

@ct.electron(executor=ex)
def collect(arr: List):
    return list(arr)

@ct.lattice
def workflow():
    collection_1 = collect([1, 2])

workflow.build_graph()
```

Here, the collection node generated for the arguments of collect is assigned the same executor instance (instance_id 1) as the main task.

Executor -- backend:
```python
async def cancel(self, dispatch_id, node_id):
    fut = self.get_task_data(dispatch_id, node_id, "future")
    await fut.cancel()
```
Dispatcher

The dispatcher runs all tasks that reference the same executor instance on that one instance.

Changes to the dispatcher:

ExecutorCache: tracks executor instances during workflow execution. When the dispatcher attempts to retrieve the executor for a task, it first looks in the ExecutorCache (a sketch follows at the end of these notes).

Result Object

Unplanned tasks: The

Executor plugin support

To make an executor plugin work with these semantics, a few simple rules must be observed:
Limitations

The benefits of sharing tasks in this manner will vary by executor. They will be largest for executors where Covalent has full control of the resource allocation and task scheduling, such as the Dask executor.
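To make the ExecutorCache idea above concrete, here is a minimal sketch under the semantics described in these notes; the class shape and method names (get_or_create, task_finished) are illustrative assumptions, not the actual implementation:

```python
from typing import Dict

# Illustrative sketch of an executor cache keyed by instance_id.
class ExecutorCache:
    def __init__(self):
        self._executors: Dict[str, object] = {}  # instance_id -> executor instance
        self._refcounts: Dict[str, int] = {}     # instance_id -> tasks not yet finished

    def get_or_create(self, instance_id: str, factory):
        # Reuse the cached instance so all tasks with the same instance_id
        # run against the same underlying resources.
        if instance_id not in self._executors:
            self._executors[instance_id] = factory()
            self._refcounts[instance_id] = 0
        self._refcounts[instance_id] += 1
        return self._executors[instance_id]

    def task_finished(self, instance_id: str) -> bool:
        # Returns True when the last assigned task has completed, which is the
        # point where any teardown/cleanup hook could be invoked.
        self._refcounts[instance_id] -= 1
        return self._refcounts[instance_id] == 0
```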
-
@cjao @wjcunningham7 I want to bring up another point in resource sharing:
Thoughts?
-
Notion planning page (see also here)
Remote executors often have significant startup delays, whether it is to spin up a virtual machine in the cloud or boot a bare-metal Slurm compute node. One would like the ability to not only allocate multiple tasks to the same executor instance, but also keep the underlying hardware resources alive until the last assigned task completes. This would amortize the provisioning costs over multiple task runs.
Note: the present situation is actually slightly worse than before PR #754. The SDK now sends a JSON representation of each instance, and the server constructs a new executor plugin instance for each task from the JSON data. Previously, users could assign the same executor plugin instance to multiple electrons, and the instances themselves would be pickled over to the dispatcher. This at least allowed the same plugin instance to be shared by multiple tasks, although there was still no mechanism to reserve the executor's underlying resources for multiple tasks.
Assumption:
When users specify an executor plugin instance during workflow construction (as opposed to a mere "short name" like "dask" or "local"), they intend for all electrons decorated with that instance to use the same resources exposed by that instance (e.g. the same set of Slurm nodes).

Proposed changes:

- Add setup() and teardown/cleanup() methods (@venkatBala :) ). The setup() method is invoked upon instantiation to provision resources, such as by spinning up an EC2 instance or reserving Slurm nodes.
- execute() decrements the counter before returning; when the counter reaches 0, the cleanup() method is invoked to release the resources (a rough sketch follows below).
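A minimal sketch of the counter-based lifecycle described above, assuming the counter starts at the number of electrons assigned to the instance; the class and its fields are illustrative, not existing Covalent API:

```python
# Illustrative sketch of setup()/cleanup() with a task reference counter.
class RefCountedExecutor:
    def __init__(self, num_assigned_tasks: int):
        # Assumed: the dispatcher knows up front how many electrons use this instance.
        self._remaining = num_assigned_tasks
        self._resource = None
        self.setup()

    def setup(self):
        # Provision resources once, e.g. spin up an EC2 instance or reserve Slurm nodes.
        self._resource = "provisioned-resource-handle"

    def execute(self, fn, *args, **kwargs):
        try:
            return fn(*args, **kwargs)
        finally:
            # Decrement before returning; release everything after the last task.
            self._remaining -= 1
            if self._remaining == 0:
                self.cleanup()

    def cleanup(self):
        self._resource = None  # release the provisioned resources


ex = RefCountedExecutor(num_assigned_tasks=2)
ex.execute(lambda x: x + 1, 1)
ex.execute(lambda x: x * 2, 2)  # cleanup() runs after this call
```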