Async python support: aiopynamodb #802
Comments
The discussion here is relevant: #525 (comment). I'd like to support asyncio natively in the library, but I'm still a little hesitant to adopt aiobotocore right now, as it's not maintained by AWS. We don't rely on all that much of botocore right now, so one option would be to drop it altogether and provide a separate async interface. |
Any idea when this might happen? We could really use this feature right now. I've been attempting to do this myself but I've been having to duplicate a lot of your code for a few small changes. |
There is another approach that is used by many libraries out there (keep reading for examples). When a library exposes a high-latency function, for instance:
for item in TestModel.view_index.query(1):
    print("Item queried from index: {0}".format(item))
one can wrap the calls in a sub-thread via loop.run_in_executor. Since that is a little verbose, there are nice libraries that make it human-friendly, for example aioextensions. The syntax would be something like:
from aioextensions import in_thread
for item in await in_thread(TestModel.view_index.query, 1):
    print("Item queried from index: {0}".format(item))
This runs the high-latency call in a sub-thread, which allows for concurrency. It's a very minimalistic interface and requires no work from pynamodb, since the wrapping is done on the consumer side:
from aioextensions import in_thread, collect
# Equivalent to pynamodb_func(arg_1, arg_2, kwarg_a=3, kwarg_b=4)
one_query = await in_thread(pynamodb_func, arg_1, arg_2, kwarg_a=3, kwarg_b=4)
# Equivalent to pynamodb_func(arg_1, arg_2, kwarg_a=3, kwarg_b=4) but all queries concurrently (overlapping in time) and fast!!
many_queries = await collect([
    in_thread(pynamodb_func, arg_1, arg_2, kwarg_a=kwarg_a, kwarg_b=kwarg_b)
    for arg_1, arg_2, kwarg_a, kwarg_b in [long list of things to fetch]
])
Another alternative is to provide _async versions of the functions, which could use the mentioned wrappers internally and hide them from the end user:
def pynamodb_func(arg_1, arg_2, kwarg_a=3, kwarg_b=4) -> Data:
    ...
async def async_pynamodb_func(arg_1, arg_2, kwarg_a=3, kwarg_b=4) -> Data:
    return await in_thread(pynamodb_func, arg_1, arg_2, kwarg_a=kwarg_a, kwarg_b=kwarg_b)
The library also offers some nice helpers that we could find useful, like workers, batching and rate limits. I think I'm volunteering to implement the async wrappers if you think it's a nice approach, you tell me! @garrettheel These are examples of the mentioned sub-thread wrapping: I've personally used it in production, and the benefits from concurrency are worth the small overhead it adds to every call. It's common to use |
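For reference, the sub-thread wrapping described above can be sketched with only the standard library, calling loop.run_in_executor directly. This is a minimal sketch under stated assumptions: `slow_query` is a hypothetical stand-in for a blocking PynamoDB call (it is not part of pynamodb), and the local `in_thread` helper only mimics the idea of the aioextensions function, not its implementation. The stand-in returns a fully built list so that nothing is iterated lazily on the event loop afterwards.

```python
import asyncio
import functools
import time


def slow_query(hash_key, limit=3):
    """Hypothetical stand-in for a blocking call such as a model query."""
    time.sleep(0.2)  # simulates network latency
    return [f"{hash_key}-{i}" for i in range(limit)]


async def in_thread(func, *args, **kwargs):
    """Run a blocking callable in the event loop's default thread pool."""
    loop = asyncio.get_running_loop()
    # run_in_executor only forwards positional arguments,
    # so keyword arguments are bound with functools.partial.
    call = functools.partial(func, *args, **kwargs)
    return await loop.run_in_executor(None, call)


async def main():
    started = time.perf_counter()
    # The three blocking calls run in worker threads and overlap in time.
    results = await asyncio.gather(
        *(in_thread(slow_query, key, limit=2) for key in ("a", "b", "c"))
    )
    elapsed = time.perf_counter() - started
    return results, elapsed


results, elapsed = asyncio.run(main())
print(results, round(elapsed, 2))
```

On Python 3.9 and later, asyncio.to_thread(func, *args, **kwargs) offers the same behaviour without a hand-written helper.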
I've been experimenting with a different approach in #853, which could be characterized as a hackier version of the above suggestion (with the benefit of not requiring threads). |
Can also be done using
|
Would it be possible to create a separate async module in this library and create a similar but async API for people to use? There are a few third-party async DynamoDB/boto3 libraries available. One of them could be used until Amazon finally updates boto3 to support asyncio (😔 cries from botocore maintainer). I think this approach has a lot of benefits. PynamoDB would have a working async module by the time boto3 supports it, and, if designed correctly, the backend could be swapped out with these third-party libs dynamically. Would the maintainer be okay with that? |
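One way to picture the "swappable" design suggested above is a small backend interface that the async module codes against. The sketch below is purely illustrative: `AsyncBackend`, `InMemoryBackend` and `AsyncModel` are hypothetical names, not part of PynamoDB or of any third-party library, and the in-memory backend merely stands in for a real async DynamoDB client.

```python
import asyncio
from typing import Any, Dict, Optional, Protocol


class AsyncBackend(Protocol):
    """Hypothetical interface the async module would depend on."""

    async def get_item(self, table: str, key: str) -> Optional[Dict[str, Any]]: ...

    async def put_item(self, table: str, item: Dict[str, Any]) -> None: ...


class InMemoryBackend:
    """Toy backend standing in for a third-party async client."""

    def __init__(self) -> None:
        self._tables: Dict[str, Dict[str, Dict[str, Any]]] = {}

    async def get_item(self, table: str, key: str) -> Optional[Dict[str, Any]]:
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        return self._tables.get(table, {}).get(key)

    async def put_item(self, table: str, item: Dict[str, Any]) -> None:
        await asyncio.sleep(0)
        self._tables.setdefault(table, {})[item["id"]] = item


class AsyncModel:
    """Hypothetical model layer; the backend is injected, so it can be swapped."""

    table_name = "users"

    def __init__(self, backend: AsyncBackend) -> None:
        self._backend = backend

    async def save(self, item: Dict[str, Any]) -> None:
        await self._backend.put_item(self.table_name, item)

    async def get(self, key: str) -> Optional[Dict[str, Any]]:
        return await self._backend.get_item(self.table_name, key)


async def demo():
    model = AsyncModel(InMemoryBackend())
    await model.save({"id": "u1", "name": "Ada"})
    return await model.get("u1"), await model.get("missing")


found, missing = asyncio.run(demo())
print(found, missing)
```

Because the model only sees the interface, replacing the toy backend with a different client later would not require changes to the model code, which is the property the comment is asking for.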
I am working on a project that would benefit from async support in this package. We will implement our solution by basically wrapping everything you have using Gevent. Why Gevent? Because you do not need to worry about async/await syntax, and you do not need to rewrite everything as async methods. We will probably implement this before June, so as soon as I get some results from it, I will come back with a PR. In the meantime, I would really appreciate some feedback. To give you more context: Gevent is great, but, for example, its support for Windows is limited: http://www.gevent.org/install.html#supported-platforms It will probably also narrow the set of Python versions that your library supports. |
@AbendGithub I think long-term async/await is the future of python, though. Gevent isn't native or widely used by most python programmers. |
We use pynamodb with gevent pretty much everywhere at Lyft without any modifications to this library (with standard gevent monkey-patching). There's been a lot of community interest in adding an asyncio layer to this library over the years. It's not entirely trivial and will probably result in lots of duplication (seen this in redis-py) which is probably why we haven't yet. I'd also see it as a negative testimony to the asyncio approach (aka blue/green functions), but this train left the station and most of us are invested into one of those two approaches, so I can definitely see the value in an asyncio layer. |
Yeah, I know the blue/green function debate is quite polarizing. However, as you said, the language is natively adopting the one approach. Eventually, I feel like even boto3 will be forced to adopt asyncio. |
Hey just curious if this ever caught traction. I feel like asyncio is one of the easiest ways I find to improve io bound apps. |
https://github.com/aio-libs/aiobotocore
https://github.com/terrycain/aioboto3