Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Take advatange of prefetching objects from the database where appropriate #180

Open
evz opened this issue Feb 19, 2018 · 1 comment
Open

Comments

@evz
Copy link
Contributor

evz commented Feb 19, 2018

In the case where we have a view that requires loading a lot of bills, we should take advantage of using Django's prefetch_related and select_related methods to limit the number of queries. Since Django lazily evaluates querysets, this can mean that anytime we are accessing related objects in a template, under the hood Django is executing a one off query to fetch that object from the database. This can very quickly multiply to mean that hundreds or thousands of queries are being fired anytime someone loads a page.

To make matters worse, we are also accessing related items via model methods which means that, even if we use prefetch_related or select_related in the view code, Django will ignore those objects that it's already fetched from the database and fire off more queries. A good example of this is the current_action property on the Bill model which is not only used on its own but also leveraged by several other model methods (and the methods that use those methods) on the Bill model and elsewhere (including in template code).

A solution to this problem that I just stumbled across today is to check if Django has already cached those objects on the database connection and, if so, use them instead of making another round trip to the database. So, a way of rewriting the current_action property would be like so:

# When executing a query like so:
bills = Bill.object.all().prefetch_related('actions')

# Django will cache the related objects which you can access like so within a model method

class Bill(models.Model):
    ... fields ...

    @property
    def current_action(self):
        if hasattr(self, '_prefetched_objects_cache') and 'actions' in self._prefetched_objects_cache:
            # Return the most recent cached action object
            return sorted(self._prefetched_objects_cache['actions'],
                          key=lambda x: x.order,
                          reverse=True)[0]

        else:
            # If the view code did not use "prefetch_related" for the action objects, fall back to the current way of doing things.
            return self.actions.all().order_by('-order').first() if self.actions.all() else None

Obviously, this is just an example of where we can make some improvements on those model methods. There are probably a lot more that can be improved in a similar way.

@reginafcompton
Copy link
Contributor

reginafcompton commented Feb 28, 2018

I found a potential place to prefetch actions.

In the legislation_item partial we iterate over bill topics and call the tags partial, which uses get_last_action_date: https://github.com/datamade/django-councilmatic/blob/master/councilmatic_core/models.py#L350

This nested business means that we duplicate the same query (sometimes) dozens of times. I'd like to see if we could re-think this logic and prefetch actions somewhere.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants