Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add missing index on 'submissions.created_at' #559

Merged
merged 2 commits into from
Dec 9, 2019

Conversation

edavey
Copy link
Contributor

@edavey edavey commented Dec 4, 2019

In July 2018 it was noted that the submissions.created_at
field ought to be indexed as it was now used in a query:

#23

This is Task#latest_submission
(f46dd46)
which is used by the 'no_business' endpoint
(/v1/tasks/#{task.id}/no_business)

As the query selecting the most recent submission uses:

has_one :latest_submission, -> { order(created_at: :desc) }

I've opted for an index which is 'sorted' to suit.

This work was requested in order to improve the problematic
daily export procedure invoked via
DataWarehouseExport.generate! which runs out of memory
and times out regularly.

This unindexed field (submissions.created_at) isn't
actually involved in any of the 4 "extractions" which run.
We believe that the 2 problem extractions are

  • Export::Invoices::Extract
  • Export::Contracts::Extract

and that improvements to these may be brought about by
adding the following indexes:

  • tasks.updated_at
  • submissions.updated_at
  • submission_entries.updated_at

The following PR will add these indexes.

@edavey edavey requested a review from leeky December 4, 2019 17:18
Copy link
Contributor

@leeky leeky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small suggestion, but otherwise LGTM! :shipit:

@@ -0,0 +1,7 @@
class AddIndexOnSubmissionsCreatedAt < ActiveRecord::Migration[5.2]
def change
add_index :submissions,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth adding the index concurrently, particularly since we're still inside a submission window - e.g. https://semaphoreci.com/blog/2017/06/21/faster-rails-indexing-large-database-tables.html ?

In July 2018 it was noted that the `submissions.created_at`
field ought to be indexed as it was now used in a query:

#23

This is `Task#latest_submission`
(f46dd46)
which is used by the 'no_business' endpoint
(`/v1/tasks/#{task.id}/no_business`)

As the query selecting the most recent submission uses:

`has_one :latest_submission, -> { order(created_at: :desc) }`

I've opted for an index which is 'sorted' to suit.

This work was requested in order to improve the problematic
daily export procedure invoked via
`DataWarehouseExport.generate!` which runs out of memory
and times out regularly.

This unindexed field (`submissions.created_at`) isn't
actually involved in any of the 4 "extractions" which run.
We believe that the 2 problem extractions are

- `Export::Invoices::Extract`
- `Export::Contracts::Extract`

and that improvements to these may be brought about by
adding the following indexes:

- `tasks.updated_at`
- `submissions.updated_at`
- `submission_entries.updated_at`

The following PR will add these indexes.

But we actually believe that the main problem is Postgres
memory:

- the index on SubmissionEntry no longer fits in memory: we
  are going to look into increasing the memory allocation in
  Postgres

- the `submission_entries.*` glob in the SELECT statement
  may include more data than is required: we will
  explore whether a sub-set of the SubmissionEntry fields can
  be returned by the DB

Zendesk
: https://dxw.zendesk.com/agent/tickets/10463

Trello
: https://trello.com/c/GeaLyAWy/1162
@edavey edavey force-pushed the add-index-to-submissions-created-at branch from 6b18d26 to ee9e5a1 Compare December 9, 2019 12:00
@edavey edavey merged commit 651fb50 into develop Dec 9, 2019
@edavey edavey deleted the add-index-to-submissions-created-at branch December 9, 2019 12:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants