Speed up Export::Submissions::Extract
Background

It was proposed that the very slow daily `DataWarehouseExport.generate!` could be improved by adding indexes. Back in July 2018 Tekin identified that `Submissions#created_at` should be indexed (#23) as it started to be used in `Task#latest_submission` (f46dd46).

However, `submissions.created_at` is not used in any of the four 'extractions' which make up the daily data warehouse export. There are, though, some other unindexed fields involved in extractions, as follows:

`Export::Tasks::Extract` -> uses `Task#updated_at` (unindexed)
`Export::Submissions::Extract` -> uses `Submission#updated_at` (unindexed)
`Export::Invoices::Extract` -> uses `SubmissionEntry#updated_at` (unindexed)
`Export::Contracts::Extract` -> uses `SubmissionEntry#updated_at` (unindexed)

Accordingly there are some PRs coming through which:

- 559: Add index to `submissions.created_at`
- 560: Add indexes to `tasks.updated_at`, `submissions.updated_at`, `submission_entries.updated_at`
- 561: Increase `maintenance_work_mem` setting for Postgres to allow these indexes to be added

This commit improves the speed of one of the four extractions: `Export::Submissions::Extract`.

It was identified (thanks Russell Garner!) that the subquery joining 'invoices' was a major bottleneck. In local tests, removing the join reduced a sample query from 26s to 0.4s.

In order to properly remove this element from the query:

- it has been recognised that `entry_count` (the count of submission entries of the type 'invoice') is not needed, as users derive this information using other tooling in the 'data warehouse'
- the `total_management_charge` projection could be removed, because since d7db363 this value has been precomputed and stored on ingestion as `submissions.management_charge_total`. This field was already being returned as part of the top level selection `SELECT submissions.*`.
- the `invoice_value` projection (`invoices.total_value`) is not required, as again this information is available in the 'data warehouse'.
- now that `invoice_entry_count` is no longer available, it is necessary to find a new way to determine the `submission_type` ('no_business' or 'file'). It's been agreed that the presence (or not) of `submission_file_type` is sufficient to know whether there's a 'file' to be had or whether it's a case of 'no business'.

As the structure of the exports (the headings or columns) has changed, we've taken the opportunity to describe both:

- the expected CSV headers more clearly, as a vertical list, and
- the values of the example row more closely, by ensuring that the expected values are not only present but also in the expected column.
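
The `submission_type` rule described above can be sketched in plain Ruby. This is only an illustration, not the code from this commit: the method name and its single argument are hypothetical, and it assumes that a nil or blank `submission_file_type` means no file was uploaded.

```ruby
# Hypothetical helper illustrating the rule agreed in this commit message.
# Assumption: a nil or blank submission_file_type means no file was uploaded.
def submission_type(submission_file_type)
  file_type = submission_file_type.to_s.strip

  # No file type recorded -> 'no_business'; otherwise a file was submitted.
  file_type.empty? ? 'no_business' : 'file'
end
```

With this rule the type is derived from a column already on the submission, so it does not need the count of 'invoice' entries that the removed join used to supply.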
Showing 7 changed files with 90 additions and 73 deletions.