-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
✨ Add needimport
caching and needs_import_cache_size
configuration
#1297
base: master
Are you sure you want to change the base?
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #1297 +/- ##
==========================================
+ Coverage 86.87% 86.99% +0.11%
==========================================
Files 56 60 +4
Lines 6532 6998 +466
==========================================
+ Hits 5675 6088 +413
- Misses 857 910 +53
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
Hi @chrisjsewell , first of all thanks for tackling this topic, even before I had conducted the performance measurement :-)
How do we handle that the needs.json import source has changed? Sphinx will not consider the importing document changed, so in this case do we need to trigger a full re-build? Or does sphinx-needs have a way to handle this? Is this maybe a general problem of needimport? |
@arwedus It informs sphinx of the documents dependency on the file (similar to e.g.
so yes sphinx will check if it's mtime is changed and, if so, re-build all dependant documents |
fcf40f8
to
a887c9d
Compare
a887c9d
to
bc4cf42
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should really help with many needimports of the same file.
data = needs_import_list["versions"][version] | ||
|
||
# TODO this is not exactly NeedsInfoType, because the export removes/adds some keys | ||
needs_list: dict[str, NeedsInfoType] = data["needs"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't know how to feel about calling a dictionary a list :)
needs_list = needs_list_filtered | ||
# note we need to deepcopy here, as we are going to modify the data, | ||
# but we want to ensure data referenced from the cache is not modified | ||
needs_list = deepcopy(needs_list_filtered) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe there's a code logic options that combines the 2 deepcopy
Co-authored-by: Marco Heinemann <marco.heinemann@useblocks.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more request from my side.
|
||
.. versionadded:: 3.1.0 | ||
|
||
Sets the maximum number of needs cached by the :ref:`needimport` directive, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_ImportCache
stores
self._cache: OrderedDict[tuple[str, float], dict[str, Any]] = OrderedDict()
where tuple[str, float]
is path and mtime.
Items are popped until the dict size is less than needs_import_cache_size
.
So not the number of needs is relevant, but the number of distinct needs.json paths (+mtime combination).
If I'm right, please rephrase to make that clearer.
This PR introduces "lazy and size-bounded" caching for the reading of
needs.json
in theneedimport
directive.This reads/writes to an in-memory cache, keyed on the path and mtime,
and is bounded by the
needs_import_cache_size
(configurable by the user),which sets the maximum number of needs allowed in the cache,
to ensure we do not have large increases in build memory usage.
Note, in #1148 there was discussion of "centralised, pre-caching",
however, that is problematic because: