Lazy sources #82

beasteers · 2020-04-24T14:36:39Z

This implements a discussion had in #80.

Basically, ConfigSource control the entire concept of data loading and state, so you can have partially loaded datasets and only fall back on lower priority sources when the keys are missing.

So you can do:

import confuse
config = confuse.Configuration('mypkg', source=[
    confuse.EnvSource('MYPKG'),
    '~/mypkg',
    confuse.YamlSource('/etc/mypkg-settings.yml'),
])
config.sources

outputs

[EnvSource({}, prefix=MYPKG, separator=__),
 YamlSource({'a': 5}, filename=/Users/bsteers/mypkg/config.yaml, default=False),
 YamlSource([Source doesn't exist], filename=/etc/mypkg-settings.yml, default=False),

 YamlSource([Source doesn't exist], filename=/Users/bsteers/.config/mypkg/config.yaml, default=False)]

to demonstrate ConfigSource more directly (pulled from the other PR):

# equivalent
c = ConfigSource.of('asdf.yml')
c = YamlSource('asdf.yml')
# lazy loading
c = ConfigSource.of('asdfa.yml')
print('loaded?', c.loaded, c)
print('a =', c['a'])
print('loaded?', c.loaded, c)

outputs

loaded? False ConfigSource({}, asdfa.yml, False)
a = 5
loaded? True ConfigSource({'a': 5, 'b': 6, 'file': 'asdfa.yml'}, asdfa.yml, False)

beasteers · 2020-04-24T16:11:28Z

Things to fix:

~~python2 functools.wraps doesn't work for methods - hmmm might need to update func name/doc manually or something~~
~~flake "prettiness" errors~~
~~add configsource tests~~
For some reason, in a very specific scenario (tox nosetests w/ python 2.7), calling dict(source) doesn't call dict.keys as expected. From what I can tell, in that specific case, it seems to be calling dict.keys and not the overridden ConfigSource.keys (where we call load).

# python 2.7
# runs fine
nosetests tests/test-sources.py
# throws error
tox tests/test-sources.py

I spent an hour this morning trying to pin this down, but I still can't figure out how the problem is arising. It's not exclusively a py2.7 problem, because I can get it working in a py2 environment using nosetests/pytest

This shouldn't ever be an issue in production tho because I changed dict(source) => dict(source.load()) in resolve to make sure this case is covered, but it's just a weird issue that's bugging me.

… luck :/

sampsyo

This seems cool so far; thanks!! I jotted a few low-level questions inline.

I realize that this is a lot to ask, but is there any chance I could get your help splitting this up into one feature at a time? I would love to do the lazy source thing first and make sure that causes no problems with existing apps that use LazyConfig (i.e., hopefully it works exactly the same way as the old mechanism for laziness). Then we can merge the more convenient Config construction thing next, based on that. Would that make sense?

sampsyo · 2020-04-26T15:24:02Z

confuse.py

+
+
+def _load_first(func):
+    '''Call self.load() before the function is called - used for lazy source


Can we stick with the PEP 8-recommended triple-double quotes for docstrings?

Yeah sure. I don't like the style of unnecessary double quotes when single will do, but since it's pep8, ok.

sampsyo · 2020-04-26T15:27:27Z

confuse.py

+    try:
+        return wraps(func)(inner)
+    except AttributeError:
+        # in v2 they don't ignore missing attributes


For a little more context: what does the AttributeError on Python 2 look like?

Traceback (most recent call last): File "/home/travis/build/beetbox/confuse/example/__init__.py", line 3, in <module> import confuse File "/home/travis/build/beetbox/confuse/confuse.py", line 157, in <module> class ConfigSource(dict): File "/home/travis/build/beetbox/confuse/confuse.py", line 213, in ConfigSource __getitem__ = _load_first(dict.__getitem__) File "/home/travis/build/beetbox/confuse/confuse.py", line 145, in _load_first @wraps(func) File "/opt/python/2.7.15/lib/python2.7/functools.py", line 33, in update_wrapper setattr(wrapper, attr, getattr(wrapped, attr)) AttributeError: 'method_descriptor' object has no attribute '__module__'

In py3, they skip missing attributes in functools.wraps.

sampsyo · 2020-04-26T15:38:17Z

confuse.py

+            self.filename, self.default)
+
+    @property
+    def exists(self):


I'm a little confused about why this is necessary—shouldn't it be an error to try using a source that doesn't have a file that exists?

The idea of a source that doesn't exist comes from the desire to create a config dir that doesn't exist.

The full list of sources is created without touching the filesystem, and then on read, it will go thru each source and do any file loading and missing directory creation. If we don't want to be creating a config directory, then we can throw away sources who's file doesn't exist.

As stated elsewhere - by default yaml files will throw an error if they don't exist.

this abstraction was to let other sources (EnvSource, ConfigSource({}), CustomSource) define whether they should be considered existing or not. I initially added this for a more informative __repr__, but it may be useful elsewhere

sampsyo · 2020-04-26T15:40:27Z

confuse.py

+        '''Ensure that the source is loaded.'''
+        if not self.loaded:
+            self.config_dir()
+            self.loaded = self._load() is not False


Similarly, I'm a tad confused about why this could ever be false… shouldn't load either always succeed or throw an exception? It seems like things could get pretty confusing if sources are allowed to silently fail.

Or if the idea is that missing files should be loadable and just silently produce no values (i.e., an empty dict), that's cool too… but perhaps that should be optional? (And it should still mark the source as "loaded" so that we don't keep trying?)

That was a mechanism to allow a source subclass to signify that a source wasn't loaded and to try again next time. It's only if a _load function explicitly returns False.

It seems like things could get pretty confusing if sources are allowed to silently fail.

Whether they are silently ignored is optional and is disabled by default.

(YamlSource(path, optional=True) will ignore a source if it's missing, YamlSource(path) will throw an error like normal).

I added it in because that's how the user config is handled (the source is kept so it will create the config dir on read).

the idea is that missing files should be loadable and just silently produce no values

yes If a user wants an optional source, they can set optional=True

it should still mark the source as "loaded" so that we don't keep trying?

The reason for the retry mechanism is to allow a source to be loaded whenever it becomes available. As a general mechanism, it's really flexible and powerful, but I agree that allowing retry and delayed loading could have unexpected results in some situations so it should probably be disabled by default.

sampsyo · 2020-04-26T15:44:53Z

confuse.py

-        a `ConfigSource` object. This lets a function accept either type
-        of object as an argument.
-        """
+    def is_of_type(cls, value):


This mechanism for automatically detecting which type of source to use is a little too "magical," I think, for a first cut. If it's OK with you, can we leave this out for now? (Concretely, an issue with searching for all subclasses of a given class is that it is implicitly extensible—anyone who creates a subclass of ConfigSource will automatically/transparently be extending this overload system, which can be somewhat surprising.)

Perhaps as a more convenient alternative to explicitly using YamlSource we can just hard-code some API to use this that class exclusively?

Yeah I tend to agree - to be more explicit, we can just mimic as_template.

Currently, by default any class extending ConfigSource would have to extend is_of_type in order to use source inference so there isn't as much risk.

My bigger concern is that resolution order is (?) undefined (?) or at least not immediately apparent.

Tbh it was just an easy way to try to have ConfigSource be externally extensible without modifying the confuse core, but I agree that it isn't optimal and I'm fine with looking at an alternative method.

sampsyo · 2020-04-26T15:47:00Z

confuse.py

+        self.update(self._loader(self.filename))
+
+
+class EnvSource(ConfigSource):


This seems pretty cool! But it's a separate feature—maybe we can discuss it separately? (We can also look into writing some documentation for the new variable-based route to configuration…)

yes this can be added in a separate PR. it was just a simple demonstration of source extensibility.

sampsyo · 2020-04-26T15:49:01Z

confuse.py


    def resolve(self):
-        return ((dict(s), s) for s in self.sources)
+        return ((dict(s.load()), s) for s in self.sources)


Perhaps dict(s) on a source should implicitly call load? (In fact, maybe it already does?)

Yeah it totally does - and all tests work fine locally, except in tox/travis python 2.7 tests (but not python 2 w/ pytest/nosetests)...

I'm hoping to get rid of this tho. See comment below

sampsyo · 2020-04-26T15:52:02Z

confuse.py

-        self.sources.append(ConfigSource.of(obj))
+        src = ConfigSource.of(obj)
+        self.sources.append(src)
+        return src


It looks like add and set (below) both now return the newly-created configuration source. That seems cool enough, but can you perhaps elaborate a bit on the use case you had in mind? In a way it doesn't seem that hard for clients to keep a reference to their own source that they pass in, if that's wha they want. (Or maybe this is used somewhere else internally?) In any case, it would be great to add docs for this new behavior too if we need it…

This was more for sources that are implicitly created internally using ConfigSource.of:

config.add({'a': 5}) # calls ConfigSource.of(value)

People can access via config.sources[i], but I figured it's less error prone and there isn't really harm to returning it. It only came up because a test was trying to reference by index which is currently dependent on the existence of files.

sampsyo · 2020-04-26T15:58:02Z

Ah, I see belatedly that you already commented on that extra dict(...) thing! Oops. That's super weird. My best guess is that the dict constructor might just iterate over the keys, which would call __iter__ but not __keys__? Pretty strange…

beasteers · 2020-04-26T16:39:09Z

Ah, I see belatedly that you already commented on that extra dict(...) thing! Oops. That's super weird. My best guess is that the dict constructor might just iterate over the keys, which would call iter but not keys? Pretty strange…

But I'm wrapping __getitem__, __iter__, keys, values....... so it should load prior to all of those

I tried printing out in __getattribute__ and:

in the passing tests, it prints out: keys, keys, loaded, load, etc.
but in the failed test, it only prints out: keys

~~So that implies that it's just calling dict.keys and not our overridden method - which is really weird and idk why it'd be doing that.~~

edit: from what I can tell the problem seems to be that it's getting .keys, but not calling it.

Adding this caused the test to pass

class ConfigSource(dict):
    def __getattribute__(self, k):
        x = super(ConfigSource, self).__getattribute__(k)
        if k == 'keys':
            print(k, x())
        return x

beasteers added 9 commits April 21, 2020 23:01

adding config files as an argument

fe3b225

fixing formatting errors and maybe a few other errors

3b62991

Merge branch 'explicit_config_files' into lazy-source

58cfdf4

fixing default source error

9085405

adding lazy config source

0a0f5d2

Merge branch 'explicit_config_files' into lazy-source

8207f2b

moved automatic configs to the bottom of the stack like before

fc91ed0

adding env source

c913d71

fixed skip missing, repr, expanduser

28a6330

beasteers added 6 commits April 25, 2020 01:42

fixing travis errors

2b23f40

removed add_base, add+set return source, added tests

0ceb9c3

wtf .. "works fine on my machine !!" ugh

ea65b1e

need to check how dict is being cast on travis py2 (p2 works for me)

d043718

tried to figure dict cast thing for like an hour this morning with no…

2692d49

… luck :/

oops add docs back in

98592ef

sampsyo reviewed Apr 26, 2020

View reviewed changes

beasteers mentioned this pull request Apr 26, 2020

added basic lazy config - try #2 #83

Closed

beasteers closed this Jun 8, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Lazy sources #82

Lazy sources #82

beasteers commented Apr 24, 2020

beasteers commented Apr 24, 2020 •

edited

Loading

sampsyo left a comment

sampsyo Apr 26, 2020

beasteers Apr 26, 2020

sampsyo Apr 26, 2020

beasteers Apr 26, 2020 •

edited

Loading

sampsyo Apr 26, 2020

beasteers Apr 26, 2020 •

edited

Loading

sampsyo Apr 26, 2020

beasteers Apr 26, 2020 •

edited

Loading

sampsyo Apr 26, 2020

beasteers Apr 26, 2020 •

edited

Loading

sampsyo Apr 26, 2020

beasteers Apr 26, 2020 •

edited

Loading

sampsyo Apr 26, 2020

beasteers Apr 26, 2020 •

edited

Loading

sampsyo Apr 26, 2020

beasteers Apr 26, 2020 •

edited

Loading

sampsyo commented Apr 26, 2020

beasteers commented Apr 26, 2020 •

edited

Loading



		def _load_first(func):
		'''Call self.load() before the function is called - used for lazy source

		self.update(self._loader(self.filename))


		class EnvSource(ConfigSource):

Lazy sources #82

Lazy sources #82

Conversation

beasteers commented Apr 24, 2020

beasteers commented Apr 24, 2020 • edited Loading

sampsyo left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

beasteers Apr 26, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

beasteers Apr 26, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

beasteers Apr 26, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

beasteers Apr 26, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

beasteers Apr 26, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

beasteers Apr 26, 2020 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

beasteers Apr 26, 2020 • edited Loading

Choose a reason for hiding this comment

sampsyo commented Apr 26, 2020

beasteers commented Apr 26, 2020 • edited Loading

beasteers commented Apr 24, 2020 •

edited

Loading

beasteers Apr 26, 2020 •

edited

Loading

beasteers Apr 26, 2020 •

edited

Loading

beasteers Apr 26, 2020 •

edited

Loading

beasteers Apr 26, 2020 •

edited

Loading

beasteers Apr 26, 2020 •

edited

Loading

beasteers Apr 26, 2020 •

edited

Loading

beasteers Apr 26, 2020 •

edited

Loading

beasteers commented Apr 26, 2020 •

edited

Loading