Carica is a Python application configurator, interfacing between a pure Python config module and a TOML representation of that module.
A huge thank you goes to @sdispater, author of the fantastic tomlkit library, which makes this project's variable docstring retention features possible.
Python applications can be configured in a number of ways, each with its own advantages and limitations.
Common Configuration Methods
Method | Advantages | Problems |
---|---|---|
Environment variables/Command line arguments | | |
TOML config file | | |
Python module with variables | | |
Carica aims to mix the best bits from two of the most convenient configuration methods, acting as an interface between pure python modules and TOML config files.
To use Carica, your application configuration should be defined as a python module.
Example Application
loginApp.py
import cfg
import some_credentials_manager
import re

print(cfg.welcome_message)
new_user_data = {}

for field_name, field_config in cfg.new_user_required_fields.items():
    print(field_config['display'] + ":")
    new_value = input()
    if re.match(field_config['validation_regex'], new_value):  # pattern first, then the input
        new_user_data[field_name] = new_value
    else:
        raise ValueError(f"The value for {field_name} did not pass validation")

some_credentials_manager.create_user(new_user_data)
cfg.py
welcome_message = "Welcome to the application. Please create an account:"
new_user_required_fields = {
    "username": {
        "display": "user-name",
        "validation_regex": "[a-z]+"
    },
    "password": {
        "display": "pw",
        "validation_regex": "\\b(?!password\\b)\\w+"
    },
}
Carica is able to auto-generate a default TOML config file for your application, with the values specified in your python module as defaults:
>>> import cfg
>>> import carica
>>> carica.makeDefaultCfg(cfg)
Created defaultCfg.toml
The above code will produce the following file:
defaultCfg.toml
welcome_message = "Welcome to the application. Please create an account:"
[new_user_required_fields]
[new_user_required_fields.username]
display = "user-name"
validation_regex = "[a-z]+"
[new_user_required_fields.password]
display = "pw"
validation_regex = "\\b(?!password\\b)\\w+"
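To make the mapping from module variables to TOML tables concrete, here is a minimal sketch of how nested dictionaries could be rendered. `render_toml` is a hypothetical helper written for illustration only; it is not Carica's implementation (which builds on tomlkit), and it only handles strings, numbers, booleans and nested dicts with simple key names.

```python
import json


def render_toml(values, prefix=""):
    """Render a dict of str/int/float/bool/dict values as TOML lines (illustration only)."""
    lines = []
    # Plain key/value pairs are written before any sub-table header,
    # otherwise they would be read as members of that table.
    for key, value in values.items():
        if not isinstance(value, dict):
            # json.dumps yields TOML-compatible literals for the types handled here
            lines.append(f"{key} = {json.dumps(value)}")
    # Each nested dict becomes a table named by its dotted path
    for key, value in values.items():
        if isinstance(value, dict):
            table_name = f"{prefix}{key}"
            lines.append(f"[{table_name}]")
            lines.extend(render_toml(value, prefix=f"{table_name}."))
    return lines


config_values = {
    "welcome_message": "Welcome to the application. Please create an account:",
    "new_user_required_fields": {
        "username": {"display": "user-name", "validation_regex": "[a-z]+"},
        "password": {"display": "pw", "validation_regex": "\\b(?!password\\b)\\w+"},
    },
}

rendered_lines = render_toml(config_values)
print("\n".join(rendered_lines))
```

For these example values, the sketch prints the same lines as the defaultCfg.toml listing above.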
Carica will map the variables given in your config file to those present in your python module. Since the config python module contains default values, Carica does not require every variable to be specified:
myConfig.toml
[new_user_required_fields]
[new_user_required_fields.avatar]
display = "profile picture"
validation_regex = "[a-z]+"
>>> import cfg
>>> import carica
>>> carica.loadCfg(cfg, "myConfig.toml")
Config successfully loaded: myConfig.toml
>>> import loginApp
Welcome to the application. Please create an account:
profile picture:
123
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "loginApp.py", line 14, in <module>
raise ValueError(f"The value for {field_name} did not pass validation")
ValueError: The value for avatar did not pass validation
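The loading step can be pictured as overlaying parsed TOML values onto the defaults held by the module. The sketch below assumes the TOML file has already been parsed into a plain dict. `overlay_config` is a hypothetical helper, not part of Carica, and rejecting unknown names is a choice made for this sketch rather than documented Carica behaviour.

```python
import types


def overlay_config(module, overrides):
    """Replace module-level variables with the values given in `overrides`."""
    for name, value in overrides.items():
        if not hasattr(module, name):
            # Assumption for this sketch: unknown names are treated as errors
            raise KeyError(f"Unrecognised config variable: {name}")
        setattr(module, name, value)


# Stand-in for the cfg.py module, holding the default values
cfg = types.ModuleType("cfg")
cfg.welcome_message = "Welcome to the application. Please create an account:"
cfg.new_user_required_fields = {
    "username": {"display": "user-name", "validation_regex": "[a-z]+"},
}

# Stand-in for the parsed contents of myConfig.toml
overrides = {
    "new_user_required_fields": {
        "avatar": {"display": "profile picture", "validation_regex": "[a-z]+"},
    },
}

overlay_config(cfg, overrides)
```

As in the session above, `welcome_message` keeps its default because the config file does not mention it, while `new_user_required_fields` is replaced as a whole, so only the `avatar` field remains.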
When encountering a comment in your python config module, Carica will treat it as a variable 'docstring' in the following cases:
- Inline comments on the same line as a variable declaration
- Line comments immediately preceding a variable declaration ('preceding comments') *Beta feature: still in testing*
- Line comments immediately preceding an existing preceding comment *Beta feature: still in testing*
Carica will consider your variable docstrings when building TOML config files:
cfg.py
# This is shown to the user when the application is first launched
# No validation is performed on this string
welcome_message = "Welcome to the application. Please create an account:"
new_user_required_fields = { # Each field should specify a 'display' (formatted field name shown to users) and a 'validation_regex', which inputted values will be checked against
    "username": {
        "display": "user-name",
        "validation_regex": "[a-z]+"
    },
    "password": {
        "display": "pw",
        "validation_regex": "\\b(?!password\\b)\\w+"
    },
}
>>> import cfg
>>> import carica
>>> carica.makeDefaultCfg(cfg)
Created defaultCfg.toml
The above code will produce the following file:
defaultCfg.toml
# This is shown to the user when the application is first launched
# No validation is performed on this string
welcome_message = "Welcome to the application. Please create an account:"
[new_user_required_fields] # Each field should specify a 'display' (formatted field name shown to users) and a 'validation_regex', which inputted values will be checked against
[new_user_required_fields.username]
display = "user-name"
validation_regex = "[a-z]+"
[new_user_required_fields.password]
display = "pw"
validation_regex = "\\b(?!password\\b)\\w+"
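The inline-comment case can be illustrated with the standard library's `tokenize` module. This is a rough sketch of the idea only: `inline_comments` is a hypothetical function, and its rule (a name at column 0 followed by `=`, with a comment on the same line) is an assumption made here, not Carica's actual parsing logic.

```python
import io
import tokenize


def inline_comments(source):
    """Map top-level variable names to the comment found on their declaration line."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    declared_on_line = {}  # line number -> variable name declared on that line
    comments = {}
    for index, tok in enumerate(tokens):
        if tok.type == tokenize.NAME and tok.start[1] == 0:
            following = tokens[index + 1]
            if following.type == tokenize.OP and following.string == "=":
                declared_on_line[tok.start[0]] = tok.string
        elif tok.type == tokenize.COMMENT and tok.start[0] in declared_on_line:
            # Strip the leading '#' and surrounding whitespace
            comments[declared_on_line[tok.start[0]]] = tok.string.lstrip("#").strip()
    return comments


source = (
    'welcome_message = "Welcome"  # Shown at launch\n'
    'new_user_required_fields = {  # One entry per field\n'
    '    "username": {"display": "user-name"},\n'
    '}\n'
    'timeout = 5\n'
)

found = inline_comments(source)
print(found)
```

Variables without an inline comment, such as `timeout` here, are simply absent from the result.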
Carica will handle non-primitive variable types according to a very simple design pattern:
class SerializableType:
    def serialize(self, **kwargs): ...

    @classmethod
    def deserialize(cls, data, **kwargs): ...
Any type which defines `serialize` and `deserialize` member methods will be automatically serialized during config generation, and deserialized on config loading.
- `serialize` must return a representation of your object with primitive types: types which can be written to TOML.
- `deserialize` must be a class method, and should transform a serialized object representation into a new object.
Carica enforces this pattern on non-primitive types using the `SerializableType` type protocol, which allows for duck-typed serializable types. This protocol is exposed for use with `isinstance`.
Projects which prefer strong typing may implement the `carica.ISerializable` interface to enforce this pattern with inheritance. Carica will validate serialized objects against the `carica.PrimativeType` type alias, which is also exposed for use.
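For readers unfamiliar with duck-typed protocols, the following self-contained sketch shows how an `isinstance` check against a protocol can work using `typing.Protocol`. `SupportsSerialize` and `Point` are hypothetical stand-ins defined here for illustration; the sketch does not import Carica and makes no claim about how Carica's own protocol is defined.

```python
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class SupportsSerialize(Protocol):
    """Stand-in protocol: anything with serialize and deserialize qualifies."""

    def serialize(self, **kwargs) -> Any: ...

    @classmethod
    def deserialize(cls, data, **kwargs) -> Any: ...


class Point:
    """Does not inherit from the protocol; it only provides matching methods."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def serialize(self, **kwargs):
        # Only primitive types appear in the serialized form
        return {"x": self.x, "y": self.y}

    @classmethod
    def deserialize(cls, data, **kwargs):
        return cls(data["x"], data["y"])


point = Point(1, 2)
restored = Point.deserialize(point.serialize())

print(isinstance(point, SupportsSerialize))
print(isinstance(object(), SupportsSerialize))
```

A runtime protocol check of this kind only verifies that the methods exist, not that their signatures or return values are correct.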
cfg.py
class MySerializableClass:
    def __init__(self, myField):
        self.myField = myField

    def serialize(self, **kwargs):
        return {"myField": self.myField}

    @classmethod
    def deserialize(cls, data, **kwargs):
        return cls(data["myField"])

mySerializableVar = MySerializableClass("hello")
>>> import cfg
>>> import carica
>>> carica.makeDefaultCfg(cfg)
Created defaultCfg.toml
The above code will produce the following file:
defaultCfg.toml
[mySerializableVar]
myField = "hello"
myConfig.toml
[mySerializableVar]
myField = "some changed value"
>>> import cfg
>>> import carica
>>> carica.loadCfg(cfg, "myConfig.toml")
Config successfully loaded: myConfig.toml
>>> cfg.mySerializableVar.myField
'some changed value'
Carica provides serializable models that are ready to use (or extend) in your code. These models can be found in the `carica.models` package, which is imported by default.
`SerializableDataClass` removes the need to write boilerplate serializing functionality for dataclasses. This class is intended to be extended, adding definitions for your dataclass's fields. Extensions of `SerializableDataClass` must themselves be decorated with `@dataclasses.dataclass` in order to function correctly.
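To show the kind of boilerplate such a base class can remove, here is a minimal sketch built only on the standard `dataclasses` module. `DataClassSerializerSketch` is a hypothetical stand-in written for this example; Carica's real class may behave differently, for instance in how it treats nested or non-primitive fields.

```python
from dataclasses import asdict, dataclass


class DataClassSerializerSketch:
    """Hypothetical base class: generic serialize/deserialize for dataclasses."""

    def serialize(self, **kwargs):
        # asdict converts the dataclass (and any nested dataclasses) to plain dicts
        return asdict(self)

    @classmethod
    def deserialize(cls, data, **kwargs):
        # Field names in the serialized dict match the constructor's parameters
        return cls(**data)


@dataclass
class UserDataField(DataClassSerializerSketch):
    name: str
    validation_regex: str


field = UserDataField(name="user-name", validation_regex="[a-z]+")
serialized = field.serialize()
restored = UserDataField.deserialize(serialized)

print(serialized)
```

The subclass must still be decorated with `@dataclass`, since `asdict` and the generated constructor only exist for decorated classes.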
`SerializablePath` is an OS-agnostic filesystem path, extending `pathlib.Path`. The serializing/deserializing behaviour added by this class is minimal: a serialized `SerializablePath` is simply the string representation of the path, for readability. All other behaviour of `pathlib.Path` applies. For example, `SerializablePath` can be instantiated from a single path: `SerializablePath("my/directory/path")`, or from path segments: `SerializablePath("my", "file", "path.toml")`.
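The two construction styles and the string round trip can be tried with the standard library alone. This sketch uses `pathlib.PurePosixPath` rather than Carica's class, purely so that the printed string is the same on every operating system; that substitution is a choice made for this example.

```python
from pathlib import PurePosixPath

# The same path, built from one string and from separate segments
single = PurePosixPath("my/file/path.toml")
segmented = PurePosixPath("my", "file", "path.toml")

# A string representation is all that needs to be stored in a config file
serialized = str(segmented)
restored = PurePosixPath(serialized)

print(serialized)
```

Because the serialized form is just a string, it stays readable and editable in a TOML file.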
`datetime.datetime` is already considered a primitive type by TomlKit, and so no serializability needs to be added for you to use this class in your configs. However, `datetime.timedelta` is not serializable by default. `SerializableTimedelta` solves this issue as a serializable subclass. As a subclass, all `timedelta` behaviour applies, including the usual constructor. In addition, `SerializableTimedelta.fromTimedelta` is a convenience class method that accepts a `datetime.timedelta` and constructs a new `SerializableTimedelta` from it.
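A serializable `timedelta` subclass can be sketched in a few lines. `TimedeltaSketch` below is a hypothetical stand-in, and the serialized layout it uses (days, seconds, microseconds) is an assumption chosen for this example; the format Carica actually writes is not stated here.

```python
from datetime import timedelta


class TimedeltaSketch(timedelta):
    """Hypothetical serializable timedelta; the usual constructor still applies."""

    def serialize(self, **kwargs):
        # Assumed layout: the three fields timedelta stores internally
        return {
            "days": self.days,
            "seconds": self.seconds,
            "microseconds": self.microseconds,
        }

    @classmethod
    def deserialize(cls, data, **kwargs):
        return cls(**data)

    @classmethod
    def fromTimedelta(cls, td):
        # Copy an existing timedelta into the serializable subclass
        return cls(days=td.days, seconds=td.seconds, microseconds=td.microseconds)


timeout = TimedeltaSketch(minutes=5)
serialized = timeout.serialize()
restored = TimedeltaSketch.deserialize(serialized)
converted = TimedeltaSketch.fromTimedelta(timedelta(hours=1))

print(serialized)
```

Instances of the subclass still compare equal to ordinary `timedelta` values of the same duration.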
The recommended usage pattern for `SerializableDataClass` is to keep your models in a separate module/package, allowing for 'schema' definition as Python code. This pattern is not required: models can also be defined directly in your config module.
configSchema.py
from carica.models import SerializableDataClass
from dataclasses import dataclass
@dataclass
class UserDataField(SerializableDataClass):
    name: str
    validation_regex: str
config.py
from carica.models import SerializablePath, SerializableTimedelta
from configSchema import UserDataField
from datetime import datetime
new_user_required_fields = [
    UserDataField(
        name="user-name",
        validation_regex="[a-z]+"
    ),
    UserDataField(
        name="password",
        validation_regex="\\b(?!password\\b)\\w+"
    )
]
database_path = SerializablePath("default/path.csv")
birthday = datetime(day=1, month=1, year=1500)
connection_timeout = SerializableTimedelta(minutes=5)
- Preceding comments: This functionality is 'complete' in that it functions as intended and passes all unit tests. However, an issue needs to be worked around before the feature can be enabled: in order to disambiguate between variables and table fields, the TOML spec requires that arrays and tables be placed at the end of a document. Carica currently depends upon documents being rendered with variables appearing in the same order as they appear in the Python config module, which is not guaranteed. This leads to trailing and otherwise misplaced preceding comments.
- Config mutation: Carica should allow for loading an existing config, changing some values, and then updating the TOML document with new values. This should retain all formatting from the original document, including variable ordering and any comments that are not present in the python module.
- No support for schema migration
- No support for asynchronous object serializing/deserializing
- Imperfect estimation of variables defined in Python modules: Listing the variables defined within a scope is not a known feature of Python, and so Carica estimates this information by iterating over the tokens in your module. Carica does not build an AST of your Python module.
  This means that certain name definition structures will result in false positives/negatives. This behaviour has not been extensively tested, but one such false positive has been identified:
  When invoking a callable (such as a class or function) with a keyword argument on a new, unindented line, the argument name will be falsely identified as a variable name. E.g:
      my_variable = dict(key1=value1,
      key2=value2)
  produces `my_variable` and `key2` as variable names.
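The false positive described in the last item can be reproduced with a deliberately naive token scan. The heuristic below (any name at column 0 followed by `=`) is an assumption made for illustration and is not Carica's actual implementation. An AST-based lookup is included purely for contrast, to show why the two approaches disagree on this input.

```python
import ast
import io
import tokenize

# A keyword argument placed on a new, unindented line
SOURCE = "my_variable = dict(key1=1,\nkey2=2)\n"


def names_from_tokens(source):
    """Naive heuristic: a NAME token at column 0 followed by '=' is a variable."""
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    names = []
    for tok, following in zip(tokens, tokens[1:]):
        if (tok.type == tokenize.NAME and tok.start[1] == 0
                and following.type == tokenize.OP and following.string == "="):
            names.append(tok.string)
    return names


def names_from_ast(source):
    """For contrast: collect the targets of top-level assignment statements."""
    names = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            names.extend(t.id for t in node.targets if isinstance(t, ast.Name))
    return names


token_names = names_from_tokens(SOURCE)
ast_names = names_from_ast(SOURCE)

print(token_names)
print(ast_names)
```

The token scan reports `key2` because, at the token level, a keyword argument at column 0 looks the same as an assignment. Indenting the continuation line is enough to avoid the false positive under this heuristic.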