Contains documents intended to record insights and opinions on software evolution and architecture.
Sometimes the opinions take the form of source code files implementing a proposal.
- Code generation
- Code generation with support for detecting changes to generated code artifacts. See https://github.com/aaronicsubstances/code-augmentor for details.
My ideal software architecture (which comprises code and data architectures) is one that meets the following criteria (in addition to all the excellent advice on Wikipedia):
- It has been designed with the necessary abstractions to ensure that its implementation can be changed incrementally, such that stakeholders and the public do not have to make radical changes to cope with any implementation changes.
- Engineers can discover the architecture on their own by reading the implementation.
- Its design makes it easier to transition to a distributed system.
I think that three parameters, namely lines of code (test code included), number of developers, and time span for project completion, can together be used by an organisation to determine how feasible it is to rewrite a module from scratch (part of which involves understanding the existing code). Once an organisation settles on threshold values for these parameters, it can decide whether or not a given module can be easily rewritten from scratch.
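As a rough sketch of how such a decision could be mechanised: the three parameters above only become comparable once a productivity rate is assumed, so the `loc_per_dev_month` field below, like every name and number in the example, is a hypothetical value an organisation would have to supply itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewriteBudget:
    """Thresholds an organisation settles on (all values are hypothetical)."""
    developers: int            # people available for the rewrite
    months: float              # acceptable time span for completion
    loc_per_dev_month: float   # assumed sustained output, test code included


def max_rewritable_loc(budget: RewriteBudget) -> float:
    """Largest module size (lines, tests included) the budget can absorb."""
    return budget.developers * budget.months * budget.loc_per_dev_month


def is_easily_rewritable(module_loc: int, budget: RewriteBudget) -> bool:
    """True if the module fits within the organisation's rewrite budget."""
    return module_loc <= max_rewritable_loc(budget)


budget = RewriteBudget(developers=2, months=3, loc_per_dev_month=1500)
print(is_easily_rewritable(6_000, budget))   # 6000 lines against a 9000-line budget
print(is_easily_rewritable(20_000, budget))  # 20000 lines against a 9000-line budget
```

The linear model is deliberately naive; its only purpose is to show that the classification of a module follows mechanically once the parameter values are fixed.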
I envision software as increasing in complexity according to the following stages:
- one module, which can be easily rewritten from scratch.
- multiple modules, in which each module can be easily rewritten from scratch independent of the others, and each shared module consists of "relocatable source code files".
    - A shared module consists of relocatable source code files if it can be duplicated under a different namespace or version, for use in the same process or machine together with the original.
    - The concept of relocatable source code files is similar to that of relocatable binary/object files, and is the key to enabling shared modules to be easily rewritten from scratch.
- one or more modules cannot be easily rewritten from scratch without a major software development effort.
    - evolution to a distributed system seems inevitable, because of pressure from external inputs (typically web requests) on network infrastructure, computer hardware and software resources, and database storage facilities.
- distributed system involving multiple code deployments
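The idea of relocatable source code files from the second stage can be illustrated with a small sketch. It assumes a hypothetical shared module whose source never refers to itself by an absolute name and keeps its state at module level; `load_under` and the namespace names are made up for the example.

```python
import types

# Source text of a hypothetical shared module. It is relocatable because it
# never imports itself by an absolute name and keeps its state in module-level
# variables rather than in a process-wide registry.
SHARED_MODULE_SOURCE = '''
_counter = 0

def next_id():
    global _counter
    _counter += 1
    return f"{__name__}:{_counter}"
'''


def load_under(namespace: str, source: str) -> types.ModuleType:
    """Load the same source as an independent module under a chosen name."""
    module = types.ModuleType(namespace)
    exec(compile(source, f"<{namespace}>", "exec"), module.__dict__)
    return module


original = load_under("shared", SHARED_MODULE_SOURCE)
relocated = load_under("vendored.shared_v2", SHARED_MODULE_SOURCE)

print(original.next_id())   # shared:1
print(original.next_id())   # shared:2
print(relocated.next_id())  # vendored.shared_v2:1
```

Both copies run in the same process without interfering with each other, which is what would allow a rewritten version of a shared module to be introduced alongside the original.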
I think that software belonging to the first two stages can dispense with serious software architecture considerations, because:
- The number of lines of code is small enough to be its own documentation, from which a software architecture can be extracted.
- It can always be rewritten from scratch as a last resort, if its architecture is found to be no longer satisfactory.
About abstractions that enable programmers to make incremental changes within the constraints of the laws of conservation of familiarity and organisational stability:
- build data architecture on the property-graph model, which is a near equivalent of the entity-relationship model, as defined in https://github.com/aaronicsubstances/shrewd-evolver/blob/master/Data-Modelling.md
    - This approach seeks to leverage the success story of SQL databases, whose flexibility comes from multisets and the entity-relationship model.
- build code architecture on the assumption that all processing occurs similarly to how Apache/PHP and AWS Lambda functions process HTTP requests: a single process/thread is created to handle an incoming HTTP request representing the input of I/O, and the output of I/O will be contained in the corresponding HTTP response.
    - This approach seeks to leverage the fact that all I/O can be converted into network requests, i.e. from PCI Express to the Internet.
    - Another takeaway is that if an architecture limits its use of memory to local variables, serializable memory, scheduled timeouts and I/O callbacks, then it will result in codebases which are structured in similar ways.
- present software to stakeholders and public as a single abstraction, regardless of the perspective programmers have of the software from its architecture.
- separate concerns of number of modules (aka microservices) from number of deployments, by deploying as a single OS process (possibly a supervisor process over child processes) for as long as possible.
- separate concerns of number of modules from number of databases, by managing data with ACID transactions inside a single homogeneous database for as long as possible.
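A minimal sketch of the request-handling assumption above, using plain dictionaries in place of real HTTP messages. The `/visits` endpoint and the shape of the state are hypothetical; the point is only that whatever outlives a request must pass through serializable memory, while everything else lives in local variables.

```python
import json


def handle_request(request: dict, state_json: str) -> tuple[dict, str]:
    """Handle one request as if a fresh process had been started for it.

    Everything that must survive the request travels through `state_json`;
    everything else lives in local variables and is gone on return.
    """
    state = json.loads(state_json)  # rehydrate serializable memory
    if request["method"] == "POST" and request["path"] == "/visits":
        name = request["body"]["name"]
        state["visits"][name] = state["visits"].get(name, 0) + 1
        response = {"status": 200, "body": {"count": state["visits"][name]}}
    else:
        response = {"status": 404, "body": {"error": "not found"}}
    return response, json.dumps(state)


state_json = json.dumps({"visits": {}})
for _ in range(2):
    response, state_json = handle_request(
        {"method": "POST", "path": "/visits", "body": {"name": "ada"}},
        state_json,
    )
print(response)  # {'status': 200, 'body': {'count': 2}}
```

Because the handler holds no hidden state, the same function could later run in one process or in many without changes to its body.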
About measures which make it easier to discover an architecture from its implementation:
- enforce code architecture by implementing modular boundaries with serializable, HTTP-like communication protocols. See https://github.com/aaronicsubstances/cskabomu for details.
- encode the database schema and the entity-relationship or property-graph model into data storage in such a way that they can be extracted by database reverse-engineering tools.
    - This is especially important for NoSQL databases, which usually lack a database schema.
    - For simple SQL designs, the database schema may approximate the entity-relationship model.
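The first measure above, modular boundaries built on serializable, HTTP-like protocols, might look roughly like the following in-process sketch. This is not the API of the linked project; `ModuleBoundary`, the route path and the two modules are hypothetical names chosen for illustration.

```python
import json
from typing import Callable, Dict

Handler = Callable[[dict], dict]


class ModuleBoundary:
    """In-process router that forces every call through serialization."""

    def __init__(self) -> None:
        self._routes: Dict[str, Handler] = {}

    def register(self, path: str, handler: Handler) -> None:
        self._routes[path] = handler

    def call(self, path: str, body: dict) -> dict:
        # Serializing in both directions guarantees that modules exchange
        # plain data only, never live object references.
        request = json.loads(json.dumps({"path": path, "body": body}))
        handler = self._routes.get(request["path"])
        if handler is None:
            response = {"status": 404, "body": None}
        else:
            response = {"status": 200, "body": handler(request["body"])}
        return json.loads(json.dumps(response))


# Hypothetical "pricing" module exposing one endpoint.
def quote(body: dict) -> dict:
    return {"total": body["unit_price"] * body["quantity"]}


boundary = ModuleBoundary()
boundary.register("/pricing/quote", quote)

# Hypothetical "orders" module calling across the boundary.
print(boundary.call("/pricing/quote", {"unit_price": 5, "quantity": 3}))
# {'status': 200, 'body': {'total': 15}}
```

Since every route is registered in one place and every payload is plain data, a reader can enumerate the module boundaries directly from the code, and an attempt to pass a non-serializable object fails immediately with a `TypeError`.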
About measures which make transition to distributed systems easier:
- deploy the code as two nearly-identical processes (or process groups).
- manage the data as two nearly-identical horizontal data partitions, but such that local ACID transactions can be conducted across them.
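One possible reading of the data measure above, sketched with Python's built-in `sqlite3` module: two tables with identical schemas stand in for the two horizontal partitions, and a single local transaction spans both. The table names, the even/odd routing rule and the account data are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Two partitions with identical schemas inside one database (hypothetical names).
for partition in ("accounts_p0", "accounts_p1"):
    conn.execute(
        f"CREATE TABLE {partition} ("
        "id INTEGER PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )


def partition_for(account_id: int) -> str:
    """Route an account to a partition; the even/odd split is arbitrary."""
    return f"accounts_p{account_id % 2}"


def transfer(src: int, dst: int, amount: int) -> None:
    """Move funds between accounts that may live in different partitions."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            f"UPDATE {partition_for(dst)} SET balance = balance + ? WHERE id = ?",
            (amount, dst),
        )
        conn.execute(
            f"UPDATE {partition_for(src)} SET balance = balance - ? WHERE id = ?",
            (amount, src),
        )


def balance(account_id: int) -> int:
    row = conn.execute(
        f"SELECT balance FROM {partition_for(account_id)} WHERE id = ?",
        (account_id,),
    ).fetchone()
    return row[0]


with conn:
    conn.execute("INSERT INTO accounts_p0 VALUES (2, 100)")
    conn.execute("INSERT INTO accounts_p1 VALUES (1, 100)")

transfer(src=2, dst=1, amount=30)
try:
    transfer(src=2, dst=1, amount=500)  # would overdraw account 2
except sqlite3.IntegrityError:
    pass  # the credit to account 1 is rolled back along with the failed debit

print(balance(1), balance(2))  # 130 70
```

While both partitions share one database, cross-partition consistency costs nothing extra; the routing function already marks the seam along which the data could later be split across machines.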