Your linked blog post "Abstraction and the Bayesian Occam's Razor" is very interesting. I'll play back my understanding to you, to see if I'm approximately following and summarising your thesis.
Context:
When programming we attempt to design an effective abstraction that models some domain. When designing this abstraction, there are trade-offs between reducing the amount of code required, enabling reuse, reducing coupling, and retaining the flexibility to accommodate future use cases.
Key Problem:
How do we design an abstraction for our domain model?
Claim 1:
Apply the "Minimum Description Length" (MDL) model selection principle: prefer a domain model embedded in the shortest program able to recreate the dataset of domain knowledge.
Applying MDL model selection will result in an abstraction for the domain model that is both smaller -- giving less code to maintain -- and more likely to generalize to future unknown use cases.
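To check my own understanding of Claim 1, here is a tiny sketch of mine (not from your post) where raw source bytes stand in, very crudely, for description length:

    # Two "programs" that both recreate the same (situation, expected behaviour)
    # data; the shorter one is preferred and, per the claim, should also
    # generalize better. The data and rules are made up purely for illustration.
    import inspect

    DATA = {1: 5, 2: 10, 3: 15, 4: 20, 5: 26}  # e.g. weight_kg -> shipping cost

    def model_lookup(weight_kg):
        # Memorise every observed case verbatim.
        return {1: 5, 2: 10, 3: 15, 4: 20, 5: 26}[weight_kg]

    def model_rule(weight_kg):
        # A general rule plus one recorded exception.
        return 26 if weight_kg == 5 else 5 * weight_kg

    for model in (model_lookup, model_rule):
        assert all(model(k) == v for k, v in DATA.items())
        print(model.__name__, len(inspect.getsource(model)), "source bytes")

Byte counts of Python source obviously aren't a principled code length, but they're enough to see which way the comparison points.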
Complication:
Applying the MDL model selection principle relies on having access to a dataset of domain knowledge. We can think of this dataset of domain knowledge as a list of (situation, expected behaviour) pairs -- cf. a labelled supervised learning dataset, or a gigantic list of requirements. Unfortunately, in typical software projects, no such explicit dataset cataloguing the requirements or expected behaviour in each situation exists.
Claim 2:
We can use the automated test suite as a proxy for the dataset of domain knowledge. When designing our abstraction we should prefer a domain model where the combined size of the logic for the domain model and the size of the corresponding test suite* is minimal.
* with the important caveat that "just cutting out the tests, or removing other safeties like strict types doesn't give you a lower MDL, in that case, you're missing the descriptions of important parts of your data or knowledge".
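And a rough sketch of how I picture Claim 2's measurement (the gzip proxy and the file names are my own invention, not something you proposed):

    # Crude two-part description length: domain model source plus test suite
    # source, compressed together. Per your caveat, simply deleting the tests
    # lowers the number, but only by discarding part of the domain knowledge.
    import gzip
    from pathlib import Path

    def description_length(*paths):
        blob = b"".join(Path(p).read_bytes() for p in paths)
        return len(gzip.compress(blob))

    # Hypothetical comparison between two candidate designs of the same model:
    # description_length("pricing_v1.py", "test_pricing_v1.py")
    # description_length("pricing_v2.py", "test_pricing_v2.py")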
Thanks for engaging with my wild ideas. I do think that's a good description, yes. There are corollaries like the Dependency Length Minimization I mentioned, which I think helps lower the MDL. Putting related things together allows using smaller symbols to describe things because more can be inferred from the adjacent context.
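A made-up Python sketch of what I mean by that adjacency point:

    # When the data and the rule that uses it sit next to each other, short
    # names are unambiguous; pull them apart and you need longer names (or
    # comments) to re-establish the context, which grows the description.
    def invoice_total(line_items, tax_rate):
        total = 0.0
        for item in line_items:        # "item" is clear from the loop right here
            total += item["net"] * (1 + tax_rate)
        return total

    print(invoice_total([{"net": 100.0}, {"net": 50.0}], 0.2))  # 180.0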
In my mind this is also related to concepts like scientific notation. Minimum description length is not just about minimizing the number of operations, the number of branches, or the degrees of freedom of your program; taking variable width into account also makes a difference. Using a timestamp YYMMDDHHMMSS when you only need YYMMDD creates unintended edge cases, in the same way that using too many digits in scientific notation creates noise, and both can result in poor calibration with the uncertainty of the model.
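A concrete (made-up) example of the timestamp point:

    # Two events on the same day stop matching once the key carries
    # time-of-day digits the domain never needed.
    from datetime import datetime

    order_placed = datetime(2024, 3, 7, 9, 15, 2)
    invoice_issued = datetime(2024, 3, 7, 17, 40, 59)

    wide_a = order_placed.strftime("%y%m%d%H%M%S")    # "240307091502"
    wide_b = invoice_issued.strftime("%y%m%d%H%M%S")  # "240307174059"
    narrow_a = order_placed.strftime("%y%m%d")        # "240307"
    narrow_b = invoice_issued.strftime("%y%m%d")      # "240307"

    print(wide_a == wide_b)      # False -- a spurious distinction
    print(narrow_a == narrow_b)  # True -- matches the domain's real granularity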
If you ever get the time & mental bandwidth to do a second pass on your blog post, it could benefit from a worked example or two where you lay out sample code for the domain model & application logic, and sample code for a test suite, along with their description lengths (measured in some MDL-appropriate sense), then illustrate how various changes to the design that plausibly make it better or worse at accommodating future change influence the description length.
One way to define a test suite that would be very easy for readers to understand could be to rig up "table-driven tests" for some toy problem.
I'm not sure what a good toy problem would be: small enough that it's relatively easy for readers to understand, but large enough to show MDL being applied to a more general kind of programming problem rather than merely to a statistical modelling exercise. (One person's decision tree fit via supervised learning is another person's handwritten function containing a morass of if-else chains -- but we already expect MDL to give reasonable results when applied to decision trees; it's less obvious it gives good results when applied to more general programming problems.)
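Something like this is the shape I have in mind, with a purely hypothetical toy rule standing in for the real domain model:

    # The table *is* the proxy dataset of (situation, expected behaviour)
    # pairs; its size counts toward the MDL alongside the model's code.
    import pytest

    def discount(customer_type, order_total):
        # Hypothetical domain model under test.
        if customer_type == "vip":
            return 0.20
        if order_total >= 100:
            return 0.10
        return 0.0

    CASES = [
        ("vip",      50, 0.20),
        ("vip",     500, 0.20),
        ("regular",  99, 0.00),
        ("regular", 100, 0.10),
    ]

    @pytest.mark.parametrize("customer_type,order_total,expected", CASES)
    def test_discount(customer_type, order_total, expected):
        assert discount(customer_type, order_total) == expected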