Your linked blog post "Abstraction and the Bayesian Occam's Razor" is very interesting. I'll play back my understanding to you, to see if I'm approximately following and summarising your thesis.
Context:
When programming we attempt to design an effective abstraction that models some domain. When designing this abstraction, there are trade-offs between reducing the amount of code required, enabling reuse, reducing coupling, and retaining the flexibility to accommodate future use cases.
Key Problem:
How do we design an abstraction for our domain model?
Claim 1:
Apply the "Minimum Description Length" (MDL) model selection principle: prefer a domain model embedded in the shortest program able to recreate the dataset of domain knowledge.
Applying MDL model selection will result in an abstraction for the domain model that is both smaller -- giving less code to maintain -- and more likely to generalize to future unknown use cases.
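To check my own understanding of Claim 1, here is a tiny sketch of mine (not from your post) where raw source bytes stand in, very crudely, for description length:

    # Two "programs" that both recreate the same (situation, expected behaviour)
    # data; the shorter one is preferred and, per the claim, should also
    # generalize better. The data and rules are made up purely for illustration.
    import inspect

    DATA = {1: 5, 2: 10, 3: 15, 4: 20, 5: 26}  # e.g. weight_kg -> shipping cost

    def model_lookup(weight_kg):
        # Memorise every observed case verbatim.
        return {1: 5, 2: 10, 3: 15, 4: 20, 5: 26}[weight_kg]

    def model_rule(weight_kg):
        # A general rule plus one recorded exception.
        return 26 if weight_kg == 5 else 5 * weight_kg

    for model in (model_lookup, model_rule):
        assert all(model(k) == v for k, v in DATA.items())
        print(model.__name__, len(inspect.getsource(model)), "source bytes")

Byte counts of Python source obviously aren't a principled code length, but they're enough to see which way the comparison points.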
Complication:
Applying the MDL model selection principle relies on having access to a dataset of domain knowledge. We can think of this dataset of domain knowledge as a list of (situation, expected behaviour) pairs -- cf. a labelled supervised learning dataset, or a gigantic list of requirements. Unfortunately, in typical software projects, no such explicit dataset cataloguing the requirements or expected behaviour in each situation exists.
Claim 2:
We can use the automated test suite as a proxy for the dataset of domain knowledge. When designing our abstraction we should prefer a domain model where the combined size of the logic for the domain model and the size of the corresponding test suite* is minimal.
* with the important caveat that "just cutting out the tests, or removing other safeties like strict types doesn't give you a lower MDL, in that case, you're missing the descriptions of important parts of your data or knowledge".
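And a rough sketch of how I picture Claim 2's measurement (the gzip proxy and the file names are my own invention, not something you proposed):

    # Crude two-part description length: domain model source plus test suite
    # source, compressed together. Per your caveat, simply deleting the tests
    # lowers the number, but only by discarding part of the domain knowledge.
    import gzip
    from pathlib import Path

    def description_length(*paths):
        blob = b"".join(Path(p).read_bytes() for p in paths)
        return len(gzip.compress(blob))

    # Hypothetical comparison between two candidate designs of the same model:
    # description_length("pricing_v1.py", "test_pricing_v1.py")
    # description_length("pricing_v2.py", "test_pricing_v2.py")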
Thanks for engaging with my wild ideas. I do think that's a good description, yes. There are corollaries like the Dependency Length Minimization I mentioned, which I think helps lower the MDL. Putting related things together allows using smaller symbols to describe things because more can be inferred from the adjacent context.
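A made-up Python sketch of what I mean by that adjacency point:

    # When the data and the rule that uses it sit next to each other, short
    # names are unambiguous; pull them apart and you need longer names (or
    # comments) to re-establish the context, which grows the description.
    def invoice_total(line_items, tax_rate):
        total = 0.0
        for item in line_items:        # "item" is clear from the loop right here
            total += item["net"] * (1 + tax_rate)
        return total

    print(invoice_total([{"net": 100.0}, {"net": 50.0}], 0.2))  # 180.0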
In my mind this is also related to concepts like scientific notation. Minimum description length is not just about minimizing the number of operations, the number of branches, or the degrees of freedom of your program; taking variable width into account also makes a difference. Using a timestamp YYMMDDHHMMSS when you only need YYMMDD creates unintended edge cases, in the same way that using too many digits in scientific notation creates noise, and both can result in poor calibration with the uncertainty of the model.
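A concrete (made-up) example of the timestamp point:

    # Two events on the same day stop matching once the key carries
    # time-of-day digits the domain never needed.
    from datetime import datetime

    order_placed = datetime(2024, 3, 7, 9, 15, 2)
    invoice_issued = datetime(2024, 3, 7, 17, 40, 59)

    wide_a = order_placed.strftime("%y%m%d%H%M%S")    # "240307091502"
    wide_b = invoice_issued.strftime("%y%m%d%H%M%S")  # "240307174059"
    narrow_a = order_placed.strftime("%y%m%d")        # "240307"
    narrow_b = invoice_issued.strftime("%y%m%d")      # "240307"

    print(wide_a == wide_b)      # False -- a spurious distinction
    print(narrow_a == narrow_b)  # True -- matches the domain's real granularity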
If you ever get the time & mental bandwidth to do a second pass on your blog post, it could benefit from a worked example or two where you lay out sample code for the domain model & application logic, and sample code for a test suite, along with their description lengths (measured in some MDL-appropriate sense), then illustrate how various changes to the design that plausibly make it better or worse at accommodating future change influence the description length.
One way to define a test suite that would be very easy for readers to understand could be to rig up "table-driven tests" for some toy problem.
I'm not sure what a good toy problem would be: small enough that it's relatively easy for readers to understand, but large enough to show MDL being applied to a more general kind of programming problem rather than merely to a statistical modelling exercise. (One person's decision tree fit via supervised learning is another person's handwritten function containing a morass of if-else chains -- but we already expect MDL to give reasonable results when applied to decision trees; it's less obvious it gives good results when applied to more general programming problems.)
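Something like this is the shape I have in mind, with a purely hypothetical toy rule standing in for the real domain model:

    # The table *is* the proxy dataset of (situation, expected behaviour)
    # pairs; its size counts toward the MDL alongside the model's code.
    import pytest

    def discount(customer_type, order_total):
        # Hypothetical domain model under test.
        if customer_type == "vip":
            return 0.20
        if order_total >= 100:
            return 0.10
        return 0.0

    CASES = [
        ("vip",      50, 0.20),
        ("vip",     500, 0.20),
        ("regular",  99, 0.00),
        ("regular", 100, 0.10),
    ]

    @pytest.mark.parametrize("customer_type,order_total,expected", CASES)
    def test_discount(customer_type, order_total, expected):
        assert discount(customer_type, order_total) == expected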