IMO LLMs more or less literally cannot do what they do without a world model, not least because much of what language is amounts to a protocol for making assertions about that model, testing the degree to which it is shared, and seeking to alter the model one carries of one's interlocutor's model.
To the "parrot people" I suggest: there is no more optimal mechanism for the inner layers of a network to converge toward than one that most parsimoniously models the world, so as to correctly emit tokens that reflect it.