Huh, really? That’s the exact opposite of my experience. I find gpt-5-high to be by far the most accurate of the models at following instructions over a longer period of time. It’s also much less prone to losing focus as context size increases.
Are you using the -codex variants or the normal ones?