Honestly though, hundreds of billions of tokens per week really isn't that much. My tiny little profitable SaaS business that can't even support my family yet is doing 10-20 billion tokens per month on Gemini 2.5 Flash.
Looks like over the last month DeepSeek, Qwen and Z-AI alone did about 2.8 trillion tokens. By your metric that's the equivalent of about 187 tiny little profitable SaaS businesses, and that only counts traffic going through OpenRouter. To me that's very significant.
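For anyone checking the math, here's a quick sketch of where 187 comes from. It assumes the midpoint of the 10-20 billion range (15 billion tokens per month) as the per-business figure, which is my reading of the metric rather than anything stated; the function and variable names are made up for illustration.

```python
def equivalent_businesses(total_tokens: float, tokens_per_business: float) -> float:
    """How many businesses of a given monthly volume add up to total_tokens."""
    return total_tokens / tokens_per_business


# Figures taken from the comments above.
TOTAL_TOKENS = 2.8e12      # DeepSeek + Qwen + Z-AI over the last month, about 2.8 trillion
LOW, HIGH = 10e9, 20e9     # "10-20 billion tokens per month"

# Assumption: use the midpoint of the range as the per-business volume.
midpoint = (LOW + HIGH) / 2

print(round(equivalent_businesses(TOTAL_TOKENS, midpoint)))  # 187
print(round(equivalent_businesses(TOTAL_TOKENS, HIGH)))      # 140 (if each does 20B)
print(round(equivalent_businesses(TOTAL_TOKENS, LOW)))       # 280 (if each does 10B)
```

So depending on which end of the range you pick, it's somewhere between 140 and 280 such businesses.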
Also, congrats on the traction! Whether a business is profitable enough to support a family is 95% down to local cost of living and family size, so not sure about that one, but if you're doing that many tokens you've clearly got a good number of active users. We're at a similar point but only at 100-200 million tokens per month. Ours is strictly a B2C app though, which might explain it; those tend to be less token heavy.
2.5 Flash is still fantastic, especially if you're really input heavy. We use it too for many things, but we've found several open-weights models to have better price/quality for certain tasks. It's nice that 2.5 Flash is fast, but speed matters most for longer outputs, and for those Flash is relatively expensive. DeepSeek v3.1 is cheaper all around, for one example.