We're talking about a miraculous level of improvement for a SOTA LLM to run on a phone without crushing battery life this decade.

People are missing the forest for the trees here. Being the go-to consumer Gen AI is a trillion+ dollar business. However many tens of billions you waste on building unnecessary data centers is a rounding error. The important number is your odds of becoming that default provider in the minds of consumers.



It’s extremely easy to switch.

I used ChatGPT for everyday stuff, but in my experience its responses got worse and I had to wait much longer to get them. I switched to Gemini, and its answers were better and much faster.

I don’t have any loyalty to Gemini though. If it gets slow or another provider gives better answers, I’ll change. They all have the same UI and they all work the same (from a user’s perspective).

There is no moat for consumer genAI. And did I mention I’m not paying for any of it?

It’s like quick commerce: sure, it’s easy to get users by offering them something expensive subsidized by VC money. The second they raise prices or degrade the experience to make the service profitable, the users will leave for an alternative.


> The important number is your odds of becoming that default provider in the minds of consumers.

I haven't seen any evidence that any Gen AI provider will be able to build a moat that allows for this.

Some are better than others at certain things over certain time periods, but they are all relatively interchangeable for most practical uses and the small differences are becoming less pronounced, not more.

I use LLMs fairly frequently now, and I just bounce around between them to stay within their free tiers. Short of some actual large breakthrough, I never need to commit to one. I can take advantage of their own massive spending, wait it out a couple of years, and end up running a self-hosted local model, with a Cloudflare tunnel if I need to access it from my phone.
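For the curious: that last part is barely any code. A minimal sketch, assuming an Ollama-style local server exposing an OpenAI-compatible API behind a cloudflared tunnel; the hostname and model name are placeholders, not a recommendation:

  from openai import OpenAI

  # Point the standard client at your own box instead of a paid API.
  client = OpenAI(
      base_url="https://llm.example.com/v1",  # your cloudflared tunnel hostname
      api_key="unused",  # local servers typically ignore the key
  )

  reply = client.chat.completions.create(
      model="llama3",  # whatever model the local server has loaded
      messages=[{"role": "user", "content": "Summarize this article."}],
  )
  print(reply.choices[0].message.content)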

And yes, most people won't do that, but there will be a lot of opportunity for cheap providers to offer it as a service with some data center spend, though nowhere near the massive amounts OpenAI, Google, Meta, et al. are burning now.


The moat will be memory.

As a regular user, it becomes increasingly frustrating to have to remind each new chat “I’m working on this problem and here’s the relevant context”.

GenAI providers will solve this, and it will make the UX much, much smoother. Then they will make it very hard to export that memory/context.

If you’re using a free tier, I assume you’re not using reasoning models extensively, so you wouldn’t necessarily see how big a benefit this could be.


The moat is the chat history and the flywheel of user feedback improving the product.


Given how often smaller LLM companies train on the output of bigger LLM companies, it's not much of a moat.

LLMs complete text. Every query they answer is giving away the secret ingredient in the shape of tokens.
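For a sense of how cheap that harvesting is, here's a minimal sketch, assuming an OpenAI-compatible teacher endpoint; the model name, prompts, and file path are placeholders:

  import json
  from openai import OpenAI

  teacher = OpenAI()  # the big-company model being distilled
  prompts = ["Explain TCP slow start.", "What is a B-tree?"]

  # Collect (prompt, answer) pairs: exactly the tokens that
  # give away the secret ingredient.
  with open("distill.jsonl", "w") as f:
      for p in prompts:
          out = teacher.chat.completions.create(
              model="gpt-4o",  # placeholder teacher model
              messages=[{"role": "user", "content": p}],
          )
          f.write(json.dumps({"messages": [
              {"role": "user", "content": p},
              {"role": "assistant",
               "content": out.choices[0].message.content},
          ]}) + "\n")

  # distill.jsonl then feeds a standard supervised fine-tune
  # of a smaller model.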


They all offer some cross-chat "memory" now, and it's more annoying than helpful. Not really compelling. You can pretty easily export your chat history if you want.


Just build more tulip fields/railways/websites no matter the cost - the golden age is inevitable! ;-)


> The important number is your odds of becoming that default provider in the minds of consumers.

Markets that have a default provider are basically outliers (desktop OS, mobile OS, search, social networks, etc.).

All other industries don't have a single dominant supplier who is the default provider.

I am skeptical that this market is going to be one of the outliers.

All the other markets with a default provider basically rely on network effects to become and remain the default provider.

There is nothing here (in this market) that relies on network effects.


In fact, it's apparently $5.2 trillion by 2030 [0] (out of $6.7T in total data center spend, meaning all "traditional IT needs" account for less than a quarter of the total). That's the total if you add up all of the firms chasing this opportunity.

I do wonder: if you (and the commenter you replied to) think this is a good thing, will you be OK with a data center springing up in your neighbourhood, driving up water or power prices and emitting CO2? And if SOTA LLMs become efficient enough to run on a smartphone, will you be OK with a data center bailout coming from your tax dollars?

[0]: https://www.mckinsey.com/industries/technology-media-and-tel...


How big does an LLM need to be to support natural language queries with RAG?


My hot (maybe just warm these days) take is that the problem with voice assistants on phones is that they have to give reasonable responses to a long tail of queries, or users will learn not to use them, since the use cases aren’t discoverable and the primary value is talking to it like a person.

So voice assistants backed by very large LLMs over the network are going to win even if we solve the (substantial) battery usage issue.


Why even bother with the text generation then? You could just make a phone call to an LLM with a TTS frontend, like directory enquiries back in the day. It could be set up about as easily as a BBS if you have a home server rack, like the ones Jeff Geerling makes YouTube videos about.
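It really is close to BBS-level effort now. A minimal sketch, assuming Twilio Voice for the telephony and an OpenAI-compatible chat endpoint; route names and the model are illustrative, not a reference implementation:

  from flask import Flask, request
  from openai import OpenAI
  from twilio.twiml.voice_response import VoiceResponse, Gather

  app = Flask(__name__)
  llm = OpenAI()  # any OpenAI-compatible server works, including a local one

  @app.route("/voice", methods=["POST"])
  def voice():
      # Greet the caller and capture their question as speech.
      resp = VoiceResponse()
      gather = Gather(input="speech", action="/answer", method="POST")
      gather.say("Ask me anything after the tone.")
      resp.append(gather)
      return str(resp)

  @app.route("/answer", methods=["POST"])
  def answer():
      # Twilio posts its transcription of the caller as SpeechResult.
      question = request.form.get("SpeechResult", "")
      completion = llm.chat.completions.create(
          model="gpt-4o-mini",  # placeholder model name
          messages=[{"role": "user", "content": question}],
      )
      resp = VoiceResponse()
      resp.say(completion.choices[0].message.content)  # Twilio's built-in TTS
      resp.redirect("/voice")  # loop back for the next question
      return str(resp)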


1-800-CHATGPT exists; I use it often enough to keep it on speed dial on my flip phone.


> Being the go to consumer Gen AI is a trillion+ dollar business.

I use LLMs all day, and I highly doubt this. I’d love to hear your argument for how this plays out.



