So I'm hearing a lot of people running LLMs on Apple hardware. But is there actu...

anonzzzies · 2025-12-24T02:59:43 1766545183

I was sitting on an airplane next to a guy with some kind of MacBook Pro who was coding in Cursor with a local LLM. We got talking, and he said there are obviously differences, but for his style of 'English coding' (he basically describes what code to write and which files to change in English, sloppier than code, obviously, or he would just write the code) it works really well. And indeed, that's what he could demo. The model (gpt-oss, I believe) did pretty well in his Next.js project, and it was fast too.

sueders101 · 2025-12-24T02:50:49 1766544649

I've tried out gpt-oss:20b on a MacBook Air with 24GB of RAM (via Ollama). In my experience its output is comparable to what you'd get out of older models, and the OpenAI benchmarks seem accurate: https://openai.com/index/introducing-gpt-oss/ . Definitely a usable speed. Not instant, but ~5 tokens per second of output if I had to guess.
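
For anyone who would rather measure than guess, here is a minimal sketch of how the output rate could be computed. It assumes Ollama is serving on its default local port and relies on the `eval_count` and `eval_duration` (nanoseconds) fields that a non-streaming `/api/generate` response includes. The helper names `tokens_per_second` and `measure`, and the default prompt, are made up for illustration.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def tokens_per_second(stats: dict) -> float:
    """Compute output tokens per second from an Ollama response dict."""
    eval_count = stats["eval_count"]           # number of generated tokens
    eval_duration_ns = stats["eval_duration"]  # generation time in nanoseconds
    if eval_duration_ns <= 0:
        raise ValueError("eval_duration must be positive")
    return eval_count / (eval_duration_ns / 1e9)


def measure(model: str = "gpt-oss:20b",
            prompt: str = "Explain what a mutex is in two sentences.") -> float:
    """Send one non-streaming request and return the measured output rate."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return tokens_per_second(json.load(response))
```

With the model already pulled, `print(f"{measure():.1f} tokens/s")` would give a number to compare against the rough estimate above. The first call also includes model load time in the total, but that should not affect this figure, since only the generation phase is counted.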

seanmcdirmid · 2025-12-23T23:54:57 1766534097

I have an M3 Max MBP with 64GB of RAM, and I can run a lot at useful speeds (LLMs run fine; diffusion image models run OK, although not as fast as they would on a 3090). My laptop isn't typical, though: it isn't a standard MBP with a normal or Pro processor.

fhsm · 2025-12-23T22:43:24 1766529804

This paper shows a use case running on Apple silicon that’s theoretically valuable:

https://pmc.ncbi.nlm.nih.gov/articles/PMC12067846/

Who cares if the result is right or wrong, etc., as it will all be different in a year … it's just interesting to see a test of desktop-class hardware go OK.

jki275 · 2025-12-23T22:35:10 1766529310

I can definitely write code with a local model like Devstral Small, a quantized Granite, or a quantized DeepSeek on an M1 Max with 64GB of RAM.

DANmode · 2025-12-23T22:06:42 1766527602

Of course it depends what you’re doing.

Do you work offline often?

Then it's essential.