
I've been playing with LLama 3.2 Vision 8B for such a use-case, and found it does a good job at providing image descriptions which could be indexed, along with transcription of any text in the image, such as the name on the grave in this case.

So it should be possible to have a similar capability locally now.
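A rough sketch of what that could look like, in case it helps. It assumes the model is served locally through Ollama's `/api/generate` endpoint under the tag `llama3.2-vision`. The comment above doesn't say which runtime was used, so treat the URL, the model tag, the prompt, and all function names here as illustrative assumptions. The index is a plain in-memory keyword index, not a real search engine.

```python
import base64
import json
import re
import urllib.request
from collections import defaultdict

# Assumptions, not from the original comment: a local Ollama server on its
# default port, and the model pulled under this tag.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "llama3.2-vision"
PROMPT = "Describe this image, then transcribe any visible text verbatim."


def build_request(image_bytes, model=MODEL, prompt=PROMPT):
    """Build the JSON payload for a single non-streaming generate call."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def describe_image(path):
    """Send one image to the local model and return its text description."""
    with open(path, "rb") as f:
        payload = build_request(f.read())
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(descriptions):
    """Map each word to the set of image paths whose description contains it."""
    index = defaultdict(set)
    for path, text in descriptions.items():
        for word in tokenize(text):
            index[word].add(path)
    return index


def search(index, query):
    """Return the image paths whose description contains every query word."""
    words = tokenize(query)
    if not words:
        return set()
    return set.intersection(*(index.get(w, set()) for w in words))
```

To use it, call `describe_image` on each photo, collect the results into a dict keyed by file path, pass that to `build_index`, and then query with something like `search(index, "smith")` to find the photo whose transcribed text contains that name.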



> LLama 3.2 Vision 8B

Brainfart, that should of course be LLama 3.2 Vision 11B. Keep mixing up the model sizes.



