I've been playing with Llama 3.2 Vision 11B for this kind of use case, and found it does a good job of producing image descriptions that could be indexed, along with transcribing any text in the image, such as the name on the grave in this case.
So it should be possible to have a similar capability locally now.
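
For anyone who wants to try it, here is a rough sketch of one way to do this. It assumes the model is served locally through Ollama on its default port and pulled under the tag `llama3.2-vision`; the prompt wording and the function names `build_payload` and `describe_image` are made up for illustration, so adjust them to your setup.

```python
import base64
import json
import urllib.request

# Assumption: a local Ollama server on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumption: the tag the vision model was pulled under.
MODEL = "llama3.2-vision"

# Illustrative prompt: ask for an indexable description plus a transcription.
PROMPT = (
    "Describe this image in two or three sentences for a search index. "
    "Then transcribe any visible text exactly as written."
)


def build_payload(image_bytes: bytes, prompt: str = PROMPT, model: str = MODEL) -> dict:
    """Build the JSON body for a non-streaming generate request with one image."""
    return {
        "model": model,
        "prompt": prompt,
        # Images are passed as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def describe_image(path: str) -> str:
    """Send one image to the local model and return its text answer."""
    with open(path, "rb") as f:
        payload = build_payload(f.read())
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # With streaming off, the reply is one JSON object with a "response" field.
        return json.loads(resp.read())["response"]
```

Calling `describe_image("grave.jpg")` (a hypothetical file name) would return the description and transcription as one string, which you could then feed into whatever search index you use.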