
LayoutLM [1] is the closest thing I have seen to what you are asking for. It is applied to documents, but it essentially takes positional and visual information into account, alongside the text itself, when extracting text. For example, it can extract a total from the line that reads TOTAL. I think this would be the best place to start.
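
To make the "positional information" part concrete, here is a toy sketch in Python. It is not LayoutLM and not its API, just a hand-written rule over OCR-style word boxes that shows why coordinates help: the amount is picked because it sits on the same visual line as the TOTAL label, not because of where it falls in the text stream. The names Word, find_total and y_tol are hypothetical, and the coordinates are made up for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Word:
    """One OCR token with its bounding box (hypothetical structure)."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


# Matches simple amounts such as 18.00, $19.44 or 1,200.50.
AMOUNT_RE = re.compile(r"^\$?\d[\d,]*(\.\d+)?$")


def find_total(words, label="TOTAL", y_tol=5.0):
    """Return the amount nearest to the right of `label` on the same line, or None."""
    for anchor in words:
        if anchor.text.strip(":").upper() != label:
            continue
        anchor_mid = (anchor.y0 + anchor.y1) / 2
        candidates = [
            w for w in words
            if w is not anchor
            and abs((w.y0 + w.y1) / 2 - anchor_mid) <= y_tol  # same visual line
            and w.x0 >= anchor.x1                             # to the right of the label
            and AMOUNT_RE.match(w.text)
        ]
        if candidates:
            # Prefer the amount horizontally closest to the label.
            return min(candidates, key=lambda w: w.x0 - anchor.x1).text
    return None


# Made-up receipt fragment: two lines, each with a label and an amount.
words = [
    Word("Subtotal", 40, 300, 110, 312),
    Word("18.00", 400, 300, 440, 312),
    Word("TOTAL", 40, 330, 90, 342),
    Word("$19.44", 395, 330, 440, 342),
]

print(find_total(words))
```

A model like LayoutLM learns this kind of relationship from data instead of relying on a fixed rule, which is why it copes with layouts a heuristic like this would miss.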

1. https://arxiv.org/abs/2204.08387


