
A generative operating system that directly predicts screen images based on mouse and keyboard inputs, powered by an RNN for state modeling and a diffusion model for image generation.

See my tweet for more details: https://x.com/yuntiandeng/status/1944802154314916331
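
For readers who want a feel for the loop described above (a recurrent state updated from mouse and keyboard events, then a frame produced by iterative refinement from noise), here is a minimal toy sketch. It is not the project's actual code: the weights are random rather than trained, `toy_denoise` is only a stand-in for a real diffusion sampler, and every name and size (`STATE_SIZE`, `DENOISE_STEPS`, `run_session`, the 4x3 frame) is a hypothetical chosen for illustration.

```python
import math
import random

STATE_SIZE = 8            # hypothetical hidden-state width
FRAME_W, FRAME_H = 4, 3   # tiny frame so the sketch runs instantly
DENOISE_STEPS = 5         # hypothetical number of refinement steps


def encode_input(mouse_x, mouse_y, clicked, key_code):
    """Pack one input event into a small feature vector with values in [0, 1]."""
    return [mouse_x, mouse_y, 1.0 if clicked else 0.0, key_code / 255.0]


def init_weights(rng):
    """Random (untrained) weights; a real system would learn these."""
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(STATE_SIZE)]
           for _ in range(STATE_SIZE)]
    w_x = [[rng.uniform(-0.5, 0.5) for _ in range(4)]
           for _ in range(STATE_SIZE)]
    return w_h, w_x


def rnn_step(state, features, weights):
    """One Elman-style update: new_state = tanh(W_h @ state + W_x @ features)."""
    w_h, w_x = weights
    new_state = []
    for i in range(STATE_SIZE):
        total = sum(w_h[i][j] * state[j] for j in range(STATE_SIZE))
        total += sum(w_x[i][k] * features[k] for k in range(len(features)))
        new_state.append(math.tanh(total))
    return new_state


def toy_denoise(state, rng):
    """Stand-in for a diffusion sampler: start from uniform noise and move
    every pixel halfway toward a state-dependent target on each step."""
    target = 0.5 + 0.5 * (sum(state) / STATE_SIZE)  # stays within [0, 1]
    frame = [[rng.random() for _ in range(FRAME_W)] for _ in range(FRAME_H)]
    for _ in range(DENOISE_STEPS):
        frame = [[p + 0.5 * (target - p) for p in row] for row in frame]
    return frame


def run_session(events, seed=0):
    """Feed input events one at a time and collect one frame per event."""
    rng = random.Random(seed)
    weights = init_weights(rng)
    state = [0.0] * STATE_SIZE
    frames = []
    for event in events:
        state = rnn_step(state, encode_input(*event), weights)
        frames.append(toy_denoise(state, rng))
    return frames


if __name__ == "__main__":
    # Each event is (mouse_x, mouse_y, clicked, key_code).
    demo = run_session([(0.1, 0.2, False, 0), (0.5, 0.5, True, 65)])
    print(len(demo), "frames of size", FRAME_W, "x", FRAME_H)
```

The point of the sketch is only the control flow: the state carries history across events, and each frame is generated conditioned on that state rather than read from any real window system.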



I like how most of your demo video is clicking through various Firefox and Google popups.


Pretty realistic, actually.



