
It kind of is expected, right? If a 70B model can have great overall performance, a 1B model focused on coding and a single language could even be comparable.

I am actually hoping we see more per-language models soon, though obviously a model can't be as "smart" if trained only on a single language.


