
> Policy tuning is one way to remove it, but there's some research on applying policies like "this text is a lie" during pretraining instead of trusting all of it equally.

Who's gonna vet all the training material?



The previous model does.

https://arxiv.org/abs/2309.00267
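
For concreteness, here's a rough Python sketch of the labeling step in that RLAIF setup: the previous model, not a human, picks the preferred response, and those pairs feed reward-model training for the next model. Everything here (the prompt template, the placeholder labeler that just prefers the longer candidate) is my own illustration so the sketch runs, not the paper's actual code:

    from typing import Tuple

    PREFERENCE_PROMPT = (
        "Given a prompt and two candidate responses, say which response "
        "is better, A or B.\n\n"
        "Prompt: {prompt}\nA: {a}\nB: {b}\nBetter response:"
    )

    def labeler_completion(prompt: str) -> str:
        # Stand-in for a call to the previous model. A real version
        # would hit an inference endpoint; this placeholder just
        # prefers the longer candidate so the example runs end to end.
        a_len = prompt.find("\nB: ") - prompt.find("\nA: ")
        b_len = prompt.find("\nBetter response:") - prompt.find("\nB: ")
        return "A" if a_len >= b_len else "B"

    def ai_preference(prompt: str, a: str, b: str) -> Tuple[str, str]:
        # Ask the labeler model which response it prefers; return
        # (chosen, rejected), usable as a preference pair for training
        # a reward model or for DPO-style fine-tuning.
        verdict = labeler_completion(
            PREFERENCE_PROMPT.format(prompt=prompt, a=a, b=b)
        ).strip()
        return (a, b) if verdict.upper().startswith("A") else (b, a)

    chosen, rejected = ai_preference(
        "Explain tides.",
        "The moon's gravity pulls on the ocean.",
        "Magic.",
    )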

That's during fine-tuning, though. I know I read one about applying it during pretraining; maybe this one?

https://arxiv.org/abs/2302.08582
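
If I'm remembering the right paper, the relevant idea is conditional training: rather than filtering the "lies" out of the corpus, each document gets a control token based on a scorer's judgment, the LM trains on the tagged text with the ordinary loss, and at inference you condition on the good token. A rough sketch (the token names, threshold, and scorer are my guesses at the setup, not the paper's exact values):

    GOOD, BAD = "<|good|>", "<|bad|>"

    def tag_document(text: str, score: float, threshold: float = 0.5) -> str:
        # Prepend a control token based on a per-document score from
        # some scorer (e.g. a classifier's probability that the text
        # is truthful). The bad data stays in the corpus; the model
        # just learns to tell the two apart.
        return (GOOD if score >= threshold else BAD) + text

    # Train with the ordinary LM loss on the tagged corpus; at
    # inference time, condition generation on GOOD.
    corpus = [("The moon orbits the earth.", 0.95),
              ("The moon is made of green cheese.", 0.05)]
    tagged = [tag_document(text, score) for text, score in corpus]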



