There are three ways 1. make your own RLHF dataset - like OpenAI and Open Assist... | Hacker News


visarga on May 25, 2023 | on: How to Finetune GPT-Like Large Language Models on ...

There are three ways:

1. Make your own RLHF dataset, like OpenAI and Open Assistant.
2. Exfiltrate data from a bigger/better LLM, like Vicuna & family.
3. Use your pre-trained LLM to generate RLAIF data, with no leeching, like Constitutional AI, which is based on a set of rules instead of labelling examples.

cubefox on May 25, 2023

I wonder whether these approaches fit into the above categories:

https://arxiv.org/abs/2305.13735

https://arxiv.org/abs/2305.11206
