If you consider the entire input to the language model as the state, and the output to be that input concatenated with the next token, then it's a Markov chain.
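
In code, that reading looks roughly like the sketch below. next_token_distribution is a made-up stand-in for a real model's forward pass, not any actual API; the point is just that the state is the whole sequence and one transition appends one token.

    # Toy sketch of decoding viewed as a Markov chain. The state is
    # the full token sequence; one transition appends one token.
    import random

    def next_token_distribution(state):
        # Hypothetical stand-in: a real LLM would compute
        # P(next token | state) from a forward pass here.
        vocab = ["the", "cat", "sat", "<eos>"]
        weights = [0.4, 0.3, 0.2, 0.1]
        return vocab, weights

    def step(state):
        # Markov property: the next state depends only on the
        # current state (the sequence so far), nothing else.
        vocab, weights = next_token_distribution(state)
        token = random.choices(vocab, weights=weights)[0]
        return state + (token,)

    state = ("the",)
    while state[-1] != "<eos>" and len(state) < 20:
        state = step(state)
    print(state)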

But that's only true for greedy or sampled decoding, not for beam search, which some LLM setups use. With beam search at beam width 4, for example (a common default), the decoder does a tree search, keeping the 4 highest-probability partial outputs at each step. Then the process generating any single output is arguably no longer a Markov chain, it just uses one; though if you take the whole set of 4 beams together as the state, it becomes Markovian again.
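
A rough sketch of that, again assuming the same made-up next_token_distribution from above:

    # Minimal beam search sketch (beam width 4). A single hypothesis's
    # fate depends on its siblings, so one sequence is not a Markov
    # chain on its own; the set of all beams, taken as the state, is.
    import math
    from heapq import nlargest

    def beam_search(initial, next_token_distribution,
                    beam_width=4, max_len=20):
        beams = [(0.0, initial)]  # (cumulative log-prob, sequence)
        for _ in range(max_len):
            candidates = []
            for logp, seq in beams:
                if seq[-1] == "<eos>":
                    candidates.append((logp, seq))  # finished: carry over
                    continue
                vocab, probs = next_token_distribution(seq)
                for token, p in zip(vocab, probs):
                    candidates.append((logp + math.log(p), seq + (token,)))
            # Keep only the top-k partial outputs at each step.
            beams = nlargest(beam_width, candidates, key=lambda c: c[0])
            if all(seq[-1] == "<eos>" for _, seq in beams):
                break
        return max(beams, key=lambda c: c[0])[1]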


