The point is the opportunity created by Trump's tariff policy. If you say "do what I want or I'll burn your house down" and then burn the house down anyway, the other party no longer needs to do what you demanded. An opportunity has appeared.
There are two things happening here: a really small LLM-like mechanism that is useful for thinking about how the big ones work, and a reference to the well-known phenomenon, commonly and dismissively referred to as a "trick", in which humans want to believe. We work hard to make sense of what our conversational partner says. Language in use is a collective cultural construct. On this view the real question is how and why we humans understand an utterance in a particular way. Eliza, Parry, and the Chomsky bot at http://chomskybot.com work on this principle. Just sayin'.
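For the curious, here is a minimal sketch of that Eliza-style mechanism in Python. The patterns below are illustrative inventions, not Weizenbaum's actual DOCTOR script; the point is just how little machinery is needed once the human does the work of wanting to believe.

    import random
    import re

    # Eliza-style reflection rules: match a fragment of the user's input
    # and turn it back as a question. Patterns are illustrative only.
    RULES = [
        (re.compile(r"\bI feel (.+)", re.I),
         ["Why do you feel {0}?", "How long have you felt {0}?"]),
        (re.compile(r"\bI am (.+)", re.I),
         ["Why do you say you are {0}?"]),
        (re.compile(r"\bmy (.+)", re.I),
         ["Tell me more about your {0}."]),
    ]
    FALLBACKS = ["Please go on.", "I see.", "What does that suggest to you?"]

    def respond(utterance: str) -> str:
        for pattern, templates in RULES:
            m = pattern.search(utterance)
            if m:
                return random.choice(templates).format(m.group(1).rstrip(".!?"))
        # No rule matched: fall back to a content-free continuer.
        return random.choice(FALLBACKS)

    print(respond("I feel nobody listens to me"))
    # -> e.g. "Why do you feel nobody listens to me?"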
Fair. The background reading is the EMCA literature: conversation analysis (cf. Sacks et al.) and ethnomethodology (Garfinkel). And Vygotsky (cf. Kozulin). People such as Robert Moore at IBM and Lemon at Heriot-Watt work in this space, but there is no critical mass in the face of LLM mania.
Programming languages were originally designed by mathematicians, based on the Turing machine model. A modern language for FPGAs is a challenge for theoretical computer science, and we should keep merely computer-literate researchers away from it. This is a call-out to hard-core maths heads to think about how we should reason about parallelism and what FPGA hardware can do.
The real reason it won't end up in a park is not the engineering. The problem is the same one as NPCs in computer games: synthetic characters are, to date, just really transparent and boring. The real research question is why.
I guess that's why most computer games don't have NPCs... Oh wait, there are entire computer games built around interacting with synthetic NPCs.
There are, of course, limitations to synthetic characters. Even with those limitations there are plenty of entertaining experiences to be crafted.
The real challenges are around maintaining and safely operating autonomous robots around children in a way that isn't too expensive. Those constraints impose far more limits than the ones synthetic characters face in video games.
Most people aren't paying hundreds or thousands of dollars to interact with NPCs in video games. If they were, they'd probably expect a lot more and get bored of it quicker.
> The real challenges are around maintaining and safely operating autonomous robots around children in a way that isn't too expensive.
This is one of the challenges, but only one. The one the GP outlined is still very much real: see the Defunctland video on Living Characters for some older examples, and for a recent one, the D3-O9 droid from Galactic Starcruiser.
There is an argument, perhaps no longer PC, that the Indigenous population used fire to hunt, and so burnt off regularly. Fires these days are indeed devastating because we try to stop them. Established eucalyptus trees also thrive after a scrub fire; it is a "devastating" fire that kills them.
Just want to reassure you that this is not at all "no longer PC". If anything, the practice was banned by the colonisers, only for it to be reintroduced more recently.
I caved in South Australia many years ago, and what is not obvious is that South Australian caves are predominantly horizontal and dry. I now live in the Peak District in the UK, where caving is a muddy, wet affair with too many ropes.
So, are current LLMs better because artificial neural networks are better predictors than Markov models, or because of the scale of the training data? Just putting it out there...
Markov models usually only predict the next token from the two preceding tokens (a trigram model), because beyond that the data becomes so sparse that reliable probability estimates are impossible (despite back-off, smoothing, etc.).
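A toy illustration of why: a bare-bones trigram model over raw counts, with no back-off or smoothing. Any two-token context not seen verbatim in training yields no estimate at all, and the number of possible contexts grows as vocabulary^2.

    from collections import Counter, defaultdict

    # Minimal trigram model: P(next | prev2, prev1) from raw counts.
    def train_trigrams(tokens):
        counts = defaultdict(Counter)
        for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
            counts[(a, b)][c] += 1
        return counts

    def predict(counts, a, b):
        dist = counts.get((a, b))
        if not dist:
            return None  # unseen context: no estimate without back-off/smoothing
        total = sum(dist.values())
        return {token: n / total for token, n in dist.items()}

    tokens = "the cat sat on the mat and the cat slept".split()
    counts = train_trigrams(tokens)
    print(predict(counts, "the", "cat"))  # {'sat': 0.5, 'slept': 0.5}
    print(predict(counts, "the", "dog"))  # None: context never observed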
I recommend reading Bengio et al.'s 2003 paper, which describes this issue in more detail and introduces distributional representations (embeddings) in a neural language model to avoid this sparsity.
While we now use transformers and subword tokens (e.g., SentencePiece), that paper aptly describes the motivation underpinning modern models.
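The paper's move, in miniature (a toy numpy sketch with random, untrained weights, just to show the shapes): because words live in a shared dense embedding space, even a context never seen in training still produces a full next-token distribution, instead of the hard failure of the count-based model above.

    import numpy as np

    # Bengio-style idea: map each word to a dense vector, compute the
    # next-token distribution from the concatenated context vectors.
    # Toy sizes; E and W would be learned in practice.
    rng = np.random.default_rng(0)
    vocab = ["the", "cat", "dog", "sat", "slept", "mat"]
    V, d, context = len(vocab), 8, 2

    E = rng.normal(size=(V, d))            # embedding table
    W = rng.normal(size=(context * d, V))  # output projection

    def next_token_probs(w1: str, w2: str) -> dict:
        x = np.concatenate([E[vocab.index(w1)], E[vocab.index(w2)]])
        logits = x @ W
        p = np.exp(logits - logits.max())  # softmax over the vocabulary
        p /= p.sum()
        return dict(zip(vocab, p.round(3)))

    # Even an unseen context still yields a full distribution:
    print(next_token_probs("the", "dog"))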
> Markov models usually only predict the next token from the two preceding tokens (a trigram model), because beyond that the data becomes so sparse
Of course; that's because a Markov chain conditions on a single sequence of preceding tokens (one dimension, with a chain length along that dimension), while LLMs and NNs use multiple dimensions (they are meshed, not chained).
I really want to know what the result would look like with a few more dimensions, resulting in a Markov-mesh-type structure rather than a chain structure.
Thanks for the reference; I stand corrected. And yes, I had looked at it a long time ago and will give it another read. But I think it is saying that neural networks are a means of approximating a statistical property of a collection of text. That property is what we today think of as "completion"? That is, glorified autocomplete, and not "distributed representations" of the world. Would you agree?
This is the problem. I am arguing there are no distributed representations (cf. Harnad's original symbol grounding problem paper, Hinton, and others). There are "distributional representations" by definition (cf. the Wikipedia entry).