How do APIs typically manage to actually "use" the "bar" of your example, such as storing it somewhere, without enforcing some kind of constraints?
Depending on exactly what you mean, this isn't correct. This syntax is the same as <T: BarTrait>, and you can store that T in any other generic struct that's parametrized by BarTrait, for example.
> you can store that T in any other generic struct that's parametrized by BarTrait, for example
Not really. You can only store it in a struct that is instantiated with the same concrete type as the value you received. If you get a pre-built struct from somewhere and try to store it there, your code won't compile.
I'm addressing the intent of the original question.
No one would ask this question in the case where the struct is generic over a type parameter bounded by the trait, since such a design can only store a homogeneous collection of values of a single concrete type implementing the trait; the question doesn't even make sense in that situation.
The question only arises for a struct that must store a heterogeneous collection of values with different concrete types implementing the trait, in which case a trait object (dyn Trait) is required.
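To make the contrast concrete, here is a minimal Rust sketch (the trait and struct names are invented for illustration):

    trait BarTrait {
        fn bar(&self) -> u32;
    }

    // Generic: every Holder<T> instance stores values of exactly ONE concrete type T.
    struct Holder<T: BarTrait> {
        items: Vec<T>,
    }

    fn store<T: BarTrait>(holder: &mut Holder<T>, value: T) {
        holder.items.push(value); // fine: value has the type this holder was built for
    }

    // Trait object: a single collection can mix any concrete types implementing BarTrait.
    struct DynHolder {
        items: Vec<Box<dyn BarTrait>>,
    }

The original question only bites in the second case, where the concrete types are erased behind dyn BarTrait.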
An ecosystem is being built around AI: best prompting practices, MCPs, skills, IDE integration, how to build a feedback loop so the LLM can test its output on its own, plugging into the outside world with browser extensions, etc.
For now I think people can still catch up quickly, but by the end of 2026 it's probably going to be a different story.
> best prompting practices, MCPs, skills, IDE integration, how to build a feedback loop so the LLM can test its output on its own, plugging into the outside world with browser extensions, etc.
Ah yes, an ecosystem that is inherently built on probabilistic quicksand, where even with the "best prompting practices" you still get agents violating the basics of security and committing API keys when they were told not to. [0]
For example, what if your code (that the LLM hasn't reviewed yet) has a dumb feature where it dumps environment variables to log output, and the LLM runs "./server --log debug-issue-144.log" and commits that log file as part of a larger piece of work you asked it to perform?
If you don't want a bad thing to happen, adding a deterministic check that prevents the bad thing from happening is a better strategy than prompting models or hoping that they'll get "smarter" in the future.
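As a concrete (if simplistic) example of such a check, here is a sketch of a pre-commit hook in Rust that scans staged changes for credential-looking strings; the patterns are made up for illustration, and real scanners like gitleaks are far more thorough:

    use std::process::Command;

    // Pre-commit check: abort the commit if any staged line looks like a credential.
    fn main() {
        let out = Command::new("git")
            .args(["diff", "--cached", "--unified=0"])
            .output()
            .expect("failed to run git");
        let diff = String::from_utf8_lossy(&out.stdout);
        // Illustrative patterns only, not an exhaustive list.
        let suspicious = ["AKIA", "BEGIN PRIVATE KEY", "api_key=", "sk-"];
        for line in diff.lines().filter(|l| l.starts_with('+')) {
            if suspicious.iter().any(|p| line.contains(*p)) {
                eprintln!("possible secret in staged changes: {line}");
                std::process::exit(1); // non-zero exit blocks the commit
            }
        }
    }

The point is that this check fails the same way every time, regardless of how the model was prompted.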
Part of why these things feel "not fit for purpose" is that they don't include the things Simon has spent three years learning? (I know someone else who's doing multi-LLM development where he uses job-specialty descriptions for each "team member" that lets them spend context on different aspects of the problem; it's a fascinating exercise to watch, but it feels even more like "if this is how the tools should be used, why don't they just work that way"?)
I have tons of examples of AI not committing secrets. this is one screenshot from twitter? I don’t think it makes your point
CPUs are billions of transistors. sometimes one fails and things still work. “probabilistic quicksand” isn’t the dig you think it is to people who know how this stuff works
> I have tons of examples of AI not committing secrets.
"Trust only me bro".
It takes 10 seconds to see the many examples of API keys + prompts on GitHub to verify that tweet. The issue with AI isn't limited to that tweet, which demonstrates its probabilistic nature; otherwise, why do we need a sandbox to run the agent in the first place?
Never mind, we know why: many [0] such [1] cases [2]
> CPUs are billions of transistors. sometimes one fails and things still work. “probabilistic quicksand” isn’t the dig you think it is to people who know how this stuff works
Except you just made a false equivalence. CPUs can be tested and verified transparently, and even when something does go wrong, we know exactly why. Whereas you can't explain why the LLM hallucinated or decided to delete your home folder, because the way it predicts its output is fundamentally stochastic.
you could find tons of API keys on GitHub before these “agentic” tools too. that was my point, one screenshot from twitter vs one anecdote from me. I don’t think either proves the point, but posting a screenshot from twitter like it’s proof of some widespread problem is what I was responding to (N=2, 1 vs 1)
my point is more “skill issue” than “trust me this never happens”
my point on CPUs is people who don’t understand LLMs talk like “hallucinations” are a real thing — LLMs are “deciding” to make stuff up rather than just predicting the next token. yes it’s probabilistic, so is practically everything else at scale. yet it works and here we are. can you really explain in detail how everything you use works? I’m guessing I can explain failure modes of agentic systems (and how to avoid them so you don’t look silly on twitter/github) and how neural networks work better than most people can explain the technology they use every day
> you could find tons of API keys on GitHub before these “agentic” tools too. that was my point, one screenshot from twitter vs one anecdote from me. I don’t think either proves the point, but posting a screenshot from twitter like it’s proof of some widespread problem is what I was responding to (N=2, 1 vs 1)
That doesn't refute the point that LLMs remain probabilistic despite best prompting practices. In fact, it emphasises it. It's more like your one anecdotal example vs. my 20+ examples on GitHub.
My point is that not only does it indeed happen, but an old issue has been made even worse and more widespread, since we now have vibe coders without security best practices assuming the agent knows better (when it doesn't).
> my point is more “skill issue” than “trust me this never happens”
So those with this "skill issue" are just the ones prompting the AI differently, then? Either way, this inadvertently proves my whole point.
> yes it’s probabilistic, so is practically everything else at scale. yet it works and here we are.
The additional problem is: can you explain why it went wrong as you scale the technology? CPU circuit designs go through formal verification, and if a fault happens, we know exactly why; the design is deterministic, which makes them reliable.
LLMs are not, and don't have this. That is why OpenAI had to describe ChatGPT's misaligned behaviour as "sycophancy" but could not explain why it happened, other than pointing to the hyper-parameter tweaks that produced that result.
LLMs being fundamentally probabilistic, and hence harder to explain, is why you get screenshots of vibe coders who somehow prompted it wrong and had the agent commit the keys.
Maybe that would never happen to you, but it won't be the last time we see this on GitHub.
I was pointing out one screenshot from twitter isn’t proof of anything just to be clear; it’s a silly way to make a point.
yes AI makes leaking keys on GH more prevalent, but so what? it’s the same problem as before with roughly the same solution
I’m saying neural networks being probabilistic doesn’t matter — everything is probabilistic. you can still practically use the tools to great effect, just like we use everything else that has underlying probabilities
OpenAI did not have to describe it as sycophancy, they chose to, and I’d contend it was a stupid choice
and yes, you can explain what went wrong just like you can with CPUs. we don’t (usually) talk about quantum-level physics when discussing CPUs; talking about neurons in LLMs is the wrong level of abstraction
> I was pointing out one screenshot from twitter isn’t proof of anything just to be clear; it’s a silly way to make a point.
Versus your anecdote being proof of what? A skill issue for vibe coders? Someone else prompting it wrong?
You do realize you are proving my entire point?
> yes AI makes leaking keys on GH more prevalent, but so what? it’s the same problem as before with roughly the same solution
Again, that reinforces my point: it makes the existing issue even worse. Additionally, that wasn't even the only point I made on the subject.
> I’m saying neural networks being probabilistic doesn’t matter — everything is probabilistic.
When you scale neural networks into, say, production-grade LLMs, then it does matter, just as it matters for CPUs to be reliable when you scale them in production-grade data centers.
But your earlier (fallacious) comparison ignores the reliability differences between CPUs and LLMs. Determinism is a hard requirement for that kind of reliability, and LLMs don't have it.
> OpenAI did not have to describe it as sycophancy, they chose to, and I’d contend it was a stupid choice
For the press, they had to; but no one knows the real reason, because it is unexplainable, which goes back to my other point on reliability.
> and yes, you can explain what went wrong just like you can with CPUs. we don’t (usually) talk about quantum-level physics when discussing CPUs; talking about neurons in LLMs is the wrong level of abstraction
It is indeed the wrong level for LLMs, because not even the researchers can practically explain why a given neuron (and this holds for every neuron in the network) takes different values on every fine-tune or training run. Even if the model is "good enough", it can still go wrong at inference time for unexplainable reasons beyond "it overfitted".
CPUs, on the other hand, have formal verification methods that check the design conforms to its specification, so we can trust that they work as intended and can diagnose problems accurately without going into atomic-level detail.
No one is arguing that it isn't useful. The problem is this:
> I’m saying it doesn’t matter it’s probabilistic, everything is,
Maybe it doesn't matter for you, but it generally does matter.
The risk of a technology failing is far higher if its behaviour is random and unexplainable than if it is predictable, verified and explainable. The former rules out many serious use cases.
This is why your CPU or GPU works.
LLMs are not deterministic, no formal verification exists for them, and they are fundamentally black boxes.
That is why so many vibe coders have reported "the AI deleted my entire home folder" issues even when they only told it to move a file or folder to another location.
If it did not matter, why do you need sandboxes for the agents in the first place?
I think we agree then? the tech is useful; you need systems around them (like sandboxes and commit hooks that prevent leaking secrets) to use them effectively (along with learned skills)
very little software (or hardware) used in production is formally verified. tons of non-deterministic software (including neural networks) are operating in production just fine, including in heavily regulated sectors (banking, health care)
> I think we agree then? the tech is useful; you need systems around them (like sandboxes and commit hooks that prevent leaking secrets) to use them effectively (along with learned skills)
No.
> very little software (or hardware) used in production is formally verified. tons of non-deterministic software (including neural networks) are operating in production just fine, including in heavily regulated sectors (banking, health care)
The issue is what happens when it all goes wrong.
You have to explain exactly why a system failed in heavily regulated sectors.
Saying "everything is probabilistic" as the cause of an issue is a non-answer if you are a chip designer, air traffic controller, investment banker or medical doctor.
that’s not what I said. you honestly seem like you just want to argue about stuff (e.g. not elaborating on the “no” when I basically repeated and agreed with what you said). and you seem to consistently miss my point (in the second part of your response; I’m saying these non-deterministic neural networks are already widespread in industry with these regulations, and it’s fine. they can be explained despite your repeated assertions they cannot be. also the entire point on CPUs which you may have noticed I dropped from my responses because you seemed distracted arguing about it). this is not productive and we’re both clearly stubborn, glhf
> that’s not what I said. you honestly seem like you just want to argue about stuff (e.g. not elaborating on the “no” when I basically repeated and agreed with what you said). and you seem to consistently miss my point
I have repeated myself many times, and you continue to ignore the reliability problems that inherently impede LLMs in many use cases and exclude them from areas where predictability in critical production systems is required.
Vibe coders can use them, but the gulf between "useful for prototyping" and "useful for production" is riddled with hard obstacles, because software like LLMs is fundamentally unpredictable and hence the risks are far greater.
> I’m saying these non-deterministic neural networks are already widespread in industry with these regulations, and it’s fine.
So when a neural network scales to hundreds of layers and billions of parameters, i.e. a production-grade LLM, explain exactly how a black box at that scale can be explained when it messes up and goes wrong?
> they can be explained despite your repeated assertions they cannot be.
With what methods exactly?
Early on, I pointed to formal verification and testing as the way CPUs are explained when they go wrong at scale. You have provided nothing equivalent for LLMs to back your own assertions, other than "they can be explained", without any evidence.
> also the entire point on CPUs which you may have noticed I dropped from my responses because you seemed distracted arguing about it). this is not productive and we’re both clearly stubborn, glhf
You did not make any point with that, as it was a false equivalence, and I explained why the reliability of a CPU isn't the same as the reliability of an LLM.
Isn't "unambiguous representation" impossible in practice anyway? Any representation is relative to a formal system.
I can define sqrt(5) in a hard-coded table in a maths program using a few bytes, along with all the rules for manipulating it so as to end up with correct results.
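For instance, a toy Rust sketch of the idea, representing numbers of the form a + b*sqrt(5) exactly as an integer pair (a real computer algebra system generalizes this):

    // A value a + b*sqrt(5), stored exactly as two integers; sqrt(5) itself is (0, 1).
    #[derive(Clone, Copy, Debug, PartialEq)]
    struct Root5 {
        a: i64,
        b: i64,
    }

    impl Root5 {
        // (a + b*sqrt(5)) * (c + d*sqrt(5)) = (ac + 5bd) + (ad + bc)*sqrt(5)
        fn mul(self, other: Root5) -> Root5 {
            Root5 {
                a: self.a * other.a + 5 * self.b * other.b,
                b: self.a * other.b + self.b * other.a,
            }
        }
    }

    fn main() {
        let sqrt5 = Root5 { a: 0, b: 1 };
        assert_eq!(sqrt5.mul(sqrt5), Root5 { a: 5, b: 0 }); // sqrt(5)^2 == 5, exactly
    }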
Well yeah but if we’re being pedantic anyway then “render these bits in UTF-8 in a standard font and ask a human what number it makes them think of” is about as far from an unambiguous numerical representation as you could get.
Of course if you know that you want the square root of five a priori then you can store it in zero bits in the representation where everything represents the square root of five. Bits in memory always represent a choice from some fixed set of possibilities and are meaningless on their own. The only thing that’s unrepresentable is a choice from infinitely many possibilities, for obvious reasons, though of course the bounds of the physical universe will get you much sooner.
Only if you've added a signing certificate the VPN controls to your CA chain. But at that point they don't have to do anything as complicated as you described.
TLS means “there’s a certificate”. Yeah, if a VPN/proxy can forge a certificate that the user’s browser would trust, it’s an issue.
But considering those are browser extensions, I think they can just inspect any traffic they want on the client side (if they can get such broad permissions approved, which is probably not too hard).
As I know really nothing about the subject, could someone explain why the parent was downvoted? Is it for the tone, or the content? Because I, having watched the youtubers in question, had the same opinion about string theory.
The word 'falsifiable' comes from Popper's criterion, which is central to scientific methodology. What it means: if a theory predicts something, and later observations show that prediction doesn't hold, then the theory is incorrect.
String theory doesn't work this way: whatever is measured will be explained after the fact by tuning free parameters.
Do you mean predictions that have been falsified? Of course no standing theory delivers falsified predictions; when that happens you throw the theory in the garbage.
Do you mean predictions that can be falsified in principle? In that case string theory has falsifiable predictions; I gave you one. In principle, we can run experiments that would falsify special relativity. In fact, we've run such experiments in the past and they have never seen special relativity violated. The tests of special relativity are among the most precise tests in science.
I suspect what they mean is that there is no outcome of an experiment such that, prior to the experiment, people computed that string theory says that the experiment should have such a result, but our other theories in best standing would say something else would happen, and then upon doing the experiment, it was found that things happened the way string theory said (as far as measurements can tell).
But there are such experiments. String theory says that the result of such an experiment is: Lorentz invariance is not violated.
> but our other theories
This is not how scientific research is done. The way you do it is: you have a theory, the theory makes predictions, you run experiments, and if the predictions fail, you reject that theory. The fact that you might have other theories saying other things doesn't matter for that theory.
So string theory said "Lorentz invariance is not violated", we ran the experiments, and the prediction wasn't wrong, so you don't reject the theory. The logic is not unlike that of p-testing. You don't prove a theory correct if the experiments agree with it. Instead, you prove it false if the experiments disagree with it.
There are no such experimental results satisfying the criteria I laid out. You may be right in objecting to the criteria I laid out, but, the fact remains that it does not satisfy these (perhaps misguided) criteria.
In particular, predicting something different from our best other theories in good standing, was one of the criteria I listed.
And, I think it’s pretty clear that the criteria I described, whether good or not, were basically what the other person meant, and should have been what you interpreted them as saying, not as them complaining that it hadn’t been falsified.
Now, when we gain more evidence that Lorentz invariance is not violated, should the probability we assign to string theory being correct, increase? Yes, somewhat. But, the ratio that is the probability it is correct divided by the probability of another theory we have which also predicts Lorentz invariance, does not increase. It does not gain relative favor.
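To spell that out in odds form (my notation, not the parent's): with evidence E = "Lorentz invariance holds", string theory S, and some alternative A,

    P(S|E) / P(A|E) = [ P(E|S) / P(E|A) ] * [ P(S) / P(A) ]

Both theories predict E with probability close to 1, so the likelihood ratio P(E|S)/P(E|A) is close to 1 and the posterior odds stay essentially at the prior odds: each theory gains support, but neither gains on the other.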
Now, you’ve mentioned a few times, youtubers giving bad arguments against string theory, and people copying those arguments. If you’re talking about Sabine, then yeah, I don’t care for her either.
However, while the “a theory is tested on its own, not in comparison to other theories” approach may be principled, I’m not sure it is really a totally accurate description of how people have evaluated theories historically.
> But there are such experiments. String theory says that the result of such an experiment is: Lorentz invariance is not violated.
This is not a new prediction... String theory makes no new predictions, I hear. I don't understand why you need to be told this.
To your point, there exist various reformulations of physics theories, like Lagrangian mechanics and Hamiltonian mechanics, which are both reformulations of Newtonian mechanics. But these don't make new predictions. They're just better for calculating or understanding certain things. That's quite different from proposing special relativity for the first time, or thermodynamics for the first time, which do make novel predictions compared to Newton.
It has delivered falsifiable postdictions though. Like, there are some measurable quantities which string theory says must be in a particular (though rather wide) finite range, and indeed the measured value is in that range. The value was measured to much greater precision than that range before it was shown that string theory implies the value being in that range though.
Uh, IIRC. I don't remember which value specifically. Some ratio of masses or something? I don't recall. And I certainly don't know the calculation.
Curious: do you have a team of people working with you, or is it mostly solo work? Your work is so valuable, I would be scared for our industry if it had a bus factor of 1.
That's an interesting hypothesis: that LLMs are fundamentally unable to produce original code.
Do you have papers to back this up? That was also my reaction when I saw some crazily accurate comments on a vibe-coded piece of code, but I couldn't prove it, and thinking about it now I think my intuition was wrong (i.e. LLMs do produce original complex code).
We can settle that question in an intuitive way: if human input is not what is driving the output, then it would be sufficient to present it with a fraction of the current inputs, say everything up to 1970, and have it generate all of the input data from 1970 onwards as output.
If that does not work then the moment you introduce AI you cap their capabilities unless humans continue to create original works to feed the AI. The conclusion - to me, at least - is that these pieces of software regurgitate their inputs, they are effectively whitewashing plagiarism, or, alternatively, their ability to generate new content is capped by some arbitrary limit relative to the inputs.
This is known as the data processing inequality. Non-invertible functions can not create more information than what is available in their inputs: https://blog.blackhc.net/2023/08/sdpi_fsvi/. Whatever arithmetic operations are involved in laundering the inputs by stripping original sources & references can not lead to novelty that wasn't already available in some combination of the inputs.
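(For reference, the usual statement: if X -> Y -> Z form a Markov chain, i.e. Z is computed from Y alone, then I(X; Z) <= I(X; Y); post-processing cannot increase the information about the source.)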
Neural networks can at best uncover latent correlations that were already available in the inputs. Expecting anything more is basically just wishful thinking.
Using this reasoning, would you argue that a new proof of a theorem adds no new information that was not present in the axioms, rules of inference and so on?
If so, I'm not sure it's a useful framing.
For novel writing, sure, I would not expect much truly interesting progress from LLMs without human input because fundamentally they are unable to have human experiences, and novels are a shadow or projection of that.
But in math – and a lot of programming – the "world" is chiefly symbolic. The whole game is searching the space for new and useful arrangements. You don’t need to create new information in an information-theoretic sense for that. Even for the non-symbolic side (say diagnosing a network issue) of computing, AIs can interact with things almost as directly as we can by running commands so they are not fundamentally disadvantaged in terms of "closing the loop" with reality or conducting experiments.
Sound deductive rules of logic can not create novelty that exceeds the inherent limits of their foundational axiomatic assumptions. You can not expect novel results from neural networks that exceed the inherent information capacity of their training corpus & the inherent biases of the neural network (encoded by its architecture). So if the training corpus is semantically unsound & inconsistent then there is no reason to expect that it will produce logically sound & semantically coherent outputs (i.e. garbage inputs → garbage outputs).
Maybe? But it also seems like you are not accounting for new information at inference time. Let's pretend I agree the LLM is a plagiarism machine that can produce no novelty in and of itself that didn't come from what it was trained on, and produces mostly garbage (I only half agree, lol, and I think "novelty" is under-specified here).
When I apply that machine (with its giant pool of pirated knowledge) _to my inputs and context_ I can get results applicable to my modestly novel situation which is not in the training data. Perhaps the output is garbage. Naturally if my situation is way out of distribution I cannot expect very good results.
But I often don't care if the results are garbage some (or even most!) of the time if I have a way to ground-truth whether they are useful to me. This might be via running a compile, a test suite, a theorem prover or mk1 eyeball. Of course the name of the game is to get agents to do this themselves and this is now fairly standard practice.
I'm not here to convince you whether Markov chains are helpful for your use cases or not. I know from personal experience that even in cases where I have a logically constrained query I will receive completely nonsensical responses¹.
It does this all the time, but as often as not it then outputs nonsense again, just different nonsense, and if you keep it running long enough it starts repeating previous errors (presumably because some sliding window is exhausted).
That's been my general experience, and that was the most recent example. People keep forgetting that unless they can independently verify the outputs, they are essentially paying OpenAI for the privilege of being very confidently gaslighted.
It would be a really nice exercise - for which I unfortunately do not have the time - to have a non-trivial conversation with the best models of the day and then to rigorously fact-check every bit of output to determine the output quality. Judging from my own (probably not a representative sample) experience it would be a very meager showing.
I use AI as a means of last resort only now and then mostly as a source of inspiration rather than a direct tool aiming to solve an issue. And like that it has been useful on occasion, but it has at least as often been a tremendous waste of time.
Theoretical "proofs" of limitations like this are always unhelpful because they're too broad, and apply just as well to humans as they do to LLMs. The result is true but it doesn't actually apply any limitation that matters.
You're confused about what applies to people & what applies to formal systems. You will continue to be confused as long as you keep thinking formal results can be applied in informal contexts.
That's true. But if we take your implied rebuttal, then current-level AI would be able to learn from current AI as well as it learns from humans, just like humans learn from other humans. So far that does not seem to be the case; in fact, AI companies do everything they can to avoid eating their own tail. They'd love eating their own tail if it was worth it.
To me that's proof positive they know their output is mangled inputs, they need that originality otherwise they will sooner or later drown in nonsense and noise. It's essentially a very complex game of Chinese whispers.
I think my track record belies your very low value and frankly cowardly comment. If you have something to say at least do it under your real username instead of a throwaway.
The whole "reproduces training data verbatim" framing is a red herring.
It reproduces _patterns from the training data_, sometimes including verbatim phrases.
The work (to discover those patterns, to figure out what works and what does not, to debug some obscure heisenbug and write a blog post about it, ...) was done by humans. Those humans should be compensated for their work, not owners of mega-corporations who found a loophole in copyright.
Pick up a book about programming from the seventies or eighties that was unlikely to have been scanned and fed into an LLM. Take a task from it and ask the LLM to write a program for it that even a student could solve within 10 minutes. If the problem was not really published before, the LLM fails spectacularly.
This does not appear to be true. Six months ago I created a small programming language. I had LLMs write hundreds of small programs in the language, using the parser, interpreter, and my spec as a guide for the language. The vast majority of these programs were either very close or exactly what I wanted. No prior source existed for the programming language because I created it whole cloth days earlier.
Obviously you accidentally recreated a language from the 70s :P
(I created a template language for JSON and added branching and conditionals, and realized I had a whole programming language. I was really proud of my originality until I was reading Ted Nelson's Computer Lib/Dream Machines and found out I had reinvented TRAC, and to some extent XSLT. Anyway, LLMs are very good at reasoning about it because it can be constrained by a JSON schema. People who think LLMs only regurgitate haven't given them a fair shot.)
I think the key is that the LLM has no trouble mapping from one "embedding" of the language to another (the task they are best at!), and that appears extremely intelligent to us humans, but it certainly is not all there is to intelligence.
But just take a look at how LLMs struggle to handle dynamical, complex systems such as the one in the "vending machine" paper published some time ago. Those kinds of tasks, which we humans tend to think of as "less intelligent" than, say, converting human language to a C++ implementation, seem to have some kind of higher (or at least different) complexity than the embedding mapping done by LLMs. Maybe that's what we typically refer to as creativity? And if so, modern LLMs certainly struggle with it!
Quite sci-fi that we have created a "mind" so alien we struggle to even agree on the word to define what it's doing :)
It's telling that you can't actually provide a single concrete example - because, of course, anyone skilled with LLMs would be able to trivially solve any such example within 10 minutes.
Perhaps the occasional program that relies heavily on precise visual alignment will fail - but I dare say if we give the LLM the same grace we'd give a visually impaired designer, it can do exactly as well.
I recently asked an LLM to give me one of the most basic and well-documented algorithms in the world: a blocked matrix multiply. It's essentially a few nested loops and some constants for the block size.
It failed massively, spitting out garbage code, where the comments claimed to use blocking access patterns, but the code did not actually use them at all.
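For comparison, the whole algorithm is roughly this (a Rust sketch with an arbitrary tile size, for row-major n x n matrices):

    const BLOCK: usize = 64; // tile size; a real implementation would tune this

    // C += A * B, all stored row-major as flat slices of length n*n.
    fn blocked_matmul(a: &[f64], b: &[f64], c: &mut [f64], n: usize) {
        for ii in (0..n).step_by(BLOCK) {
            for kk in (0..n).step_by(BLOCK) {
                for jj in (0..n).step_by(BLOCK) {
                    // Work on one BLOCK x BLOCK tile, clamped at the matrix edge.
                    for i in ii..(ii + BLOCK).min(n) {
                        for k in kk..(kk + BLOCK).min(n) {
                            let aik = a[i * n + k];
                            for j in jj..(jj + BLOCK).min(n) {
                                c[i * n + j] += aik * b[k * n + j];
                            }
                        }
                    }
                }
            }
        }
    }

The only subtlety is clamping the tile edges at n; everything else is just loop order.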
LLMs are, frankly, nearly useless for programming. They may solve a problem every once in a while, but once you look at the code, you notice it's either directly plagiarized or bad quality (or both, I suppose, in the latter case).
I have a very anecdotal, but interesting, counterexample.
I recently asked Gemini 3 Pro to create an RSS feed reader type of experience by using XSLT to style and layout an OPML file. I specifically wanted it to use a server-side proxy for CORS, pass through caching headers in the proxy to leverage standard HTTP caching, and I needed all feed entries for any feed in the OPML to be combined into a single chronological feed.
It initially told me multiple times that it wasn't possible (it also reminded me that Google is getting rid of XSLT). Regardless, after I reiterated multiple times that it is possible, it finally decided to make a temporary POC. That POC worked on the first try, with only one follow-up to standardize date formatting with support for Atom and RSS.
I obviously can't say the code was novel, though I would be a bit surprised if it trained on that task enough for it to remember roughly the full implementation and still claimed it was impossible.
Why do you believe that to be a counterexample? In fragmentary form all of these elements must have been present in the input; the question is really how large the largest re-usable fragment was, and whether or not, barring some transformations, you could trace it back to the original. I've done some experiments along the same lines to see what it spits out, and what I noticed is that from example to example the programming style changed drastically, to the point that I suspect it was mimicking even the style and not just the substance of the input data, and this over chunks of code long enough that they would definitely clear the bar for plagiarism.
> In fragmentary form all of these elements must have been present in the input
Yes, and Shakespeare merely copied the existing 26 letters of the English alphabet. What magical process do you think students are using when they read and re-combine learned examples to solve assignments?
This same argument has now been made a couple of times in this thread (in different guises) and does absolutely nothing to move the conversation forward.
Words and letters are not copyrightable patterns in and of themselves. It is the composition of words and letters that we consider to be original creations and 'the bard' put them in a meaningful and original order not seen before, which established his reputation as a playwright.
LLMs can definitely produce original other stuff: ask one to create an original poem on an extremely specific niche subject and it will do so. You can specify the niche subject to the point where it is incredibly unlikely that there is a poem on that subject in its training data, and it will still produce an original poem on that subject [0]. The well-known "otter using wifi on a plane" series of images [1] is another example: this is not in the training data (well, it is now, because it's well known, but you get the idea).
Is there something unique about code, that is different from language (or images), that would make it impossible for an LLM to produce original code? I don't believe so, but I'm willing to be convinced.
I think this switches the burden of proof: we know LLMs can produce original content in other contexts. Why would they not be able to create original code?
[0] Ever curious, I tested this assumption. I got Claude to write an original limerick about goats oiling their beards with olive oil, which was the first reasonable thing I could think of as a suitably niche subject. I googled the result and could not find anything close to it. I then asked it to produce another limerick on the same subject, and it produced a different limerick, so obviously not just repeating training data.
No, it transformed your prompt. Another person giving it the same prompt will get the same result when starting from the same state. f('your prompt here') is a transformation of your prompt based on hidden state.
This just reeks of a lack of understanding of how transformers work. Unlike Markov Chains that can only regurgitate known sequences, transformers can actually make new combinations.
What's your proof that the average college student can produce original code? I'm reasonably certain I can get an LLM to write something that will pass any test that the average college student can, as far as that goes.
I'm not asking about averages. I'm asking about any. There is no need to perform an academic research study to prove that humans are capable of writing original code because the existence of our conversation right now is the counter-example to disprove the negation.
Yes, it is true that a lot of humans remix existing code. But not all. It has yet to be proven that any LLM is doing something more than remixing code.
I would submit as evidence for this idea (that LLMs are not capable of writing original code) the fact that not a single company using LLM-based AI coding has developed a novel product that has outpaced its competition. In any category. If AI really made people "10x" more productive, then companies that adopted AI a year ago should be 10 years ahead of their competition. Substitute any value N > 1 you want and you won't see it. Indeed, given the stories we're seeing of the massive amounts of waste occurring within AI startups and companies adopting AI, it would suggest that N < 1.
I recently had a conversation with a coworker who thought Temporal was too complex for a first implementation. However, after looking at the documentation it seems that the tech is very approachable. What makes using Temporal complex?
Perhaps he found the amount of infra required to use it substantial. You can run an async task queue on SQLite with a loop in any language (rough sketch below), whereas Temporal has an app SDK etc.
I do not agree that it is complex, but that's what I'd hypothesize as the reason someone might think so.
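For what it's worth, the "loop over SQLite" version really is small. A rough Rust sketch using the rusqlite crate and an assumed tasks(id, payload, status) table (this is a caricature of the idea, nothing to do with Temporal's actual API):

    use rusqlite::{params, Connection};
    use std::{thread, time::Duration};

    // Polling worker over a SQLite-backed task table. Assumed schema:
    //   CREATE TABLE tasks (id INTEGER PRIMARY KEY, payload TEXT, status TEXT);
    fn main() -> rusqlite::Result<()> {
        let conn = Connection::open("tasks.db")?;
        loop {
            // Grab the oldest pending task; for brevity, errors and "no rows" both count as "nothing to do".
            let next: Option<(i64, String)> = conn
                .query_row(
                    "SELECT id, payload FROM tasks WHERE status = 'pending' ORDER BY id LIMIT 1",
                    params![],
                    |row| Ok((row.get(0)?, row.get(1)?)),
                )
                .ok();
            match next {
                Some((id, payload)) => {
                    println!("processing task {id}: {payload}"); // the real work would go here
                    conn.execute("UPDATE tasks SET status = 'done' WHERE id = ?1", params![id])?;
                }
                None => thread::sleep(Duration::from_secs(1)),
            }
        }
    }

The gap between this and Temporal is everything around it: retries, timers, multiple workers without double-claiming, and visibility into failures.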