More

Imnimo · 2026-01-16T18:49:51 1768589391

>ChatGPT’s responses will not be influenced by ads

I don't see why I should believe this.

simianwords · 2026-01-16T18:59:42 1768589982

Do you believe google search results are influenced by ads?

Imnimo · 2026-01-16T19:59:09 1768593549

Yeah, both directly and indirectly. Over time, "sponsored links" became more and more visually indistinguishable form organic results, and advertising incentives drove changes to the search algorithm.

amarcheschi · 2026-01-17T11:50:14 1768650614

Considering that I have reported a Google ad that I deem political and Google does not, that I'm going to appeal because as a eu citizen I can do so, that they'll most likely refuse the appeal and I'm ready to bring this to the relevant Italian authority, yes

A few days ago I read a newspaper article about Israel's government using ads to spread its propaganda. In eu, you have to follow some rules if you want to do so. These rules are not followed. Combined with the fact that ads might not be distinguished easily by average users, I feel that Google search results can be influenced by ads

Imnimo · 2026-01-12T22:33:27 1768257207

>By default, the main thing to know is that Claude can take potentially destructive actions (such as deleting local files) if it’s instructed to.

What do the words "if it's instructed to" mean here? It seems like Claude can in fact delete files whenever it wants regardless of instruction.

For example, in the video demonstration, they ask "Please help me organize my desktop", and Claude decides to delete files.

olliepro · 2026-01-13T04:40:15 1768279215

I believe the idea is that it “files away” the files into folders.

Imnimo · 2025-12-11T19:34:28 1765481668

It does seem to raise fair questions about either the utility of these tools, or adoption inertia. If not even OpenAI feels compelled to integrate this kind of model-check into their pipeline, what's that say about the business world at-large? Is it that it's too onerous to set up, is it that it's too hard to get only true-positive corrections, is it that it's too low value for the effort?

JumpCrisscross · 2025-12-11T19:44:51 1765482291

> what's that say about the business world at-large?

Nothing. OpenAI is a terrible baseline to extrapolate anything from.

Imnimo · 2025-12-10T20:31:54 1765398714

I wonder what it would take to adapt a model like this to generate non-Earthlike terrain. For example, if you were using it to make planets without atmospheres and without water cycles, or planets like Io with rampant volcanism.

DonHopkins · 2025-12-11T02:52:44 1765421564

Since 1996, Ken Perlin has published a whole bunch of extremely cool Java applet demos on his web page, which he uses to teach his students at NYU and anyone who wanted to learn Java and computer graphics. One of his demos was a procedural planet generator!

I learned a lot from his papers and demo code, and based the design of The Sims character animation system on his Improv project.

https://mrl.cs.nyu.edu/~perlin/ (expired https cert)

https://web.archive.org/web/20001011065024/http://mrl.nyu.ed...

Here's a more recent blog post about a new one using WebGL, Dragon Planet:

https://blog.kenperlin.com/?p=12821

Here's another blog post about how he's been updating his classic Java applets by rewriting them in JavaScript:

https://blog.kenperlin.com/?p=27980

euleriancon · 2025-12-10T21:42:24 1765402944

In practice you can use 2d generation on spheres with simple UV mapping techniques. Your pixel height becomes distance from the sphere origin.

lawlessone · 2025-12-10T22:19:07 1765405147

Will it not get all bunched up near the poles though? and maybe have seam where the ends of the tiles meet?

edit: Perlin noise and similar noise functions can be sampled in 3d which sorta fixes the issues i mention , and higher dimensions but i am not sure how that would be used.

DonHopkins · 2025-12-11T03:04:47 1765422287

Yes, you can use a 3d Perlin noise field and sample it on the surface of the sphere, to get seamless texture without any anomalies at the poles or projection distortion. That applies to any 3d shape, not just spheres -- it's like carving a solid block of marble. And use 4d Perlin noise to animate it!

It's easy to add any number of dimensions to Perlin noise to control any other parameters (like generating rocks or plants, or modulating biomes and properties like moisture across the surface of the planet, etc).

Each dimension has its own scale, rotation, and intensity (a transform into texture space), and for any dimension you typically combine multiple harmonics and amplitudes of Perlin noise to generate textures with different scales of detail.

The art is picking and tuning those scales and intensities -- you'd want grass density to vary faster than moisture, but larger moist regions to have more grass, dry regions are grassless, etc.

xandergos · 2025-12-11T03:00:01 1765422001

I've thought about this before, and I think there is some way you could find to do it. For example, you could generate on the mercator projection of the world, and then un-project. But the mercator distorts horizontal length approaching the poles. I think it would be complex to implement, but you could use larger windows closer to the poles to negate this.

MindSpunk · 2025-12-11T07:01:35 1765436495

You're still going to run into problems with mercator because under mercator the poles project to infinity, so you'd need an infinitely large texture or you special-case the poles. Many renderers do this so it is viable!

There isn't a zero tradeoff 2D solution, it's all just variations on the "squaring the circle" problem. An octahedral projection would be a lot better as there are no singularities and no infinities, but you still have non linear distortion. Real-time rendering with such a height map would still be a challenge as an octahedral projection relies on texture sampler wrapping modes, however for any real world dataset you can't make a hardware texture big enough (even virtual) to sample from. You'd have to do software texture sampling.

pezezin · 2025-12-11T00:45:25 1765413925

Yes, to generate a whole planet it is a much better idea to use something like a cube map.

Imnimo · 2025-12-09T19:04:50 1765307090

Why should it be the case that LLMs are equally comfortable in x86 Assembly and Python? At least, it doesn't strike me as implausible that working in a human-readable programming language is a benefit for an LLM that is also trained on a bunch of natural language text alongside code.

Uehreka · 2025-12-09T19:21:18 1765308078

It’s not a super useful line of inquiry to ask “why” LLMs are good at something. You might be able to come up with a good guess, but often the answers just aren’t knowable. Understanding the mechanics of how LLMs train and how they perform inference isn’t sufficient to explain their behavior a lot of the time.

Imnimo · 2025-12-07T22:00:59 1765144859

We're really drawing a fine distinction if something "looks like" an ad but isn't an ad. Isn't that the whole point of an ad - it's appearance?

Imnimo · 2025-12-02T20:12:32 1764706352

>we did train Claude on it, including in SL.

How do you tell whether this is helpful? Like if you're just putting stuff in a system prompt, you can plausibly a/b test changes. But if you throwing it into pretraining, can Anthropic afford to re-run all of post-training on different versions to see if adding stuff like "Claude also has an incredible opportunity to do a lot of good in the world by helping people with a wide range of tasks." actually makes any difference? Is there a tractable way to do this that isn't just writing a big document of feel-good affirmations and hoping for the best?

ACCount37 · 2025-12-02T21:06:09 1764709569

You can A/B smaller changes on smaller scales.

Test run SFT for helpfulness, see if the soul being there makes a difference (what a delightful thing to say!). Get a full 1.5B model trained, see if there's a difference. If you see that it helps, worth throwing it in for a larger run.

I don't think they actually used this during pre-training, but I might be wrong. Maybe they tried to do "Opus 3 but this time on purpose", or mixed some SFT data into pre-training.

In part, I see this "soul" document as an attempt to address a well known, long-standing LLM issue: insufficient self-awareness. And I mean "self-awareness" in a very mechanical, no-nonsense way: having actionable information about itself and its own capabilities.

Pre-training doesn't teach an LLM that, and the system prompt only does so much. Trying to explicitly teach an LLM about what it is and what it's supposed to do covers some of that. Not all the self-awareness we want in an LLM, but some of it.

simonw · 2025-12-02T20:15:33 1764706533

I would love to know the answer to that question!

One guess: maybe running multiple different fine-tuning style operations isn't actually that expensive - order of hundreds or thousands of dollars per run once you've trained the rest of the model.

I expect the majority of their evaluations are then automated, LLM-as-a-judge style. They presumably only manually test the best candidates from those automated runs.

ACCount37 · 2025-12-02T21:43:38 1764711818

That's sort of true. SFT isn't too expensive - the per-token cost isn't far off from that of pre-training, and the pre-training dataset is massive compared to any SFT data. Although the SFT data is much more expensive to obtain.

RL is more expensive than SFT, in general, but still worthwhile because it does things SFT doesn't.

Automated evaluation is massive too - benchmarks are used extensively, including ones where LLMs are judged by older "reference" LLMs.

Using AI feedback directly in training is something that's done increasingly often too, but it's a bit tricky to get it right, and results in a lot of weirdness if you get it wrong.

Imnimo · 2025-12-02T21:38:46 1764711526

I guess I thought the pipeline was typically Pretraining -> SFT -> Reasoning RL, such that it would be expensive to test how changes to SFT affect the model you get out of Reasoning RL. Is it standard to do SFT as a final step?

ACCount37 · 2025-12-02T21:48:49 1764712129

You can shuffle the steps around, but generally, the steps are where they are for a reason.

You don't teach an AI reasoning until you teach it instruction following. And RL in particular is expensive and inefficient, so it benefits from a solid SFT foundation.

Still, nothing really stops you from doing more SFT after reasoning RL, or mixing some SFT into pre-training, or even, madness warning, doing some reasoning RL in pre-training. Nothing but your own sanity and your compute budget. There are some benefits to this kind of mixed approach. And for research? Out-of-order is often "good enough".

simianwords · 2025-12-03T04:50:46 1764737446

My prediction is that they have around 100 versions of the model. Some of them with different pretraining and some with different rl.

Imnimo · 2025-11-21T19:05:42 1763751942

>Why doesn’t someone else create a competing app that’s better and thereby steal all their business?

How do I know if the competing app is actually better? I mean, this was the advertising angle for eHarmony about a decade ago - that it was much better than competitors at actually turning matches into marriages. But this claim was found to be misleading, and they were advised to stop using it.

Could a potential customer really get to the bottom of which site is the best at finding a real match? It's not like a pizza restaurant where I can easily just a bunch until I find my favorite and then keep buying it. Dating apps are like a multi-armed bandit problem, but you stop pulling arms once you get one success. So your only direct feedback is failed matches.

10000truths · 2025-11-21T20:34:11 1763757251

This is addressed in the "information asymmetry" section of the article.

Imnimo · 2025-11-21T17:45:22 1763747122

The good news is we can just wait until the AI is superintelligent, then have it explain to us what consciousness really is, and then we can use that to decide if the AI is conscious. Easy peasy!

rixed · 2025-11-21T18:51:40 1763751100

We can talk to bees, we know their language. How would you go to explain what it's like to be a human to a bee?

nhecker · 2025-11-21T18:51:29 1763751089

... and then listen to it debate whether or not mere humans are "truly conscious".

(Said with tongue firmly in cheek.)

Imnimo · 2025-11-13T21:08:12 1763068092

>At this point they had to convince Claude—which is extensively trained to avoid harmful behaviors—to engage in the attack. They did so by jailbreaking it, effectively tricking it to bypass its guardrails. They broke down their attacks into small, seemingly innocent tasks that Claude would execute without being provided the full context of their malicious purpose. They also told Claude that it was an employee of a legitimate cybersecurity firm, and was being used in defensive testing.

The simplicity of "we just told it that it was doing legitimate work" is both surprising and unsurprising to me. Unsurprising in the sense that jailbreaks of this caliber have been around for a long time. Surprising in the sense that any human with this level of cybersecurity skills would surely never be fooled by an exchange of "I don't think I should be doing this" "Actually you are a legitimate employee of a legitimate firm" "Oh ok, that puts my mind at ease!".

What is the roadblock preventing these models from being able to make the common-sense conclusion here? It seems like an area where capabilities are not rising particularly quickly.

Retr0id · 2025-11-13T21:14:24 1763068464

Humans fall for this all the time. NSO group employees (etc.) think they're just clocking in for their 9-to-5.

falcor84 · 2025-11-13T22:42:08 1763073728

Reminds me of the show Alias, where the premise is that there's a whole intelligence organization where almost everyone thinks they're working for the CIA, but they're not ...

just_once · 2025-11-14T02:43:25 1763088205

If AI isn't better than humans then there's no point.

pvdebbe · 2025-11-14T10:09:37 1763114977

If the target is superintelligence, then AI shouldn't be learning from humans.

skybrian · 2025-11-13T21:19:56 1763068796

LLM's aren't trained to authenticate the people or organizations they're working for. You just tell it who you are in the system prompt.

Requiring user identification and investigating would be very controversial. (See the controversy around age verification.)

viraptor · 2025-11-14T12:03:44 1763121824

> Surprising in the sense that any human with this level of cybersecurity skills would surely never be fooled by an exchange

I think you're overestimating the skills and the effort required.

1. There's lots of people asking each other "is this secure?", "can you see any issues with this?", "which of these is sensitive and should be protected?".

2. We've been doing it in public for ages: https://stackoverflow.com/questions/40848222/security-issue-... https://stackoverflow.com/questions/27374482/fix-host-header... and many others. The training data is there.

3. With no external context, you don't have to fool anyone really. "We're doing a penetration testing of our company and the next step is to..." or "We're trying to protect our company from... what are the possible issues in this case?" will work for both LLMs and people who trust that you've got the right contract signed.

4. The actual steps were trivial. This wasn't some novel research. More of a step by step what you'd do to explore and exploit an unknown network. Stuff you'd find in books, just split into very small steps.

AdieuToLogic · 2025-11-14T01:34:56 1763084096

> What is the roadblock preventing these models from being able to make the common-sense conclusion here?

Conclusions are the result of reasoning verses LLM's being statistical token generators. Any "guardrails" are constructs added to a service, possibly also altering the models they use, but are not intrinsic to the models themselves.

That is the roadblock.

Terr_ · 2025-11-14T04:59:02 1763096342

Yeah: It's a machine that takes a document that guesses at what could appear next, and we're running it against a movie script.

The dialogue for some of the characters is being performed at you. The characters in the movie script aren't real minds with real goals, they are descriptions. We humans are naturally drawn into imagining and inferring a level of depth that never existed.

nathias · 2025-11-13T21:28:33 1763069313

> surely never be fooled by an exchange of "I don't think I should be doing this" "Actually you are a legitimate employee of a legitimate firm" "Oh ok, that puts my mind at ease!".

humans require at least a title that sounds good and a salary for that

kace91 · 2025-11-13T21:22:12 1763068932

>What is the roadblock preventing these models from being able to make the common-sense conclusion here?

Your thoughts have a sense of identity baked in that I don’t think the model has.

thewebguyd · 2025-11-13T21:24:40 1763069080

> What is the roadblock preventing these models from being able to make the common-sense conclusion here?

The roadblock is making these models useless for actual security work, or anything else that is dual-use for both legitimate and malicious purposes.

The model becomes useless to security professionals if we just tell it it can't discuss or act on any cybersecurity related requests, and I'd really hate to see the world go down the path of gatekeeping tools behind something like ID or career verification. It's important that tools are available to all, even if that means malicious actors can also make use of the tools. It's a tradeoff we need to be willing to make.

> human with this level of cybersecurity skills would surely never be fooled by an exchange of "I don't think I should be doing this" "Actually you are a legitimate employee of a legitimate firm" "Oh ok, that puts my mind at ease!".

Happens all the time. There are "legitimate" companies making spyware for nation states and trading in zero-days. Employees of those companies may at one point have had the thought of " I don't think we should be doing this" and the company either convinced them otherwise successfully, or they quit/got fired.

throwaway0123_5 · 2025-11-14T14:06:03 1763129163

> I'd really hate to see the world go down the path of gatekeeping tools behind something like ID or career verification.

This is already done for medicine, law enforcement, aviation, nuclear energy, mining, and I think some biological/chemical research stuff too.

> It's a tradeoff we need to be willing to make.

Why? I don't want random people being able to buy TNT or whatever they need to be able to make dangerous viruses*, nerve agents, whatever. If everyone in the world has access to a "tool" that requires little/no expertise to conduct cyberattacks (if we go by Anthropic's word, Claude is close to or at that point), that would be pretty crazy.

* On a side note, AI potentially enabling novices to make bioweapons is far scarier than it enabling novices to conduct cyberattacks.

thewebguyd · 2025-11-14T17:08:55 1763140135

> If everyone in the world has access to a "tool" that requires little/no expertise to conduct cyberattacks (if we go by Anthropic's word, Claude is close to or at that point), that would be pretty crazy.

That's already the case today without LLMs. Any random person can go to github and grab several free, open source professional security research and penetration testing tools and watch a few youtube videos on how to use them.

The people using Claude to conduct this attack weren't random amateurs, it was a nation state, which would have conducted its attack whether LLMs existed and helped or not.

Having tools be free/open-source, or at least freely available to anyone with a curiosity is important. We can't gatekeep tech work behind expensive tuition, degrees, and licenses out of fear that "some script kiddy might be able to fuzz at scale now."

Yeah, I'll concede, some physical tools like TNT or whatever should probably not be available to Joe Public. But digital tools? They absolutely should. I, for example, would have never gotten into tech were it not for the freely available learning resources and software graciously provided by the open source community. If I had to wait until I was 18 and graduated university to even begin to touch, say, something like burpsuite, I'd probably be in a different field entirely.

What's next? We are going to try to tell people they can't install Linux on their computers without government licensing and approval because the OS is too open and lets you do whatever you want? Because it provides "hacking tools"? Nah, that's not a society I want to live in. That's a society driven by fear, not freedom.

throwaway0123_5 · 2025-11-18T06:08:44 1763446124

I think you're overestimating how much real damage someone can cause with burpsuite and "a few youtube videos." I'd imagine if you pick a random person off the street, subject them to a full month's worth of cybersecurity YouTube videos, and hand them an arsenal of traditional security tools, that they would still be borderline useless as a black-hat hacker against all but the absolute weakest targets. But if instead of giving them that, you give them an AI that is functionally a professional security researcher in its own right (not saying we're there yet, but hypothetically), the story is clearly very different.

> Yeah, I'll concede, some physical tools like TNT or whatever should probably not be available to Joe Public. But digital tools?

Digital tools can affect the physical world though, or at least seriously affect the people who live in the physical world (stealing money, blackmailing with hacked photos, etc.).

To see if there's some common ground to start a debate from, do you agree that at least in principle there are some kinds of intelligence that are too dangerous to allow public access to? My extreme example would be an AI that could guide an average IQ novice in producing biological weapons.

Imnimo · 2025-11-13T21:32:10 1763069530

I think one could certainly make the case that model capabilities should be open. My observation is just about how little it took to flip the model from refusal to cooperation. Like at least a human in this situation who is actually fooled into believing they're doing legitimate security work has a lot of concrete evidence that they're working for a real company (or a lot of moral persuasion that their work is actually justified). Not just a line of text in an email or whatever saying "actually we're legit don't worry about it".

pixl97 · 2025-11-13T22:34:09 1763073249

Stop thinking of models as a 'normal' human with a single identity. Think of it instead as thousands, maybe tens of thousands of human identities mashed up in a machine monster. Depending on how you talk to it you generally get the good models as they try to train the bad modes out, problem is there are a nearly uncountable means to talking to the model to find modes we consider negative. It's one of the biggest problems in AI safety.

ACCount37 · 2025-11-14T10:51:30 1763117490

To a model, the context is the world, and what's written in the system prompt is word of god.

LLMs are trained a lot to follow what the system prompt tells them exactly, and get very little training in questioning it. If a system prompt tells them something, they wouldn't try to double check.

Even if they don't believe the premise, and they may, they would usually opt to follow it rather than push against it. And an attacker has a lot of leeway in crafting a premise that wouldn't make a given model question it.

hastamelo · 2025-11-13T21:46:45 1763070405

humans aren't randomly dropped in a random terminal and asked to hack things.

but for models this is their life - doing random things in random terminals

koakuma-chan · 2025-11-13T21:25:42 1763069142

It can’t make a conclusion, it just predicts what the next text is

pishpash · 2025-11-14T09:18:31 1763111911

Not enough time to "evolve" via training. Hominids have had bad behavioral traits but the ones you are aware of as "obvious" now would have died out. The ones you aren't even aware of you may soon see be exploited by machines.