
I've found that an effective tactic for larger, more complex tasks is to tell it "Don't write any code now. I'm going to describe each of the steps of the problem in more detail. The rough outline is going to be 1) Read this input 2) Generate these candidates 3) Apply heuristics to score candidates 4) Prioritize and rank candidates 5) Come up with this data structure reflecting the output 6) Write the output back to the DB in this schema". Claude will then go and write a TODO list in the code (and possibly claude.md if you've run /init), and prompt you for the details of each stage. I've even done this for an hour, told Claude "I have to stop now. Generate code for the finished stages and write out comments so you can pick up where you left off next time", and then been able to pick up next time with minimal fuss.
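For concreteness, a staged prompt along those lines might look like this (the wording and table names are illustrative placeholders, not a canonical template):

    Don't write any code yet. I'm going to describe each step in detail.
    Rough outline:
      1) Read the input rows from the `events` table
      2) Generate candidate matches
      3) Apply heuristics to score the candidates
      4) Prioritize and rank the candidates
      5) Build the output data structure
      6) Write the output back to the DB in the `scores` schema
    Keep a TODO list in the code, and prompt me for the details of each
    stage before implementing it.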


FYI: You can force "Plan mode" by pressing shift-tab. That will prevent it from eagerly implementing stuff.


> That will prevent it from eagerly implementing stuff.

In theory. In practice, it's not a very secure sandbox and Claude will happily go around updating files if you insist / the prompt is bad / it goes off on a tangent.

I really should just set up a completely sandboxed VM for it so that I don't care if it goes rm-rf happy.


Plan mode disables the tools, so I don’t see how it would do that.

A sandboxed devcontainer is worth setting up though. Lets me run it with --dangerously-skip-permissions.
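As a rough sketch, even a throwaway container gets you most of the way there (the image, mount paths, and install step below are assumptions, not a blessed setup; IIRC Anthropic's claude-code repo also ships a reference devcontainer you can adapt):

    # Disposable sandbox: a stray rm -rf can only touch the mounted
    # project dir. Image and paths here are examples, adapt as needed.
    docker run --rm -it \
      -v "$PWD":/workspace -w /workspace \
      -e ANTHROPIC_API_KEY \
      node:20 bash -c \
      'npm install -g @anthropic-ai/claude-code && claude --dangerously-skip-permissions'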


How can it plan if it doesn't have access to file-read, search, and bash tools to investigate things? If it has access to bash tools, then it's going to write code via echo or sed.


It has file read and search, but not bash, AFAIK.


I don't know either but I've seen it write to files in plan mode. Very confusing.


It does not write anything in plan mode; it's documented here that it has only read-only tools available in plan mode: https://docs.anthropic.com/en/docs/claude-code/common-workfl...

But here's the fine print: it has an "exit plan mode" tool, documented here: https://minusx.ai/blog/decoding-claude-code/#appendix

So it can exit plan mode on its own and you wouldn't know!


Ok, it's done it to me 3 times today, so I don't know what to tell you. I remind it that it's in plan mode and it goes "oh no I shouldn't have modified that file then!"


I've never seen it write a file in plan mode either.


That's not possible. You are misremembering.


It’s entirely possible. Claude’s security model for subagents/tasks is incoherent and buggy, far below the standard they set elsewhere in their product, and plan mode can use subagents/tasks for research.

Permission limitations on the root agent have, in many cases, not been propagated to child agents, which have been able to execute different commands. The documentation is incomplete and unclear, and even where it is clear, it uses a different syntax with different limitations than the one used to configure permissions for the root agent. When you ask Claude itself to generate agent configurations, as is recommended, it will generate permissions that do not exist anywhere in the documentation and may or may not be valid, but no error is emitted if an invalid permission is set. If you ask it to explain, it gets confused by its own documentation and tells you it doesn’t know why it did that. I’m not sure if it’s hallucinating or if the agent-generating agent has access to internal details that are not documented anywhere and that the normal agent can’t see.
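For reference, a subagent is defined as a Markdown file with YAML frontmatter, roughly like this (the tool list below is my guess at a read-only set, not a verified allowlist, which is exactly the problem):

    ---
    # .claude/agents/researcher.md -- sketch of the documented frontmatter;
    # the tools value is an assumption, not a verified permission set
    name: researcher
    description: Read-only investigation agent used during planning
    tools: Read, Grep, Glob
    ---
    Investigate the codebase and report findings. Never modify files.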

Anthropic is pretty consistently the best in this space in terms of security and product quality. They seem to actually care about doing software engineering properly. (I’ve personally discovered security bugs in several competing products that are more severe and exploitable than what I’m talking about here.) I have a ton of respect for Anthropic. Unfortunately, when it comes to subagents in Claude Code, they are not living up to the standard they have set.


I've seen it run commands that it naively assumed were just reading files or searching directories.

I.e., not its own tools, but command-line executables.

Its assumptions about those commands, and specifically the way it ran them, were correct.

But I have seen it run commands in plan mode.


No, it is possible. I just got it to write files, using both Bash and its Write tool, while in plan mode right now.


3 times today. I don't know what to say besides that it often tries to edit files in plan mode for me.


I've had it do it in plan mode.

Nothing dangerous, but the limits are more like suggestions, as the Pirate code says.


How does a token predictor “apply heuristics to score candidates”? Is it running a tool, such as a Python script it writes for scoring candidates? If not, isn’t it just pulling some statistically-likely “score” out of its weights rather than actually calculating one?


You can think of the K(=key) matrix in attention as a neural network where each token is turned into a tiny classifier network with multiple inputs and a single output.

The softmax activation function picks the most promising activations for a given output token.

The V(=value) matrix forms another neural network where each token is turned into a tiny regressor network that accepts the activation as an input and produces multiple outputs; these are summed to produce an intermediate token, which is then fed into the MLP layer.

From this perspective the transformer architecture is building neural networks at runtime.

But there are some pretty obvious limitations here: The LLM operates on tokens, which means it can only operate on what is in the KV-cache/context window. If the candidates are not in the context window, it can't score them.
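As a toy sketch of that framing (NumPy, random weights, purely illustrative; the dimensions are not from any real model):

    # One attention head as "tiny networks built at runtime": each row of
    # Q@K.T is a per-token classifier over the context, softmax picks the
    # promising activations, and A@V sums value rows into an intermediate
    # token that would feed the MLP layer.
    import numpy as np

    def softmax(x, axis=-1):
        e = np.exp(x - x.max(axis=axis, keepdims=True))
        return e / e.sum(axis=axis, keepdims=True)

    T, d = 8, 64                      # tokens in context, model width
    X = np.random.randn(T, d)         # token embeddings (the context)
    Wq, Wk, Wv = (np.random.randn(d, d) for _ in range(3))

    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(d)     # each token "classifies" every other
    A = softmax(scores)               # pick the most promising activations
    out = A @ V                       # summed V outputs -> intermediate token

    # Note: everything lives in X. If a candidate isn't in the context
    # window, there is no row for it, and nothing can score it.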


I’m not sure if I’m just misunderstanding or if we’re talking about two different things. I know at a high level how transformers/LLMs decide the next token in the response they are generating.

My question to the post I replied to was basically: given a coding problem and a list of possible solutions (candidates), how can an LLM generate a meaningful numerical score for each candidate and then say this one is a better solution than that one?


Token prediction is the interface. The implementation is a universal function approximator communicating through the token weights.



