tacoooooooo · 2026-01-14T19:19:46 1768418386

tacoooooooo · 2026-01-12T19:44:15 1768247055

This is an interesting read, and while I support being nice to every_thing_ in principle, most of the research into this actually shows that being mean yields better results.

bubblegumcrisis · 2026-01-12T20:27:32 1768249652

I've read the blurb from previous years about doing one-shots with threats of death, etc. - but I've never seen that for long, many-prompt sessions.

I wonder - if you hired a programmer for a day, trapped them in a cage, and then threatened them, maybe it would be more productive for a while. I mean, if I were writing that book, I could see how they would do great work for a bit.

tacoooooooo · 2026-01-12T19:37:57 1768246677

This looks pretty cool. I keep seeing people (and am myself) using claude code for more and more _non-dev_ work. Managing different aspects of life, work, etc. Anthropic has built the best harness right now. Building out the UI makes sense to get genpop adoption.

ai-christianson · 2026-01-12T20:16:02 1768248962

Yeah, the harness quality matters a lot. We're seeing the same pattern at Gobii - started building browser-native agents and quickly realized most of the interesting workflows aren't "code this feature" but "navigate this nightmare enterprise SaaS and do the thing I actually need done." The gap between what devs use Claude Code for vs. what everyone else needs is mostly just the interface.

tacoooooooo · 2026-01-10T16:55:26 1768064126

fizsh sounds really cool, but the last commit was 7+ years ago. do you run into any issues? https://github.com/zsh-users/fizsh

tmsbrg · 2026-01-10T20:14:41 1768076081

I never noticed any issues, actually. I guess the zsh base is solid and stable.

tacoooooooo · 2026-01-09T14:44:09 1767969849

not sure there are any models yet that give you the quality you need for this and still run on your mbp

tacoooooooo · 2026-01-08T15:37:07 1767886627

This is a wildly out of touch thing to say

fourside · 2026-01-08T15:38:31 1767886711

Did you read the article?

dhorthy · 2026-01-08T15:51:02 1767887462

I read it. I agree this is out of touch. Not because the things it's saying are wrong, but because the things it's saying have been true for almost a year now. They are not "getting worse" they "have been bad". I am staggered to find this article qualifies as "news".

If you're going to write about something that's been true and discussed widely online for a year+, at least have the awareness/integrity to not brand it as "this new thing is happening".

flumpcakes · 2026-01-08T15:59:31 1767887971

Perhaps the advertising money from the big AI money sinks is running out and we are finally seeing more AI scepticism articles.

minimaxir · 2026-01-08T16:00:52 1767888052

> They are not "getting worse" they "have been bad".

The agents available in January 2025 were much much worse than the agents available in November 2025.

Snuggly73 · 2026-01-08T16:16:17 1767888977

Yes, and for some cases no.

The models have gotten very good, but I'd rather have an obviously broken pile of crap that I can spot immediately than something that is deep fried with RL to always succeed but has subtle problems that someone will lgtm :( I guess it's not much different with human-written code, but the models seem to have weirdly inhuman failures - like, you would just skim some code, because you just can't believe that anyone could get it wrong, and it turns out to be wrong.

minimaxir · 2026-01-08T16:18:33 1767889113

That's what test cases are for, which is good for both humans and nonhumans.

Snuggly73 · 2026-01-08T16:26:16 1767889576

Test cases are great, but not a total solution. Can you write a test case for the add_numbers(a, b) function?

Snuggly73 · 2026-01-08T16:42:25 1767890545

Well, for some reason it doesn't let me respond to the child comments :(

The problem (which should be obvious) is that with real-valued a and b you can't construct an exhaustive input/output set. A test case can only prove the presence of a bug, not its absence.

Another category of problems that you can't just test and have to prove is concurrency problems.

And so forth and so on.
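
To make the "presence, not absence" point concrete, here is a minimal sketch in Python. Everything in it is hypothetical: this `add_numbers` is deliberately given one bad input, and `run_spot_checks` is just an illustrative helper, not anything from the article or from a real test suite.

```python
def add_numbers(a, b):
    # Hypothetical buggy implementation: wrong for exactly one value of `a`.
    if a == 1234.5:
        return a - b
    return a + b


def run_spot_checks(fn):
    # A small, finite set of input/output pairs, as a typical unit test would use.
    cases = [
        ((1, 2), 3),
        ((0, 0), 0),
        ((-1.5, 1.5), 0.0),
        ((10, -3), 7),
    ]
    return all(fn(*args) == expected for args, expected in cases)


# Every spot check passes, yet the function is still wrong for an untested input.
print(run_spot_checks(add_numbers))        # True
print(add_numbers(1234.5, 1.0) == 1235.5)  # False
```

The spot checks say nothing about inputs they never exercised, which is all the argument needs. Property-based testing widens the net, but it still samples rather than enumerates.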

minimaxir · 2026-01-08T16:34:34 1767890074

Of course you can. You can write test cases for anything.

Even an add_numbers function can have bugs, e.g. you have to ensure the inputs are numbers. Most coding agents would catch this in loosely-typed languages.
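
A rough sketch of what that kind of test might look like in Python, assuming a hypothetical `add_numbers` that validates its inputs (the rejection of `bool` is an extra assumption made for illustration, not something claimed above):

```python
from numbers import Number


def add_numbers(a, b):
    # Hypothetical implementation that refuses non-numeric inputs.
    # bool is a subclass of int in Python, so it is rejected explicitly here.
    for value in (a, b):
        if isinstance(value, bool) or not isinstance(value, Number):
            raise TypeError(f"expected a number, got {type(value).__name__}")
    return a + b


def rejects_non_numbers():
    # Without the check above, "1" + "2" would silently return "12".
    try:
        add_numbers("1", "2")
    except TypeError:
        return True
    return False


print(add_numbers(2, 3))      # 5
print(rejects_non_numbers())  # True
```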

Snuggly73 · 2026-01-08T16:32:46 1767889966

I mean "have been bad" doesn't exclude "getting worse" right :)

tacoooooooo · 2025-12-30T17:45:05 1767116705

correct afaik :(

https://github.com/timescale/pgvectorscale/issues/113

tacoooooooo · 2025-12-30T17:43:07 1767116587

the main issue with pgvectorscale is that it's not available in RDS :(

omg2864 · 2025-12-30T18:27:18 1767119238

Yes, RDS seems to really hold PG back on AWS, with all the interesting pg extensions getting released now (pg_lake). It is a shame I can't move to other PG vendors because it is a pain in the ass to get all the privacy and legal docs in order.

coredog64 · 2025-12-31T17:38:23 1767202703

Technically, is there a reason AWS can't support allowing sophisticated users to run arbitrary extensions in RDS? The control-plane/data-plane boundaries should be robust enough that it's not going to allow an RDS extension to "hack AWS". Worst case is that AWS would have to account for the possibility of a crash backoff loop in RDS.

I understand that practically you can b0rk an install with a bunch of poorly configured extensions, and you can easily install something that hoovers up all your data and sends it to North Korea. But if I understand those risks and can mitigate them, why not allow RDS to load up extension binaries from an S3 bucket and call it a day?

If AWS wanted to broaden the available market, this would be an opportunity to leverage partners and the AWS marketplace mechanisms: Instead of AWS vouching for the extensions, allow partners to sell support in a marketplace. AWS has clean hands for the "My RDS instance crashed and wiped out my market cap" risk, but they can still wet their beak on the money flowing through to vendors. Meanwhile, vendors don't have to take full responsibility for the entire stack and mess with PrivateLink etc. Top tier vendors would also perform all the SOC attestation so that RDS doesn't lose out.

P.S. Andy, if you're reading this you should call me.

calderwoodra · 2025-12-30T20:13:36 1767125616

Yes, the InfoSec advantages of using RDS are very real, especially in B2B Enterprise SaaS.

mrinterweb · 2025-12-30T20:29:42 1767126582

I'm considering hosting a separate pg db just to be able to access certain extensions. I am interested in this extension as well as https://wiki.postgresql.org/wiki/Incremental_View_Maintenanc... (also not available on RDS). Then use logical replication for specific data source tables (guess it would need to be DMS).

tacoooooooo · 2025-12-30T17:37:09 1767116229

This probably doesn't count as an "app" in terms of what you're looking for, but was a fun little project

https://alexjacobs08.github.io/lobsters-graph/

(i built this in search of a lobste.rs invite if anyone willing and able sees this--email in my bio :)

rabf · 2025-12-30T18:32:03 1767119523

Nice site, runs very smoothly on Firefox.

tacoooooooo · 2025-12-29T16:18:34 1767025114

This looks awesome. ai-sdk is an excellent library. excited to see it proliferating

ishaksebsib · 2025-12-29T17:02:19 1767027739

Thanks!