pegasus | 61 days ago | on: Reasoning models reason well, until they don't
The bigger problem is that the benchmarks and multiple-choice tests these models are trained to optimize for don't distinguish between a wrong answer and "I don't know", which is both stupid and surprising. There was a thread here on HN about this recently.
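
A minimal sketch of the scoring gap being described, in Python. The abstention token, function names, and the penalty value are illustrative assumptions, not taken from any particular benchmark:

  # Under plain accuracy, abstaining ("I don't know") scores exactly the
  # same as answering wrong, so a model gains nothing by admitting
  # uncertainty. A negative-marking variant separates the two cases.

  ABSTAIN = "I don't know"  # hypothetical abstention token

  def accuracy(predictions, answers):
      """Standard scoring: abstention and a wrong answer both score 0."""
      return sum(p == a for p, a in zip(predictions, answers)) / len(answers)

  def penalized_score(predictions, answers, wrong_penalty=1.0):
      """Negative marking: correct = +1, abstain = 0, wrong = -penalty."""
      total = 0.0
      for p, a in zip(predictions, answers):
          if p == ABSTAIN:
              continue  # abstention is neutral, neither rewarded nor punished
          total += 1.0 if p == a else -wrong_penalty
      return total / len(answers)

  answers   = ["B", "C", "A", "D"]
  guesser   = ["B", "C", "D", "A"]        # guesses the last two, both wrong
  abstainer = ["B", "C", ABSTAIN, ABSTAIN]

  print(accuracy(guesser, answers), accuracy(abstainer, answers))
  # 0.5 0.5  -- plain accuracy can't tell the two strategies apart
  print(penalized_score(guesser, answers), penalized_score(abstainer, answers))
  # 0.0 0.5  -- negative marking rewards the model that abstained

Under the first metric, guessing dominates: a wrong guess costs nothing relative to abstaining, so training against it pushes models to always answer.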