You are wrong. There have been many reproductions. People don't study it because there is no known mechanism of action, and so it's fringe.
Jessica Utts, a well-respected statistician:
> Despite Professor Hyman's continued protests about parapsychology lacking repeatability, I have never seen a skeptic attempt to perform an experiment with enough trials to even come close to insuring success. The parapsychologists who have recently been willing to take on this challenge have indeed found success in their experiments, as described in my original report.
Before you can define statistical significance, you have to clearly define the success criteria. From what I see, remote viewing produces vague results, so some amount of human interpretation is necessary. What counts as a "hit"? If you look at "verified" examples from the social-rv site GP mentioned, some of them match only in an abstract sense, but are still counted as a success. The more reliable thing would be to remote view a coin flip and have the person say heads or tails, but that's not how the Stargate experiments were defined and I haven't been able to find any trials like this.
Edit: Actually I did find at least one experiment-ish thing, though it's more precognition than remote viewing: predicting crypto coin price trends [1]. It seems to show 53 correct predictions and 50 incorrect, which is well within statistical chance.
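For what it's worth, a quick sanity check of that (a back-of-the-envelope sketch that assumes each prediction is an independent 50/50 trial, which is my assumption, not something stated in [1]):

    # Is 53 correct out of 103 predictions distinguishable from coin-flip guessing?
    from scipy.stats import binomtest

    result = binomtest(k=53, n=103, p=0.5, alternative="greater")
    print(result.pvalue)  # ~0.4, nowhere near conventional significance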
Also, it seems the social-rv site GP linked will eventually have a remote-viewing prediction-market type thing for real-world events. Now that's interesting, and they cleverly avoid it devolving into a traditional prediction market by introducing indirection: two images are arbitrarily assigned to the outcomes (true/false) and the person RVs an image, without knowledge of which outcome that image represents.
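A toy sketch of how I read that indirection (my own illustration, not their code; the image names are made up):

    # Bind two target images to the binary outcome at random; the viewer only ever
    # describes an image and never learns the mapping until the event resolves.
    import random

    def assign_targets(image_a: str, image_b: str) -> dict[bool, str]:
        if random.random() < 0.5:
            return {True: image_a, False: image_b}
        return {True: image_b, False: image_a}

    mapping = assign_targets("target_001.jpg", "target_002.jpg")
    # The viewer's description is later judged against mapping[actual_outcome]
    # versus the decoy image, so the market question never leaks into the session.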
No, she isn't. She's a statistician, but mostly known for being on the review panel for Star Gate, and for close associations with parapsychology organizations.
She was already involved in parapsychology, having coauthored papers with the director of Star Gate (a parapsychologist himself) before becoming part of the review panel! You cannot have vested interests in the phenomenon being real if you're going to judge it impartially. You cannot have a relationship with one of the key personnel in the project you're reviewing, and especially not a relationship specifically about the same kind of things you're supposed to review. This is a serious flaw; she shouldn't have been part of the panel.
> There have been many reproductions
Like which ones? A reproduction must be done independently, by scientists without the same sponsors and vested interests. Can you point to these reproductions?
By the way, Star Gate was canceled with the conclusion that the experiments were inconclusive. Had there been reproductions, surely the conclusions would have been different?
If you consider the extent to which our economy has become financialized, then you see these decisions have little to do with providing a product for customers, but rather with providing a stock for investors.
Bari wisely points out that if the deportees are being tortured, then there must be a secretly good reason for it if you dig a little deeper. Suggests asking Stephen Miller.
General purpose LLMs aren't very good with generating bounding boxes, so with that context, this is actually seen as decent performance for certain use cases.
Yeah, that's bothered me as well. Andrej Karpathy does this all the time when he talks about the human brain and making analogies to LLMs. He makes speculative statements about how the human brain works as though it's established fact.
Andrej does use biological examples, but he's a lot more cautious about biomimicry, and often uses biological examples to show why AI and bio are different. Like he doesn't believe that animals use classical RL, because a baby horse can walk after 5 minutes, which definitely wasn't achieved through classical RL. He doesn't pretend to know how a horse developed that ability, just that it's not classical RL.
A lot of Ilya's takes in this interview felt like more of a stretch. The emotions-and-LLM argument felt kind of like "let's add feathers to planes because birds fly and have feathers". I bet continual learning is going to have some kind of internal goal beyond RL eval functions, but these speculations about emotions just feel like college dorm discussions.
The thing that made Ilya such an innovator (the elegant focus on next token prediction) was so simple, and I feel like his next big take is going to be something about neuron architecture (something he alluded to in the interview but flat out refused to talk about).
I also tried that in the past with poor results. I just tried it this morning with nano banana pro and it nailed it with a very short prompt: "Repaint the house white with black trim. Do not paint over brick."
Fun vacuum tube history fact: the humble vacuum tube actually traces its origins back to Edison’s incandescent light bulbs. Those early bulbs would mysteriously blacken over time, and for years nobody could figure out why. It wasn’t until 1904 that John Ambrose Fleming connected the dots — the darkening came from metal burned off the filament, and in studying it, he created the first true vacuum tube. So the vacuum tube, the heart of early electronics, was born from the same simple light bulb that first lit our homes.
I noticed the tags could be improved by implementing some kind of standardization. You may be interested in this post on HN from a month ago: https://news.ycombinator.com/item?id=45571423. The comments also have some useful recommendations.
I just finished working on a consulting project that involved tagging corrective and preventive actions (CAPAs) for a lab to help them organize some of their QA efforts. Using LLMs to tag free-form text is a common task, and I thought it would be fun to experiment with different strategies for improving the consistency of tags. The article above presents a good approach because it's a streaming solution, but it does come with drawbacks (more overhead to set up, and it treats older data differently than new data). Commenters recommend a batch approach: collect all the text up front, use various strategies to cluster it and generate tags, then use an LLM to assign tags, giving it the predefined tags in its prompt (rough sketch below). Once you have enough good tags you could train your own smaller model to generate tags. The batch methods have lower overhead, but take more time for tweaking and experimenting on your specific dataset.
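As a rough illustration of that last step (purely a sketch: the tag vocabulary and prompt wording are made up, and I'm just using the OpenAI client as an example):

    # Ask an LLM to tag a document using only a fixed, predefined tag vocabulary.
    # Tag list, prompt, and model choice here are illustrative, not from my project.
    from openai import OpenAI

    client = OpenAI()
    TAGS = ["calibration", "documentation", "training", "equipment"]  # hypothetical vocabulary

    def tag_text(text: str) -> list[str]:
        prompt = (
            "Assign one or more tags to the text below. "
            f"Reply with a comma-separated list using only these tags: {', '.join(TAGS)}.\n\n"
            f"{text}"
        )
        resp = client.chat.completions.create(
            model="gpt-5-mini",
            messages=[{"role": "user", "content": prompt}],
        )
        raw = resp.choices[0].message.content
        return [t.strip() for t in raw.split(",") if t.strip() in TAGS]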
For generating embeddings, I used Cohere's v4 embedder. I found that using HDBSCAN for clustering the embeddings of the tags was much more helpful than using K-means. I also learned that training a PyTorch MLP to predict multiple tags was superior in every aspect to training a tree-based model, and it gives very good precision, but just OK recall due to the difficulty of getting all the tags right. I also compared gpt-5-mini and claude-haiku-4.5 for generating tags. Gpt-5-mini was much slower, but cheaper and better at generating good tags. Claude-haiku-4.5 was not far behind and was much faster due to the absence of thinking tokens, but much more expensive. The metric I used to compare the LLMs on their raw tagging ability was scikit-learn's homogeneity_score.
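In case it helps, the clustering and scoring pieces look roughly like this (a sketch with toy data; it assumes you've already computed tag embeddings and have some hand-labeled examples to score against):

    # Cluster tag embeddings with HDBSCAN (density-based, no k to choose),
    # then score a model's tag assignments against reference labels.
    # The file name and toy labels below are placeholders, not real project data.
    import numpy as np
    from sklearn.cluster import HDBSCAN
    from sklearn.metrics import homogeneity_score

    tag_embeddings = np.load("tag_embeddings.npy")       # e.g. Cohere v4 vectors, one per tag
    clusterer = HDBSCAN(min_cluster_size=5)
    cluster_ids = clusterer.fit_predict(tag_embeddings)  # -1 marks noise points

    reference_tags = ["calibration", "calibration", "training", "equipment"]    # toy ground truth
    model_tags = ["calibration", "documentation", "training", "equipment"]      # toy LLM output
    print(homogeneity_score(reference_tags, model_tags))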
> "a ceiling on rents reduces the quality and quantity of housing"
To address quality first, most economists would agree that landlords are incentivized to invest the bare minimum they can into their property; this is not so much a function of income from rent. If a tenant feels generous and starts paying more rent, the landlord will not invest more into their unit. So I find the inverse of that to be an assumption that doesn't completely add up.
Saying rent control will affect quantity is completely beside the point. Rent controls are meant to ease the financial burden on the people currently renting in NYC, not a hypothetical newcomer looking for an apartment. Housing is already a huge pain to find for lower-income New Yorkers, so the threat of more scarcity doesn't really change the equation for a lot of people.
I think that goes without saying. The real question is what's the line between neutrality and letting a vocal minority dictate editorial decisions? Especially when the vocal minority has biased incentives towards making those changes.