Trying to ignore any hype, lofty sci-fi ideas, or potential philosophical questions for a moment: roughly speaking, this sounds like a search engine applied to a neat and thought-provoking use case.
There's an architecture diagram[1] alongside the source code, and my summary would be:
- The system has in-house web indexes built from Common Crawl[2] data
- The system receives snippets of text from Wikipedia and determines whether a citation already exists and whether it actually supports the claim
- If no valid citation exists, the system queries the indexes to find relevant URLs (rough sketch below)
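For concreteness, here's a minimal sketch of that flow as I read the diagram. The function names (verify_citation, search_index) are my own hypothetical stand-ins, not the project's actual API:

  # Sketch of the pipeline summarized above. verify_citation and
  # search_index are hypothetical stand-ins for SIDE's verification
  # model and its Common Crawl-derived retrieval indexes.
  def verify_citation(claim: str, url: str) -> bool:
      """Placeholder: fetch `url` and check whether it supports `claim`."""
      raise NotImplementedError

  def search_index(claim: str, top_k: int = 5) -> list[str]:
      """Placeholder: query the in-house web indexes for candidate URLs."""
      raise NotImplementedError

  def suggest_citations(claim: str, existing_urls: list[str]) -> list[str]:
      # Keep any existing citation that actually supports the claim.
      valid = [u for u in existing_urls if verify_citation(claim, u)]
      if valid:
          return valid
      # Otherwise, retrieve candidate URLs ranked by relevance.
      return search_index(claim, top_k=5)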
It'd be interesting to learn how this approach fares compared to simply pasting the relevant paragraphs into a search engine with -site:wikipedia.org appended, to exclude Wikipedia itself from the results.
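By that baseline I mean something along these lines (the paragraph text here is made up):

  # Strawman baseline: hand the paragraph to a general-purpose search
  # engine, excluding Wikipedia itself via the -site: operator.
  paragraph = "The bridge was closed for repairs in 1987."  # made-up example
  query = f"{paragraph} -site:wikipedia.org"
  print(query)  # paste into any engine that supports the -site: operator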
Something about feedback loops and data quality makes me wary that too much application of automated systems like this would degrade content quality over time, with each updated copy becoming an imperfect translation of, or reference to, an existing one.
>Building on Meta AI’s research and advancements, we’ve developed the first model capable of automatically scanning hundreds of thousands of citations at once to check whether they truly support the corresponding claims.
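For what it's worth, one natural way to frame that verification step is textual entailment: treat the cited passage as the premise and the Wikipedia claim as the hypothesis. Here's a sketch using the public roberta-large-mnli checkpoint; this framing is my assumption, not necessarily Meta's actual verifier:

  # Entailment-style check: does `evidence` support `claim`?
  # This is my own NLI framing, not the model the article describes.
  import torch
  from transformers import AutoModelForSequenceClassification, AutoTokenizer

  tok = AutoTokenizer.from_pretrained("roberta-large-mnli")
  model = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")

  def supports(evidence: str, claim: str) -> bool:
      inputs = tok(evidence, claim, return_tensors="pt", truncation=True)
      with torch.no_grad():
          logits = model(**inputs).logits
      # Label order for this checkpoint: 0=contradiction, 1=neutral, 2=entailment.
      return logits.softmax(dim=-1)[0, 2].item() > 0.5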
Please don't comment on whether someone read an article. "Did you even read the article? It mentions that" can be shortened to "The article mentions that."
[1] - https://github.com/facebookresearch/side/tree/a595fb09c85233...
[2] - https://commoncrawl.org/