
Not my words but Don Knuth's, from his lecture "Fun With Binary Decision Diagrams" at Stanford in 2008: https://news.ycombinator.com/item?id=25342643

it's really amazing what you can do with it.



This refers to the submitted title ("BDD, the only fundamentally new data structure developed in the last 35 years") which we've since reverted. If you want to add this sort of gloss to a submission, please do it in the comments. That way you're in keeping with the site guidelines (https://news.ycombinator.com/newsguidelines.html), and your addition will be on a level playing field with others'.

On another note: what are some amazing things you can do with it?


I wouldn't do it justice [1]; it deserves a blog post or a book [2]. But in general: solving almost any boolean algebra problem, in hundreds or even hundreds of thousands of variables, very efficiently and with a very compact representation (via ordered sparse bit vectors [3]).

While with Karnaugh maps you can simplify a boolean function of a few variables, BDDs can handle thousands, fast.

And there's no need to explain the applications of high-dimensional boolean algebra on HN: with it you can solve any logic problem expressed as a boolean function, any combinatorial problem (except those with nasty functions), and classic graph theory problems.

Surprisingly, it lets you solve optimization problems [4] very efficiently, like boolean programming, SAT solving, maximum independent set or max cut in graphs; it can be used for something like belief propagation or lattice induction for inference; and if that's not enough you can use it for random number generation, lossless compression, perfect hashing, etc., etc.

I haven't seen such a versatile data structure elsewhere. Most of the other things developed in the last 35 years, like the ones in the comment below, solve special cases; the BDD is truly one of the most fundamental and severely underrated "swiss army knives" (in CS, that is; EE people know it very well from logic synthesis and verification, BDD's first "killer app").

It's probably easier to list what you can't do with a BDD, kind of like listing what you can't do with (high-dimensional) boolean logic.

I think it slipped under the radar of the CompSci community at large because it was too quickly siloed into "that circuit analysis/verification tool used by electrical engineers".

Yes, it's "just" a DAG but with very particular (and very simple) constraints, which allow it to solve an infinite variety of problems in a very elegant and surprising way [5].
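
To make those "very simple constraints" concrete, here's a rough sketch I wrote (mine, not Knuth's or Bryant's code): a reduced ordered BDD where nodes are hash-consed so identical subgraphs are shared, plus model counting, which is the kind of thing that falls out almost for free once the diagram exists.

    # Minimal reduced ordered BDD sketch (illustrative toy, Python).
    # A node is (var, low, high); terminals are the booleans False/True.
    # Nodes are hash-consed in `unique`, so equal subgraphs become the same object.

    unique = {}

    def mk(var, low, high):
        if low is high:                      # reduction rule: redundant test
            return low
        key = (var, id(low), id(high))
        if key not in unique:
            unique[key] = (var, low, high)
        return unique[key]

    def build(f, n, i=0, env=()):
        # Shannon expansion of f(x0..x_{n-1}) on variable i
        if i == n:
            return bool(f(*env))
        return mk(i, build(f, n, i + 1, env + (False,)),
                     build(f, n, i + 1, env + (True,)))

    def count(node, i, n):
        # number of satisfying assignments over variables i..n-1
        if node is False:
            return 0
        if node is True:
            return 2 ** (n - i)
        var, low, high = node
        skipped = 2 ** (var - i)             # variables the node jumps over are free
        return skipped * (count(low, var + 1, n) + count(high, var + 1, n))

    maj = build(lambda a, b, c: (a and b) or (a and c) or (b and c), 3)
    print(len(unique), count(maj, 0, 3))     # -> 4 4: four shared nodes, four models

Real packages (CUDD, Knuth's own BDD programs) build diagrams with an "apply" operation that combines two existing BDDs instead of enumerating assignments like this toy does; that's where the efficiency actually comes from.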

[1] It's really worth watching the lecture on BDDs by Don Knuth, starting around 13:32; his enthusiasm is contagious: https://www.youtube.com/watch?v=SQE21efsf7Y&t=13m32s

Part 2 on ZDD: https://www.youtube.com/watch?v=-HzQYeqS9Wc

[2] TAOCP, volume 4A, Combinatorial Algorithms, pp. 202-280: https://www.amazon.com/Art-Computer-Programming-Combinatoria...

There is a free preprint here: https://www-cs-faculty.stanford.edu/~knuth/fasc1b.ps.gz

[3] https://en.wikipedia.org/wiki/Zero-suppressed_decision_diagr...

[4] Bergman, David, et al. "Discrete optimization with decision diagrams." INFORMS Journal on Computing 28.1 (2016): 47-66.

[5] Bryant, Randal E. "Graph-based algorithms for Boolean function manipulation." IEEE Transactions on Computers C-35.8 (1986): 677-691.

I hope that answers your question, dang, and sorry for title mishap.


Just watching through the Knuth lecture now. 40 minutes in (~54m mark) and I learned something new (to me) and useful and cool (randomly picking maximal independent sets of graphs). And I'm only halfway through. Very nice; thanks for the recommendation :)


For perspective, here are some examples of other data structures in the past 35 years that are fundamentally new. (I am only including data structures that differ significantly from their predecessors)

B-epsilon trees: These allow asymptotic speedups for insert/update/delete operations on search trees in external memory. (Introduced in 2002 by the paper Lower Bounds for External Memory Dictionaries.)

Cache-Oblivious B-Trees: This is an external-memory search tree that exhibits optimal behavior on a cache with any (possibly unknown) cache-size and cache-line-size parameters. (Introduced in 2000 by the paper Cache-Oblivious B-Trees.)

Fusion trees: This allows for search operations in a small binary tree (i.e., a tree whose size is polynomial in the machine word size) to be performed in constant time, rather than logarithmic time. (Introduced in 1990 by the paper Blasting through the Information Theoretic Barrier with Fusion Trees.)

Cuckoo hashing: This is a hash table design introduced in 2001. In 2009, the paper De-amortized Cuckoo Hashing showed how to make all operations in Cuckoo hashing take truly constant time with high probability. This remains (as far as I know) the only known technique for guaranteeing constant-time operation for a hash table without the use of bit manipulation tricks or the method of four Russians.
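
To give a feel for how simple the basic scheme is, here's a rough sketch of plain (non-de-amortized) cuckoo hashing, with toy hash functions and a naive grow-and-rehash policy of my own choosing, nothing from the papers:

    # Cuckoo hashing sketch: two tables, two hash functions.
    # Lookup probes exactly two slots; insert evicts ("kicks") occupants
    # back and forth, rebuilding with fresh seeds if it runs too long.

    import random

    class CuckooHash:
        def __init__(self, size=16):
            self.size = size
            self.tables = [[None] * size, [None] * size]
            self.seeds = [random.random(), random.random()]

        def _slot(self, t, key):
            return hash((self.seeds[t], key)) % self.size

        def lookup(self, key):
            return any(self.tables[t][self._slot(t, key)] == key for t in (0, 1))

        def insert(self, key, max_kicks=32):
            if self.lookup(key):
                return
            t = 0
            for _ in range(max_kicks):
                i = self._slot(t, key)
                if self.tables[t][i] is None:
                    self.tables[t][i] = key
                    return
                # slot taken: put key here, carry the evicted key to the other table
                key, self.tables[t][i] = self.tables[t][i], key
                t = 1 - t
            self._rehash(key)                # probable cycle: new seeds, bigger tables

        def _rehash(self, pending):
            old = [k for tbl in self.tables for k in tbl if k is not None] + [pending]
            self.__init__(self.size * 2)
            for k in old:
                self.insert(k)

    h = CuckooHash()
    for k in range(10):
        h.insert(k)
    print(h.lookup(3), h.lookup(42))         # -> True False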

These are just examples off the top of my head. I'm sure there are many more.


Also, I would say the whole concept of https://en.wikipedia.org/wiki/Succinct_data_structure is fundamentally new. (And looking at citations, they are not yet 35 years old.)
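
A small example of the flavour: the classic rank query on a bitvector, where you store cumulative popcounts for fixed-size blocks so a query never rescans the whole vector. (Toy Python of my own; real succinct structures measure the overhead in bits and add a second level of sub-blocks.)

    # Rank over a bitvector with precomputed per-block popcounts.

    BLOCK = 64

    class RankBits:
        def __init__(self, bits):
            self.bits = bits
            self.blocks = [0]                      # cumulative 1s before each block
            for i in range(0, len(bits), BLOCK):
                self.blocks.append(self.blocks[-1] + sum(bits[i:i + BLOCK]))

        def rank1(self, i):
            # number of 1s in bits[0:i]
            b = i // BLOCK
            return self.blocks[b] + sum(self.bits[b * BLOCK:i])

    rb = RankBits([1, 0, 1, 1, 0, 1] * 100)
    print(rb.rank1(7))                             # -> 5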


I agree. Related to this, sketches (i.e., small data structures that can be used to recover information about large data sets) are also fundamentally new (and most of the techniques are from the 21st century).
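
One of the best-known examples is the count-min sketch; here's a rough toy version (my own code, just to show the shape of the idea: d rows of w counters, with estimates biased upward by collisions):

    # Count-min sketch: add() increments one counter per row,
    # estimate() takes the minimum across rows (an upper bound on the true count).

    import random

    class CountMin:
        def __init__(self, w=272, d=5):
            self.w, self.d = w, d
            self.seeds = [random.random() for _ in range(d)]
            self.rows = [[0] * w for _ in range(d)]

        def add(self, item, count=1):
            for row, seed in zip(self.rows, self.seeds):
                row[hash((seed, item)) % self.w] += count

        def estimate(self, item):
            return min(row[hash((seed, item)) % self.w]
                       for row, seed in zip(self.rows, self.seeds))

    cm = CountMin()
    for word in "to be or not to be".split():
        cm.add(word)
    print(cm.estimate("to"), cm.estimate("xyzzy"))   # -> 2 0 (with high probability)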


Confluent/persistent data structures, which I believe were novel research by Chris Okasaki in the late '90s (at least that's when I first heard of it -- I was in CS academia at the time).

Edit to add: I’m not a Clojure expert, but I believe you could trace a direct line from Okasaki’s doctoral thesis to Clojure’s core immutable data structures. But maybe I’m wrong! Would love to hear any other takes on this or links to other research strands I’m not aware of.


The term 'persistent data structure' goes back to papers by Tarjan, et al in the 80s but the path copying technique is much older. My favorite technique from one of those papers, the in-place v2 trick, is an alternative to path copying that achieves amortized O(1) space per update as opposed to O(depth) space per update for path copying; this only gives you partial persistence (linear history like an MVCC database system, not branching history like a purely functional data structure or Git) but often that's all you need. (There's also an extension of the trick to full persistence but it's not very practical.)
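
For reference, path copying itself is tiny. Here's a sketch of a persistent BST insert (toy code, not from any of those papers): each new version copies only the nodes on the search path and shares everything else with the old version.

    # Path copying on a binary search tree: insert returns a brand-new root,
    # copying O(depth) nodes; all untouched subtrees are shared between versions.

    class Node:
        __slots__ = ("key", "left", "right")
        def __init__(self, key, left=None, right=None):
            self.key, self.left, self.right = key, left, right

    def insert(node, key):
        if node is None:
            return Node(key)
        if key < node.key:
            return Node(node.key, insert(node.left, key), node.right)
        if key > node.key:
            return Node(node.key, node.left, insert(node.right, key))
        return node                              # already present: share the subtree

    v1 = None
    for k in (5, 2, 8):
        v1 = insert(v1, k)
    v2 = insert(v1, 9)                           # new version; v1 is untouched
    print(v1.right.right, v2.right.right.key)    # -> None 9
    print(v2.left is v1.left)                    # -> True: left subtree is shared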

> I’m not a Clojure expert, but I believe you could trace a direct line to Clojure’s core immutable data structures from Okasaki’s doctoral thesis.

I believe the direct inspiration for Clojure's persistent vectors and maps was Bagwell's papers on HAMTs, which weren't so much a new data structure as an example of data structure engineering. Okasaki's thesis has several contributions, but its main theme is how you can apply amortization in the purely functional setting if you have laziness or memoization, which points in a very different direction.
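
As I understand Bagwell's papers, the core trick in a HAMT node is a bitmap plus popcount used to index into a compressed child array, roughly like this (toy illustration, not Clojure's actual implementation):

    # HAMT child indexing: a 32-bit bitmap says which of the 32 logical children
    # exist; the popcount of the lower bits gives the position in a dense array.

    def child_slot(bitmap, hash_chunk):
        # hash_chunk is a 5-bit slice of the key's hash (0..31)
        bit = 1 << hash_chunk
        present = bool(bitmap & bit)
        index = bin(bitmap & (bit - 1)).count("1")   # popcount of bits below
        return present, index

    # bitmap 0b101010: children exist at logical positions 1, 3 and 5
    print(child_slot(0b101010, 3))   # -> (True, 1): stored as the 2nd dense child
    print(child_slot(0b101010, 2))   # -> (False, 1): would be inserted at index 1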


Thank you! So I was looking at it from a purely functional point of view, without accounting for the earlier non-pure work in the literature. That makes sense; I was deep in the early Haskell world at the time.


Suffix Arrays: with their now linear-time construction they allow for very efficient pattern matching, indexing, and more efficient compression algorithms. (Introduced around 1990: https://en.wikipedia.org/wiki/Suffix_array)
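
A naive sketch (nowhere near the linear-time constructions, and quadratic-ish because of the slicing, but it shows why pattern matching afterwards is just binary search over sorted suffixes):

    # Naive suffix array: sort suffix start positions by the suffix text,
    # then locate a pattern by binary search over the sorted suffixes.

    from bisect import bisect_left, bisect_right

    def suffix_array(s):
        return sorted(range(len(s)), key=lambda i: s[i:])

    def occurrences(s, sa, pattern):
        suffixes = [s[i:] for i in sa]                 # materialized for clarity only
        lo = bisect_left(suffixes, pattern)
        hi = bisect_right(suffixes, pattern + "\xff")  # assumes plain ASCII text
        return sorted(sa[lo:hi])

    s = "banana"
    sa = suffix_array(s)              # [5, 3, 1, 0, 4, 2]
    print(occurrences(s, sa, "ana"))  # -> [1, 3]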


It's an important theoretical result, but the optimized library almost everyone uses (libdivsufsort) is based on an O(n log n) time algorithm; I haven't seen any competitive implementations of the linear-time algorithms.


Another example is locality sensitive hashing (introduced in late 1990s), which can be used as a data structure to efficiently solve the nearest-neighbor problem.
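
A rough sketch of the random-hyperplane flavour for cosine similarity (toy code of mine, not from the original papers): each vector gets a short bit signature, and vectors landing in the same bucket are likely to be close, so you only compare against the bucket instead of the whole data set.

    # Random-hyperplane LSH: the sign pattern of dot products with random
    # hyperplanes is the hash; similar vectors tend to share a bucket.

    import random
    from collections import defaultdict

    def make_hyperplanes(dim, bits, seed=0):
        rng = random.Random(seed)
        return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(bits)]

    def signature(vec, planes):
        return tuple(sum(p * x for p, x in zip(plane, vec)) >= 0 for plane in planes)

    planes = make_hyperplanes(dim=3, bits=8)
    buckets = defaultdict(list)
    data = {"a": [1.0, 0.9, 0.1], "b": [0.9, 1.0, 0.0], "c": [-1.0, 0.1, 0.9]}
    for name, vec in data.items():
        buckets[signature(vec, planes)].append(name)

    query = [1.0, 1.0, 0.05]
    # with high probability the query's bucket holds 'a' and 'b' but not 'c'
    print(buckets[signature(query, planes)])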


There are Judy arrays too; I'm not sure whether they're already covered by one of those data structures.


The Cache-Oblivious B-Tree is a variant of the B-Tree. The title says "fundamentally new".


No, cache-oblivious B-trees are not a variant of B-trees (although I can see why you might think the name suggests that). I classify them as fundamentally new because the technique they introduce is fundamentally different from past data structures. (Their fractal-like layout allows them to achieve optimal behavior on an arbitrarily parameterized cache.)
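
If it helps make "fractal-like" concrete, here's a toy sketch of the layout (one common convention: split the tree at half its height, lay out the top recursively, then each bottom subtree recursively, so nearby nodes stay contiguous no matter what the cache-line size turns out to be):

    # Van Emde Boas layout sketch: emit the BFS indices of a complete binary
    # tree in an order that keeps root-to-leaf paths in few contiguous chunks
    # for a cache of any (unknown) line size.

    def veb_order(root=0, height=4):
        if height == 1:
            return [root]
        top_h = height // 2
        bottom_h = height - top_h
        order = veb_order(root, top_h)                  # top recursive subtree
        first_leaf = (root + 1) * 2 ** (top_h - 1) - 1  # leftmost leaf of the top
        for leaf in range(first_leaf, first_leaf + 2 ** (top_h - 1)):
            order += veb_order(2 * leaf + 1, bottom_h)  # left bottom subtree
            order += veb_order(2 * leaf + 2, bottom_h)  # right bottom subtree
        return order

    # 15-node tree: top triangle {0,1,2}, then four bottom triangles
    print(veb_order(0, 4))   # -> [0, 1, 2, 3, 7, 8, 4, 9, 10, 5, 11, 12, 6, 13, 14]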


You remember the dates of when these data structures were first published off the top of your head? Are you a computer science professor?


A quick google shows that the OP is indeed in academia.


Oh. Well that's pretty impressive. I typically don't look up people's usernames on the site, but I guess I should more often.



