We built something very similar back in 2016, in the jvm with unsafe memory and ...

caudena · 2025-06-19T13:45:42 1750340742

Assuming by undoing you mean splitting the cluster:

A linked list can be split in two in O(1). When it comes to updating the roots for all the removed nodes, there is no easy way out, but luckily:

- This process can be parallelized.

- It could be done just once for multiple clustering changes.

- This is a multi-level disjoint set, not all the levels or sub-clusters are usually affected. Upper level clustering, which is based on lower confidence level, can be rebuilt more easily.

If by undoing you mean reverting the changes, we don’t use a persistent data structure. When we need historical clustering, we use a patched forest with concurrent hash maps to track the changes, and then apply or throw them away.

We use a single instance for all clients, but when one CFD server processes new block data, it becomes fully blocked for read access. To solve this, we built a smart load balancer that redirects user requests to a secondary CFD server. This ensures there's always at least two servers running, and more if we need additional throughput.