We built something very similar back in 2016, in the jvm with unsafe memory and garbage-free data structures to avoid GC pauses.
The dynamic clustering is not too hard, are you able to dynamically undo a cluster when new information shows up?
Are you running separate instances per customer to separate the information they have access to?
Assuming by undoing you mean splitting the cluster:
A linked list can be split in two in O(1). When it comes to updating the roots for all the removed nodes, there is no easy way out, but luckily:
- This process can be parallelized.
- It could be done just once for multiple clustering changes.
- This is a multi-level disjoint set, not all the levels or sub-clusters are usually affected. Upper level clustering, which is based on lower confidence level, can be rebuilt more easily.
If by undoing you mean reverting the changes, we don’t use a persistent data structure. When we need historical clustering, we use a patched forest with concurrent hash maps to track the changes, and then apply or throw them away.
We use a single instance for all clients, but when one CFD server processes new block data, it becomes fully blocked for read access. To solve this, we built a smart load balancer that redirects user requests to a secondary CFD server. This ensures there's always at least two servers running, and more if we need additional throughput.
Are you running separate instances per customer to separate the information they have access to?