I wonder if this segment is ready for disruption. Splunk is very expensive; Elasticsearch still lacks many of Splunk's features and is also pricey when hosted on AWS; SumoLogic was acquired by private equity, which means it won't get cheaper; and Datadog is expensive too.
A solution like Snowflake for logs/telemetry, where compute and storage are separated, might be the future.
We're[1] building the OSS equivalent of the observability side of Splunk/DD, on ClickHouse naturally, and we believe in the same end goal of lowering cost via separation of compute and storage.
We’re also giving this a shot. The annual Splunk bill at our last startup exploded from $10k to $1M when we reached 1TB of logs generated per day, which is actually an easy threshold to hit when you have decent traction and aren’t proactively reducing logs. So we built Scanner.dev to drop these costs by 10x.
Decoupling compute and storage is definitely the way to go. We’re using Lambda functions and ECS Fargate containers for compute that scales up and down rapidly, and S3 for storage. We're getting ~1TB/sec log scan speeds, which feels fairly good. We keep sparse indices in S3 to narrow down the regions of logs to scan. E.g. if you’re searching for an IP address that appears 10 times in a 25TB log set, the indices reduce the search space to around 300MB. That query completes in a few seconds, whereas Athena and CloudWatch take something like 20 minutes.
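To make the sparse-index idea concrete, here's a toy sketch in Rust under invented assumptions (the chunk size, tokenizer, and data are all made up, it indexes an in-memory buffer rather than S3 objects, and it ignores tokens that span chunk boundaries): build a token-to-chunk-ids map, then scan only the chunks the index points at.

    use std::collections::{HashMap, HashSet};

    // Tiny chunk size for the demo; a real index would cover MB-scale
    // regions of S3 objects and handle tokens spanning chunk edges.
    const CHUNK_SIZE: usize = 64;

    // Map each token to the set of chunk ids it appears in.
    fn build_index(log: &[u8]) -> HashMap<Vec<u8>, HashSet<usize>> {
        let mut index: HashMap<Vec<u8>, HashSet<usize>> = HashMap::new();
        for (id, chunk) in log.chunks(CHUNK_SIZE).enumerate() {
            // Naive tokenizer: split on anything that isn't alphanumeric or '.'
            for tok in chunk.split(|&b| !(b.is_ascii_alphanumeric() || b == b'.')) {
                if !tok.is_empty() {
                    index.entry(tok.to_vec()).or_default().insert(id);
                }
            }
        }
        index
    }

    fn main() {
        let log: &[u8] = b"10.0.0.1 GET /index\n10.0.0.2 POST /login\n10.0.0.1 GET /health\n";
        let index = build_index(log);
        // Query: only scan the chunks the index says contain the needle,
        // instead of reading the entire log set.
        if let Some(ids) = index.get(b"10.0.0.2".as_slice()) {
            for &id in ids {
                let start = id * CHUNK_SIZE;
                let end = (start + CHUNK_SIZE).min(log.len());
                println!("scan bytes {start}..{end} of the log");
            }
        }
    }

The point of the shape is that the index is tiny relative to the raw logs, so a rare needle prunes almost all of the data before any scanning happens.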
We’re also using Rust to maximize memory efficiency and speed; there are lots of great SIMD-optimized string search and regex libraries on crates.io.
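For instance, here's a minimal sketch using the memchr crate's memmem module, one of those SIMD-accelerated substring search libraries (the log line and IP here are invented, and this isn't necessarily how Scanner itself wires it up):

    // Cargo.toml: memchr = "2"
    use memchr::memmem;

    fn main() {
        // A precompiled Finder can be reused cheaply across many log chunks.
        let finder = memmem::Finder::new("203.0.113.7");
        let chunk: &[u8] = b"203.0.113.7 GET /login 200\n198.51.100.9 GET /health 200\n";
        for pos in finder.find_iter(chunk) {
            println!("match at byte offset {pos}");
        }
    }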
We’re early, so there are a lot of SIEM features, like detection rules, that we’re still building. But Splunk/Datadog users might find it useful if cost is a problem and they mostly use log search.