Your “thoughts” are applying his solution to the wrong problem. As for “start small and stay small”, I’m not sure what that even means. Are you saying every service has to grow to some size or required amount of compute? LOL
An extra 100ms is nothing. I mean - are you trying to solve at Google or Amazon scale?
I run simple Lambdas that read from some SNS topics, apply some transforms, add metadata to the message, and route it somewhere else. I get bursts of traffic at specific peak times. That’s the use case and it works well. The annoying part is CloudFormation templates, but that’s another topic.
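For context, the shape of it is roughly this (a rough sketch, not my actual code - the transform, the DEST_TOPIC_ARN env var, and the assumption that the message body is JSON are all placeholders):

```python
import json
import os

import boto3

# Client and config created once at module scope so warm invocations reuse them.
# DEST_TOPIC_ARN is a hypothetical env var; in my setup this kind of thing
# comes out of the CloudFormation template.
sns = boto3.client("sns")
DEST_TOPIC_ARN = os.environ["DEST_TOPIC_ARN"]


def handler(event, context):
    # SNS invokes Lambda with a Records list, so iterate over it.
    for record in event["Records"]:
        # Assumes the SNS message body is JSON - placeholder assumption.
        message = json.loads(record["Sns"]["Message"])

        # "Transform" stand-in: enrich the message with routing metadata.
        message["source_topic"] = record["Sns"]["TopicArn"]
        message["received_at"] = record["Sns"]["Timestamp"]

        # Route the enriched message to the next hop.
        sns.publish(TopicArn=DEST_TOPIC_ARN, Message=json.dumps(message))
```

The whole thing is stateless, which is exactly why the bursty peak-time traffic just fans out across concurrent invocations without me doing anything.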
You’re making some bizarre assumptions. Not everything is front-end and user-facing.
Let’s say I’m processing messages off a queue. A p90 of 50ms vs a p90 of 100ms doesn’t necessarily make a difference. What are my downstream dependencies? What difference does it make to them?
At the end of the day, value is what you care about - not chasing a metric just because lower is always better. What’s the cost of another x milliseconds of latency considering the other trade-offs (ongoing operational burden, extensibility, simplicity, scalability, ease of support, etc.)?
If an extra 50ms of latency buys me a solution that auto-scales to massive spikes in traffic from seasonality or time of day, versus shaving that latency but having to spend time capacity planning hardware and potentially holding extra capacity “just in case”, then again, optimizing for a single metric is pointless.