Hacker News | mpercival531's comments

> Is App Engine still preferable purely for cost predictability?

How is Cloud Run cost not predictable?

It is fairly simple arithmetic in a spreadsheet to estimate the upper bound: max instances × per-instance resources × unit prices. Exactly the same as you would do with the App Engine standard and flexible environments.
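That spreadsheet arithmetic can be sketched in a few lines. The unit prices below are illustrative placeholders, not current GCP list prices; plug in the rates for your region.

```python
# Worst-case monthly Cloud Run bill: every allowed instance runs
# flat out, all month. Prices here are placeholders, not real rates.

SECONDS_PER_MONTH = 30 * 24 * 3600

def cloud_run_upper_bound(max_instances, vcpu_per_instance, gib_per_instance,
                          price_per_vcpu_second=0.000024,
                          price_per_gib_second=0.0000025):
    cpu_cost = (max_instances * vcpu_per_instance
                * SECONDS_PER_MONTH * price_per_vcpu_second)
    mem_cost = (max_instances * gib_per_instance
                * SECONDS_PER_MONTH * price_per_gib_second)
    return cpu_cost + mem_cost

# e.g. capped at 5 instances of 1 vCPU / 512 MiB:
print(round(cloud_run_upper_bound(5, 1, 0.5), 2))  # 327.24
```

Same exercise works for App Engine; only the unit prices and resource knobs differ.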

> budget caps

Doesn’t exist in GCP or most other cloud providers. You can fix the usage, or hard-cap the autoscaling that drives usage, but not the spend incurred by that usage.

> What guardrails work that don’t depend on constant manual billing checks?

Start conservatively with max instances and instance resources, and iterate based on actual performance and needs. Say, you know, put the number 1 in everything.

Do your capacity planning and cost estimation, and understand them. “Solo dev” or not, you need these things to run the business. The root cause was that you outsourced your business and budgeting decisions to an LLM without verifying or understanding them.


How iOS Mail gets push inbox updates working with third-party IMAP servers has been public knowledge since 2015/2016, if you look hard enough. It has nothing inherently to do with JMAP the protocol.


It is “notify to pull”.

IMAP servers ping the iOS Mail app through APNS about updates in certain pre-registered mailboxes. iOS Mail then re-fetches those mailboxes.

The change signal is pushed; the data (inboxes/emails) aren’t.
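A minimal sketch of that flow, assuming nothing about Apple's actual payload format (the `mailbox` field and the `MailClient` shape here are hypothetical): the push wakes the client and names what changed; the client then pulls anything newer than what it has, over plain IMAP.

```python
# "Notify to pull": the push carries only a change signal (which
# mailbox changed), never message data. The client re-fetches.

class MailClient:
    def __init__(self, fetch_uids):
        # fetch_uids(mailbox) -> list of UIDs currently on the server
        self.fetch_uids = fetch_uids
        self.last_seen = {}  # mailbox -> highest UID already pulled

    def on_push(self, payload):
        """Handle an APNs-style wake-up; return the UIDs to fetch."""
        mailbox = payload["mailbox"]        # signal only, no bodies here
        known = self.last_seen.get(mailbox, 0)
        new_uids = [u for u in self.fetch_uids(mailbox) if u > known]
        if new_uids:
            self.last_seen[mailbox] = max(new_uids)
        return new_uids                     # caller now FETCHes via IMAP

# Fake server state standing in for the IMAP server:
server = {"INBOX": [1, 2, 3]}
client = MailClient(lambda mb: server[mb])
print(client.on_push({"mailbox": "INBOX"}))  # [1, 2, 3]
server["INBOX"].append(4)
print(client.on_push({"mailbox": "INBOX"}))  # [4]
```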


The iOS Mail app does not support IMAP IDLE like ordinary email clients do.

It only supports a proprietary IMAP extension that uses Apple Push Notification Services (APNS) as a sideband channel for IMAP servers to signal the iOS Mail app.

Last I researched this (like… years ago), most of the IMAP-based email providers listed in iOS Settings had implemented the extension, except for Gmail and Exchange. Fastmail got on the train in 2015.

Not sure what is with the tweet targeting Fastmail specifically though.


Fastmail have been singled out as they appear to have been given special treatment in the form of an APNS topic ID. Other hosts have been using a reverse engineered endpoint to generate certificates, which has recently been closed.

There’s some discussion on the Apple developer forums - https://developer.apple.com/forums/thread/778671. The solution for the OP there seems to have been that they, too, will get special treatment, but there remains no route for others to get the same.


Ironically, Google does use APNS to support its own Gmail app on iOS, and for many other reasons. Just not for IMAP.


They are. Strix Halo is going after that same space of Apple M4 Pro/Max where it is currently unchallenged. Pairing it with two 64GB LPCAMM2 modules will get you there.

Edit: The problem with AMD is less the hardware offerings and more that their compute software stack has historically tended to handwave consumer GPU support or be very slow with it — even more so with their APUs. Maybe the advent of MI300A will change the equation, maybe not.


I don't know of any non-soldered memory Strix Halo devices, but both HP and Asus have announced 128GB SKUs (availability unknown).

For LLM inference, basically everything works w/ ROCm on RDNA3 now (well, Flash Attention is via Triton and doesn't have support for SWA and some other stuff; also I mostly test on Linux, although I did check that the new WSL2 support works). I've tested some older APUs w/ basic benchmarking as well. Notes here for those interested: https://llm-tracker.info/howto/AMD-GPUs


Thanks for that link. I'm interested in either getting the HP Mini Z1 G1a or an NVidia Digits for LLM experimentation. The obvious advantage for the Digits is the CUDA ecosystem is much more tried & true for that kind of thing. But the disadvantage is trying to use it as a replacement for my current PC as well as the fact that it's going to run an already old version of Ubuntu (22.04) and you're dependent on Nvidia for updates.


Yeah, I think anyone w/ old Jetsons knows what it's like to be left high and dry by Nvidia's embedded software support. Older models are basically just ewaste. Since the Digits won't be out until May, I guess there's enough time to wait and see - at least to get a sense of what the actual specs are. I have a feeling the FP16 TFLOPS and the MBW are going to be much lower than what people have been hyping themselves up for.

Sadly, my feeling is that the big Strix Halo SKUs (which have no scheduled release dates) aren't going to be competitively priced (they're likely to be at a big FLOPS/real-world performance disadvantage, and there's still the PITA factor), but there is something appealing about the do-it-all aspect of it.


DIGITS looks like a serious attempt, but they don’t have much of an incentive to keep people developing for older hardware. I wouldn’t expect them to support it for more than five years. At least the underlying Ubuntu will last longer than that and provide a viable work environment far beyond the time the hardware gets really boring.


If only they could get their changes upstreamed to Ubuntu (and possible kernel mods upstreamed), then we wouldn't have to worry about it.


Getting their kernel mods upstreamed is very unlikely, but they might provide just enough you can build a new kernel with the same major version number.


Who said anything about Ubuntu 22.04? I mean sure that's the newest release current jetpack comes with, but I'd be surprised if they shipped digits with that.


Doesn’t DGX OS use the latest LTS version? Current should be 24.04.


I wouldn't know. I only work with workstation or jetson stuff.

The DGX documentation and downloads aren't public afaik.

Edit: Nevermind, some information about DGX is public and they really are on 22.04, but oh well, the deep learning stack is guaranteed to run.

https://docs.nvidia.com/base-os/too


> Pairing it with two 64GB LPCAMM2 modules will get you there.

It gets you closer for sure. But while ~250GB/s is a whole lot better than SO-DIMMs at ~100GB/s, the new mid-tier GPUs are probably more like 640-900GB/s.
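The figures above fall out of simple arithmetic: bus width in bytes times transfer rate. A sketch, assuming a 256-bit LPDDR5X-8000 configuration for Strix Halo-class parts and dual-channel DDR5-6400 for the SO-DIMM case (both approximate):

```python
# Back-of-envelope memory bandwidth: bytes per transfer x transfers/s.

def bandwidth_gbs(bus_bits, megatransfers_per_s):
    return bus_bits / 8 * megatransfers_per_s / 1000  # GB/s

print(bandwidth_gbs(256, 8000))  # 256.0 -> the "~250GB/s" class
print(bandwidth_gbs(128, 6400))  # 102.4 -> the "~100GB/s" SO-DIMM class
```

A 640-900GB/s GPU gets there with much wider buses and/or GDDR-class transfer rates, which is why soldered or on-package memory keeps winning this race.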


LINE did.


Line failed in the Korean market, and only penetrated Japan if I remember correctly.

And it is also partially owned by Softbank.


SoftBank took a stake in Line well after Line became established.


I did not know that.

Would that indicate that Korean software companies are only able to penetrate one economy at a time?

That would be a very weird, but interesting thing to investigate.


Each language group across the globe has its own dominant messaging app. The US has Messenger, Korea has KakaoTalk, Japan took LINE, China built WeChat, Russia picked Telegram, and so on. The Meta Facebook/Messenger/Instagram triad isn't the global default for social apps the way it might look to people from the US.

And I don't think it takes conspiracy theories to explain it. Maybe users don't like platforms that aren't dominated by similar users of their primary language, or maybe something else prevents an app experience from being optimized for two distinct cultures at the same time.


This isn't really true. WhatsApp was used pre-acquisition and continues to be dominant throughout LATAM, Africa, and Europe in addition to US/NA. Only in the APJC region and Russia do we see significant divergence in messaging apps.

Having traveled extensively in these places, I always theorized it was due to UX behavior aligning well with the local languages. While the countries WhatsApp dominates speak different languages, they all use the Latin alphabet. In Russia and APJC there are many non-Latin alphabets used and those languages may also use different directions for writing/reading than Romance and Germanic languages.


India loves WhatsApp. I’d like to know the distribution of Latin vs. other scripts in use, as I’ve seen plenty of Indian languages romanized.


One advantage of Telegram over WhatsApp is that you don't have to display your phone number to your contacts and random people in group chats and blogs.


> Russia picked Telegram

With some amusing exceptions: doctors are exclusively on WhatsApp; older (60+) people are often only on WhatsApp (and pre-Microsoft Skype before that).


Not sure what you are getting at, but Line has also penetrated deeply into South East Asia


Last I checked, 90% of Line users were in Japan, and Facebook messenger was most popular in SEA.

So I am simply surprised. My knowledge must be incredibly out of date.


LINE is very popular in Thailand for unclear reasons, I've heard the theory that their cute sticker packs set them apart in the early days. In the rest of SEA Whatsapp is the most popular.


Taiwan too.


Listener callouts do not have to be part of the synchronous write path. Say, if you keep a changelog, listeners can be notified asynchronously. These checks can also be batched and throttled to minimize callouts.

I would suspect this is what Google does with Cloud Firestore.
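A minimal sketch of that decoupling (my own construction, not how Firestore actually works internally; all names here are illustrative): writes only append a key to a changelog, and a separate drain step batches, deduplicates, and fans out to listeners.

```python
# Async listener notification off a changelog: the write hot path does
# no listener work; a periodic drain notifies in deduplicated batches.

from collections import deque

class Store:
    def __init__(self):
        self.data = {}
        self.changelog = deque()   # cheap synchronous append
        self.listeners = []        # (key_prefix, callback) pairs

    def write(self, key, value):
        self.data[key] = value
        self.changelog.append(key)  # no callouts here

    def subscribe(self, prefix, callback):
        self.listeners.append((prefix, callback))

    def drain(self, max_batch=100):
        """Run asynchronously/periodically. Dedup means a hot key
        costs one callout per drain, no matter how often it was written."""
        batch = set()
        while self.changelog and len(batch) < max_batch:
            batch.add(self.changelog.popleft())
        for prefix, cb in self.listeners:
            hits = sorted(k for k in batch if k.startswith(prefix))
            if hits:
                cb(hits)

store = Store()
seen = []
store.subscribe("users/", seen.append)
store.write("users/1", "a")
store.write("users/1", "b")   # coalesced with the write above
store.write("orders/9", "c")  # no matching listener
store.drain()
print(seen)  # [['users/1']]
```

Throttling here is just calling `drain` less often; batching is the dedup inside it.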


If you’re throttling, then your database can’t actually keep up with broadcasting changes. Throttling only helps ensure the system keeps working even when there’s a sudden spike.

Batching only helps if you’re able to amortize the cost of notifications by doing it, but it’s not immediately clear to me that there’s an opportunity to amortize, since by definition all the bookkeeping to keep track of the writes would have to happen (roughly) in the synchronous write path (+ require a fair amount of RAM when scaling). The sibling poster made mention of taking read/write sets and doing intersections, but I don’t think that answers the question, for several reasons I listed (i.e. it seems to me like you’d be taking a substantial performance hit on the order of O(num listeners) for all writes, even if the write doesn’t match any listener). You could maybe shrink that to log(n) if you sort the set of listeners first, but that’s still an insertion cost of m log(n) for m items and n listeners, or O(mn) if you can’t shrink it. That seems pretty expensive to me because it impacts all tenants of the DB, not just those using this feature…


They had abandoned it at their lowest point to focus on Zen. Now they seem to be picking up the slack. The upcoming Instinct MI300A APU brings HBM as unified system memory, together with hardware cache coherency between CPU cores and GPUs, within and across NUMA nodes.


Each process has its own GCD queue hierarchy, executed by an in-process thread pool. It does have some bits coupled with the kernel, though, for things like mapping Queue/Task QoS classes to Darwin thread QoS classes and, relatedly, handling priority inversion.

