I've posted about this before, but I know someone whose coworkers actually tried to defend themselves with "not leaking memory wasn't in the requirements," in a meeting with several managers, the subject of which was the memory leak.
I don't know whether it's true, but there's an old story about engineers a few decades ago working on the guidance system for a missile.
The programmers didn't bother to free() any of their memory - instead they just calculated the maximum amount of RAM their program could consume while the rocket was in the air, doubled it, then installed that much RAM in the rockets.
Am I the only one who is terrified by reading an "amusing story" about "on-board software for a missile"? The amusement sort of requires us to completely ignore the individuals being blown to pieces.
That's normal, as these rockets don't run an OS; they run a specific, very explicitly defined embedded program with known states.
It's not going to malloc randomly...
In fact, it's common in real time applications to pre-allocate everything you need at the start of the program or use static memory, and never allocate afterwards.
I think the ultimate fact here is that you have a specific runtime period. Once the missile has reached its destination, you have zero memory and the runtime disappears together with the missile.
When the European Space Agency designed the Ariane 4 rocket, engineers allocated 16 bits to a variable relating to its horizontal velocity. This variable was only used during the first minute of flight, and engineers computed that it could not possibly overflow given the launch profile Ariane 4 used. Because of this and limited processing power, they omitted an overflow check. This worked fine, and Ariane 4 would go on to launch successfully 113/116 times (97%).
Ariane 5 used a different launch profile. The impossible was now possible, and the 16-bit value overflowed. The inertial reference system failed, along with its backup (which was running the same code), causing the rocket to receive wrong data.
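The failure mode is easy to reproduce in any language that narrows a wide value into a 16-bit signed integer without a range check; a toy sketch (the variable name is made up):

```javascript
// Toy model of the overflow: a wide value narrowed into a 16-bit signed
// integer with no range check. The variable name is hypothetical.
const horizontalBias = new Int16Array(1);

function store(value) {
  horizontalBias[0] = value; // silently wraps modulo 2^16
  return horizontalBias[0];
}

console.log(store(30000)); // inside the old envelope: 30000
console.log(store(40000)); // a larger value wraps: 40000 - 65536 = -25536
```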
At the same time, big uppercase warnings in the Ariane 4 source code and architecture docs might have prevented this.
So it's important, in safety-critical code, to assume that others might blindly copy-paste in bad ways -- and then try to warn them, even though you might not be there any longer.
This sounds distressingly similar to the Therac-25 failures: a radiotherapy machine whose software was (blindly?) copy-pasted from a previous version of the system, which had hardware interlocks to prevent unsafe configurations of the radiation emitter.
The new one did not have these interlocks, instead doing everything in software. The result of this was a few people being fatally irradiated when a certain race condition was triggered.
Eminently reasonable: dispensing with memory management (allocate/gc) for an embedded real-time missile guidance system. It may even have had a static memory layout, which likely improves performance further.
More common than you think, and a good way to do it. In a hard real-time environment where you need strong guarantees, provably overshooting is a sane strategy. Freeing is not free. Missiles are an extreme case because they will never be reused, but you can do the same at the process level: just give the process more than it will need during its lifetime and don't bother with memory management.
I agree -- and I'd venture to say that the sentiment could be alluding to the fact that mistakes and things overlooked in software development are inevitable.
Don't take this for defending him, but it's in the realm of possibilities ¯\_(ツ)_/¯
Well his point is this technically isn’t a bug because comparable performance to fs.readFile was never guaranteed as part of the API contract. I don’t think his statement implies we generally ought not to expect excellence nor that effort shouldn’t be spent on improving things.
I think his point is valid even if slightly tongue in cheek. Bugs are defined by the contract / specification not wishful thinking.
As engineers, we should be capable of critical thinking beyond this. Otherwise you will end up with barely working enterprise software like the sibling comment mentioned.
If this is how you do your work, you are asking for a computer to replace you! We should aspire to be better than this. It’s crazy to me that people take massive salaries and expect other people to spoon feed their tasks to them.
>It’s crazy to me that people take massive salaries and expect other people to spoon feed their tasks to them.
I remember being one of two software engineers working on a project in a small company. We were cooperating with two bigger companies delivering other parts of the system.
I kept asking myself: "These guys have hundreds of well paid engineers working on this, and they never deliver! Are they sabotaging the project?".
In the end we delivered the finished prototype and the customer ran a 48-hour acceptance test: their parts fizzled out almost immediately, while ours just kept running.
It took me a few years and joining a bigger company to notice that thinking about what you're doing is not expected in bigger corporations, where everything is bureaucratized. Also, if the company is big enough, it can shift the deadlines set by its customers. It drove me near madness whenever a colleague said something "wasn't their job". I had never heard such an attitude before, and these people were paid twice as much as in my previous company.
I still, to this day, hope that a lot of these people get replaced by computers. Defective programs can at least be improved.
Agree with this sentiment. I had a short stint at a large tech company and the sort of attitude you describe is what turned me off from working there longer.
“Not my job” people tend to be the most frustrating folks to work with. While I understand wanting to mainly do the work you were hired for, there are always going to be times where you have to roll up your sleeves and do something different.
I wish I was the kind of person who would thrive in that environment because it sounds nice, but I’m not wired that way.
I'm a large tech company employee, and I use "not my job" sometimes because it means that I believe the issue could be handled more effectively by someone else, or that I should not be the one to address it for political reasons. So it's less "not my job", full stop, and more "not my job, but Jane over on that other team is closer to that code and has a better understanding of what's going on."
I was speaking more of different departments giving the runaround, pointing at each other in a circle and saying "not my job". The problem I see here is that there are many things that need to be done, and rarely is the coverage of "job descriptions" 100%.
Rant: I've worked with a few people for whom a simple "not my job" would actually be an improvement. I've gotten "not my job, you figure it out" when asking others for assistance, which is weirdly worse.
"When I pass X in, I'm not getting anything back - no error, I can't see the logs. Can you assist?"
"Not my job - you figure it out. We can't be expected to know how everything works."
"Yes but... this is literally something your team wrote in the last 3 weeks to support the larger project we're all on, and I was told to make my code work with yours. Have you ever tested taking input? If so, what am I supposed to pass in to your system?"
"Not my job - we're not here to babysit you - go read the code."
Later... someone else...
"Why are you spending all the time trying to trace through this code? You just have to ask. You have to learn to talk to people."
I sent over screenshots of the previous 'conversation' and got nothing back.
Thankfully this doesn't happen very often, but I realized that this org had segmented some teams so poorly, and imposed such deadlines, that the culture of half the teams was "do minimal work and throw it over to someone else". It was more confusing because I interacted with a few internal teams, and some teams literally did not believe other people behaved this way. Even with written evidence, the problem was always assumed to be "oh, you're just not asking the correct way".
It seems to be magnified on the front end. The name of the game is "easier", everything else be damned. Try taking someone's pet framework away (or jQuery).
> It’s crazy to me that people take massive salaries and expect other people to spoon feed their tasks to them.
I’m not sure where in my comment you got the idea that I was advocating for engineers to have their tasks spoon fed to them by other people. My comment is simply about reading comprehension and not arguing against strawmen.
It’s also how great software is written. Assuming a subcomponent behaves in a way that it does not claim to isn’t a recipe for functional software, enterprise or otherwise.
It wasn't categorized as a bug, but as a speed issue.
Do you categorize thinking that JSON parsing takes at most O(n log n) time as wishful thinking as well?
It's humanly impossible to code purely to the contract; that's why it's generally important to strive for no surprises in an API. That's the only way to build more and more complex software over time.
I don't know; I might say it as a joke myself, so seeing it said, I assume they meant it the same way I would, with that tone of innocent wonder that can be used to denote sarcasm.
I think as a general rule it is good to assume the best, even though I myself might not always be able to do so because of various limitations of understanding.
How can the native implementation be much more complex than that, beyond maybe some argument assertions? That will take almost the exact same amount of time as readFile, the additional time cost is a small constant value.
I believe the difference here is that these newly introduced promise wrappers allow you to pass an AbortSignal instance to them. Wrapping the old callback-style methods would not allow you to "abort" in the middle of reading a file. You could respond to a passed-in AbortSignal in the simple `readFilePromise` function you provided by simply rejecting early, but that wouldn't free up the underlying file descriptor until after the entire file had been read.
Actually I was completely wrong. They added support for consuming an AbortSignal in the old callback style methods as well. Now I'm at a loss as to why they even have two different implementations instead of just doing something like:
I'm brand new to the code, but I don't think it is as simple as you state.
The 'one function' you linked to just passes a ReadFileContext into binding.open[1]. The ReadFileContext appears to have its own read that uses its own allocated Buffers[2] under some circumstances, but under others they are mysteriously already there[3]. Then binding.open refers to an internalBinding of fs[4], which I'm unsure about: is that JS or C++ code? This[5] suggests C++ is involved, but there is also an open in lib/fs.js[6] with that signature, though it also just calls binding.open at the end with the same signature, so I think it's off into C++ by then. At that point, I lose the trail. Maybe the C++ is even doing the allocating.
edit: I was curious, so I kept reading. It seems like open is probably C++ here[7], which makes an AsyncCall or SyncCall to "open", both of which are just wrappers around syscalls[8]. Still not sure about the other places the buffer comes from, though.
On an SSD, reading files is fast. However, read() is a blocking operation that could take forever - reading from some /dev/ device that never returns any data, or from a slower medium that takes hundreds of milliseconds per read. That's a big no-no. So it gets dispatched to a dedicated blocking-IO thread pool with a limited number of workers, which performs the blocking operation without tying up the event loop.
I’m guessing as disk speeds have increased the relative overhead of dispatching the task to a thread pool has also increased.
At least, that’s how I think it works based on some half-remembered blog post.
I can fully believe that there's some cleverness going on with node file I/O - but there should be no meaningful difference in performance between returning the result via a Promise, versus a callback.
It depends - I would assume the callback pattern is much easier to JIT than the promise pattern and requires fewer objects to be allocated, but you're right. That's a huge difference.
You’d need to benchmark another non-io bound function to find how much overhead comes from promises themselves.
It depends. If disks got so fast that function call overhead starts to dominate, then you might see an effect like this. But I have to imagine reading from an SSD is still 1000x slower (at least) than a function invocation in Javascript.
"My guess is that the promise version is idling when IO capacity is available due to scheduling? Is it implemented in an event driven way internally, and if it is (for example using epoll) is the epoll event able to interrupt and cause the promise to continue to be evaluated or does it have to wait for the vm to get around to scheduling it again?"
It didn't occur to me that the async promise is rescheduled. That right there is a concern.
The OP also points out the buffer concatenation, which is an odd choice. Reading the file into memory in one go should start with a stat and then a single buffer alloc. But that is probably naive.
Network filesystems, compressed filesystems, and of course the ever-dreaded on-access virus scanners on Windows that 'break' one open source project after another. Who knew that opening the same file 10 times is slow? Everybody. Everybody knew that.
> Instead of creating a new Buffer for each chunk, it could allocate a single Buffer and write to that buffer. I don't think Buffer.concat or temporary arrays are necessary.
is already what the code does. Multiple chunks and Buffer.concat are only involved if more data gets read than stat initially reports.
I did, just because you asked, and mine (readFileP) is faster, barely different from the callback version. Actually it's the fastest one, which makes no sense, so I think this benchmark may have quite a bit of error.
Results:
fs.readFileSync x 696 ops/sec ±1.96% (76 runs sampled)
fs.readFile x 2,045 ops/sec ±6.43% (72 runs sampled)
fs.promises.readFile x 435 ops/sec ±1.76% (78 runs sampled)
util.promisify(fs.readFile) x 2,053 ops/sec ±6.58% (67 runs sampled)
readFileP x 2,182 ops/sec ±4.00% (67 runs sampled)
fs.promises.readFile CACHED readFile x 438 ops/sec ±1.56% (79 runs sampled)
Fastest is readFileP,util.promisify(fs.readFile),fs.readFile
Slowest is fs.promises.readFile,fs.promises.readFile CACHED readFile
index.js:
const Benchmark = require("benchmark");
const fs = require("fs");
const path = require("path");
const chalk = require("chalk");
const util = require("util");

const promisified = util.promisify(fs.readFile);
const bigFilepath = path.resolve(__dirname, "./big.file");
const suite = new Benchmark.Suite("fs");
const readFileP = (...args) =>
  new Promise((resolve, reject) =>
    fs.readFile(...args, (err, data) => (err ? reject(err) : resolve(data))));
const readFileC = fs.promises.readFile;

suite
  .add("fs.readFileSync", d => { fs.readFileSync(bigFilepath, "utf8"); d.resolve(); }, { defer: true })
  .add("fs.readFile", d => { fs.readFile(bigFilepath, (err, data) => { d.resolve(); }); }, { defer: true })
  .add("fs.promises.readFile", d => { fs.promises.readFile(bigFilepath).then(() => { d.resolve(); }); }, { defer: true })
  .add("util.promisify(fs.readFile)", d => { promisified(bigFilepath).then(() => { d.resolve(); }); }, { defer: true })
  .add("readFileP", d => { readFileP(bigFilepath).then(() => { d.resolve(); }); }, { defer: true })
  .add("fs.promises.readFile CACHED readFile", d => { readFileC(bigFilepath).then(() => { d.resolve(); }); }, { defer: true })
  .on("cycle", function (event) { console.log(String(event.target)); })
  .on("complete", function () {
    console.log("Fastest is " + chalk.green(this.filter("fastest").map("name")));
    console.log("Slowest is " + chalk.red(this.filter("slowest").map("name")));
  })
  .run({ defer: true });
I agree; the claimed deviations don't accurately capture the error. You would need to run many trials across many devices at different times of day for a practical benchmark; it could be a service in itself.
I do think the data here is enough to know that a promise wrapper here is insignificant in the grand scheme of things. This fs.promises library must do way more under the hood.
This doesn't compare async to sync: in fact, `readFileSync` is also benchmarked here, and is slower than every async alternative. This compares `readFile` (async) with `promises.readFile` (also async).
> But no one said fs.promises.readFile should perform similarly to fs.readFile
I'm happy that it was downvoted at least, but there should be a better way to set expectations for excellence in development.