Literature review on the benefits of static types (2014) (danluu.com)
43 points by lemper on Oct 7, 2023 | 69 comments


I don't know how you can miss "To Type or Not to Type: Quantifying Detectable Bugs in JavaScript" which is frankly the only study I've seen that cleanly separates out static types as a variable.

They found that TypeScript / Flow catch about 15% of JavaScript bugs in GitHub projects.

Decent, but of course not the full story. It leaves out other benefits like:

* Robust refactoring

* Easier to understand code

* Easier and faster to navigate code

* Faster to write code (e.g. due to better IDE tooling)

* Having to write fewer tests.

In my experience catching bugs in production code is probably the smallest benefit of static types - production dynamically typed code has generally already found the low-hanging bugs - painfully! - by hitting them at runtime.


I think I'd need to see some studies on those "benefits" to believe them. I do really like IntelliSense working with TypeScript, sometimes saving me a click or two on viewing another file (when I can just hover over something and see a little description of the types)... but I've also lost a ton of time trying to fight with a complex type when I know what I want to accomplish and could have done it more trivially with vanilla JS. You might catch more bugs because you spend more time writing the code, and the ratio might be positive or negative compared to just spending that time in dynamic types.

You can slap an "any" here and there for that, but it's not always obvious where to, or whether that will cause problems down the line, and some purists will say you should never use any.


> I've also lost a ton of time trying to fight with a complex type when I know what I want to accomplish and could have done it more trivially with vanilla JS.

Type systems are an investment to learn. The good news is that the knowledge is transferable. Statically typed languages borrow a lot from each other.

For TypeScript, I would highly recommend Type-Level TypeScript [0], which teaches programming in the TypeScript type system. After going through that, I was able to understand more advanced types and a lot of the magic was peeled away. I can read types from libraries and understand what is going on, and, when needed, I can write sufficiently flexible types that give my applications a _lot_ of safety.
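
To give a flavour of what that looks like, here is a rough sketch of a conditional type (the names are made up for illustration, not from the course):

  // ApiResult is an illustrative discriminated union.
  type ApiResult<T> = { ok: true; data: T } | { ok: false; error: string };

  // DataOf extracts the success payload at the type level using `infer`.
  type DataOf<R> = R extends { ok: true; data: infer D } ? D : never;

  // Resolves to { id: string; name: string } at compile time.
  type UserPayload = DataOf<ApiResult<{ id: string; name: string }>>;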

---

As an aside, using `any` gives you the worst of both worlds: you have to fight with a type system, and the type system isn't even doing anything. For TypeScript, Zod [1] (or any other validation library) can often help eliminate a lot of casts, e.g. when parsing API responses or other data coming from outside of your type system.

[0]: https://type-level-typescript.com/

[1]: https://zod.dev/
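
For instance, a minimal sketch of the validation-library route (assuming Zod is installed; the schema and endpoint are made up):

  import { z } from "zod";

  // Describe the expected shape once; the static type is derived from it.
  const UserSettings = z.object({
    theme: z.enum(["light", "dark"]),
    fontSize: z.number(),
  });
  type UserSettings = z.infer<typeof UserSettings>;

  async function loadSettings(): Promise<UserSettings> {
    const res = await fetch("/api/settings");
    // parse() throws if the payload doesn't match, instead of a silent `as any` lie.
    return UserSettings.parse(await res.json());
  }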


> As an aside, using `any` gives you the worst of both worlds: you have to fight with a type system, and the type system isn't even doing anything

Not sure I agree there. You can use `any` judiciously so that you don't have to fight the type system where it's too difficult, and yet still get the benefits elsewhere in your program.

I think it's actually one of the biggest benefits of dynamic typing - in rare cases you can use `any`. Most static type systems don't really let you do that. (I think Dart is one of the rare exceptions because it started dynamically typed and switched to statically typed.)


If you're using `any` just because the type is complex, then using it sparingly isn't a huge deal, so long as you are confident that the type is correct.

The problem is using `any` when you _don't_ actually know the type of something, e.g.

  const data = JSON.parse(localStorage.getItem("my_data")) as any as UserSettings;

In the above case you're just hoping that what's in local storage matches your type, and it makes your type system less effective.
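
If a library feels like overkill, a hand-written type guard is one alternative - a rough sketch, with the UserSettings shape made up for illustration:

  interface UserSettings { theme: string; fontSize: number }

  // Narrow `unknown` to UserSettings only after checking the actual shape.
  function isUserSettings(value: unknown): value is UserSettings {
    return typeof value === "object" && value !== null &&
      typeof (value as any).theme === "string" &&
      typeof (value as any).fontSize === "number";
  }

  const raw: unknown = JSON.parse(localStorage.getItem("my_data") ?? "null");
  const settings = isUserSettings(raw) ? raw : undefined; // no blind cast needed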


Related:

The evidence behind strong claims about static vs. dynamic languages - https://news.ycombinator.com/item?id=16287083 - Feb 2018 (1 comment)

The empirical evidence that types affect productivity and correctness - https://news.ycombinator.com/item?id=8594769 - Nov 2014 (25 comments)

Also a couple references in this recent thread:

Strong static typing, a hill I'm willing to die on - https://news.ycombinator.com/item?id=37764326 - Oct 2023 (847 comments)


When doing a literature review, it usually helps to at least state the date of the literature you are reviewing.

For example, when is "Comparing mathematical provers; Wiedijk, F" from? You can take a guess by looking at the literature references in the paper, or googling it, but it would be good to have a date anyway (2003). There is also a follow-up book from 2006 (https://link.springer.com/book/10.1007/11542384) about this topic.


This is also helpful when an author publishes multiple works under the same name. I've seen this happen before, where a paper was published under one title and the author then followed it up with a book of the same title. The information from the first work is probably in the second work, but that's not necessarily the case.


> Advocates of dynamic typing may argue that rather than spend a lot of time correcting annoying static type errors arising from sound, conservative static type checking algorithms in compilers, it’s better to rely on strong dynamic typing to catch errors as and when they arise.

Why is that a strange statement?

Edit: Of course it is a perfectly valid statement; advocates of dynamic typing may indeed argue that. "It's better" depends on your context, of course. If you have 100 mediocre programmers taking turns on the same code, it's probably not better. One of the great things about TypeScript is that you can decide on the spot when "it's better": usually it's better when you would otherwise have to do a handstand in the type system to express what you want.


The later you detect a bug in the development process, the more expensive it is to fix.

If every type mismatch causes a program crash then you still have real problems with a production crash. Your user has a degraded experience. Somebody gets alerted about the crash. Somebody needs to investigate it. In a language like Python, you are often just stuck with a message that says your variable doesn't have the method you called, so you've got no idea which wrong type it is or where it came from. You track it down, diagnose the issue, and push a fix.

It can be worse in a duck typed world, where you aren't even guaranteed to crash. Your program might just fly off and do completely wrong things. Or you might be working in a domain where crashing is unacceptable.

In a statically typed world, your compiler yells at you and says "you are passing an X here when it expects a Y" and you fix the issue before the code ever runs.
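
A minimal TypeScript illustration of that compile-time catch (the names are made up):

  // Hypothetical function for illustration.
  function renderUser(id: number) { /* ... */ }

  const userId = "42";  // inferred as string
  renderUser(userId);   // compile error: Argument of type 'string' is not
                        // assignable to parameter of type 'number'.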

There are downsides of the latter approach, of course. "Ugh, I've got a FooMap and I need to pass it to a BarMap and these types are incompatible" is aggravating. Attempts to fix this have not always been great and we've got mountains of Java code where nobody has any clue where the real code is because everything is typed as an interface, for example.


I want to raise a point of order here.

We do not have any good research backing "bugs found later in the process cost more to fix".

If anything, the few solid empirical datasets we have show no correlation, or at best a really weak one.

This principle may sound solid in theory, but the evidence is not there to support it and, if anything, seems to point to its invalidity.

See in particular https://shape-of-code.com/2021/06/27/increase-in-defect-fixi...


I think in this case it's probably true for run of the mill bugs, but not for architecture and system design.

But I just want to object to "we don't have any good evidence". This is the sort of thing that it's really hard to get scientific evidence for, but that doesn't mean we can't learn about it.


I mean, the problem is that we have quite good evidence of how to handle architecture and system design problems. And it is not by finding them earlier.

But by reducing their costs through looser coupling and incremental work. That is what all kinds of agile and DevOps research showed.


> we have quite good evidence of how to handle architecture and system designs problems

We do? I seriously doubt that. New good and bad architectures are popping up all the time. And there are plenty of architectures that people totally disagree about, e.g. microservices.

> But by reducing their costs through looser coupling

In my experience loose coupling is something to be generally avoided where possible. It leads to spaghetti systems and unknown data flows. Pretty much the dynamic typing of architecture design.

I'm not exactly sure what point you're trying to make though so I may have misunderstood...


They may not cost more for us devs, but does it factor in the time our customers are not doing what they need to do because of that bug?

I mean, I fixed a trivial-to-fix bug the other day that Rust probably would have caught. Between the time it took for them to report it, support doing their thing, and a new build going out, it took an hour. An hour during which our customer couldn't do what they needed to do for their work.

So I'd say it's almost trivially true that a bug caught before release costs less to fix.


This is a very strange statement on a site designed by and for engineers. We’ve all worked on large projects. It can take an annoying few minutes to get a static type check error fixed. We’ve all spent days or even weeks tracking down weird random runtime errors due to type mismatching.

The plural of anecdote is not data, but this is not a science website either. You don’t need a study to establish engineering common sense.


And yet the scientific study shows that these anecdotes are indeed anecdotes and seemingly not true.


I've grown to dislike this statement. The worst case cost of fixing a problem late is bad. The vast majority of bugs are not that, though. Some are cheap no matter when you find them.

That said, I suspect many people were used to static type systems that required a lot of boilerplate to name everything. In those, it was common for even small changes to require rather extensive edits. Just look at early Java EE stuff with tons of interfaces and text config for what has commonly been a single "POJO" for a while now.

Now, I don't think this is an argument against static tools. Tools are tools. If you have a specification that you can encode into types easily, do so. If the type already exists, use it. If you are exploring, take care not to encode into the type a runtime feature of the data.


Dynamic typing aka type inferencing (like in Python, but also in Java and C# when you use "var") can get you in trouble quickly. Consider this code:

   var fireable = someMethod();
   fireable.fire();
The programmer intended this code to fire an employee. Let's say this code is in a military application, and another programmer modified the someMethod() function to return a missile. As long as the missile object has a fire() method this code will compile just fine... and do something the code didn't intend to do, i.e., fire a missile!

How likely are you to have employee firing and missile firing in the same program? Not very likely, but the principle is valid, nonetheless. You need to express your intention more clearly like this:

   Employee fireable = someMethod();
Now if someMethod() is modified to return a missile you get a compilation error, and the world will be a lot safer. You don't want missiles being fired by accident!
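
A rough TypeScript rendering of the same idea (all names made up; note that TypeScript's check is structural, so the annotation only helps when the two types actually differ in shape):

  interface Employee { name: string; fire(): void }   // ends employment
  interface Missile  { target: string; fire(): void } // launches

  // Suppose someMethod originally returned Employee and was later changed:
  declare function someMethod(): Missile;

  const inferred = someMethod(); // inferred as Missile
  inferred.fire();               // still compiles - and fires the missile

  const fireable: Employee = someMethod(); // compile error: Missile is missing
  fireable.fire();                         // `name`, so the change is caught here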


"Dynamic typing" refers to types which can be determined only at runtime, not at compile time.

"Type inference" is what you're referring to, and is an optional feature of static type checkers. For instance, C has static types but no type inference, while C++ has static types with type inference. The concept of "type inference" doesn't make sense for dynamic typing.


> Dynamic typing aka type inferencing

Dynamic typing and type inference are not synonymous.


Both have the same problem. I should have said Dynamic typing and type inferencing...


Maybe don't name your methods as vaguely as "someMethod()"; name them something like "Employee()" or "Missile()", depending on what is needed in the program. That way the vagueness of "someMethod()" doesn't cause unintended consequences.


Fireable already implies that the result of someMethod() is something fireable. That is why OP uses someMethod(). Maybe a better name would be getSomethingThatCanBeFired(), but that would not really explain OP's point better.


Naming doesn’t fix unintended consequences. If you are the only person to ever work in the codebase and you never hire anyone new and you have a perfect memory… sure, I guess it works then.


Static typing also doesn't fix unintended consequences. In the example given above, naming things descriptively and applying the smallest bit of common sense would achieve practically the same thing as static typing.


Sure, but it finds these errors at compile time a whole lot more easily. And when modifying code, I can say from experience it’s much smoother to modify large statically typed projects than large dynamically typed ones.

Actually, anyone arguing against that, I don’t even think it’s worth my breath, since you likely just don’t have much experience and really like whatever your pet language is


>Actually, anyone arguing against that, I don’t even think it’s worth my breath, since you likely just don’t have much experience and really like whatever your pet language is

Thanks for the ad hominem attack, but I've been coding in C, C++, C#, python, javascript, php and a dozen other languages, as well as assembly language which doesn't even have types at all, for 40+ years. Maybe your ad hominem attacks work on other social media sites, but not here kiddo.


Better yet, avoid methods and classes when possible in favor of functions that act on data structures, which live in contextual directories and can be renamed so as to avoid confusion.


Changing ‘fireable.fire()’ to ‘fire(fireable)’ doesn’t solve the problem though.


I see this being downvoted over the difference between dynamic typing and type inference. I feel this is unjustified. The example shows what can go wrong with type inference, which is in fact a feature of dynamically typed languages (even though it is not explicitly called type inference in that context).


Type inference isn't really a feature of dynamically typed languages any more than emergency stop is a feature of hand saws.

It isn't applicable because there are no static types to infer.

Type inference can in some cases lead to type confusion in a way that can also happen with dynamic types. In my experience it is extraordinarily rare, especially in languages with only limited or local type inference like Rust or C++.


Today I learned: don't call type introspection type inference if you do not want your HN popularity to decrease.


Type introspection is another thing entirely! That's basically reflection, which has nothing to do with anything mentioned so far. To be clear:

* Dynamic typing: types are not known at compile time

* Static typing: types are known at compile time and written down in the source code.

* Static typing with type inference: types are known at compile time and inferred rather than being explicitly written down

* Dynamic typing with type inference: nonsensical; not a thing

* Type introspection: types can be inspected by code at runtime. Unrelated to any of the above. Can be found (or not found) in both dynamically typed and statically typed languages.
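
For concreteness, a small TypeScript sketch of inference versus introspection (illustrative only):

  // Static typing with type inference: `n` is known to be a number at
  // compile time even though no annotation is written.
  const n = parseInt("42", 10);

  // Type introspection: the running program inspects a value's type itself.
  function describe(x: unknown): string {
    return typeof x === "number" ? "a number" : "something else";
  }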


Thanks for the overview. I agree with all, except for the "has nothing to do with" claim.

> Dynamic typing: types are not known at compile time.

That is just a top level description of a feature, the tip of the iceberg. What is needed for a language to support dynamic typing? The answer is not "has nothing to do with x" where x is any of the above features.

Also a lot of types are known at compile time even if your language supports dynamic typing.

Also "compile time" is a vague term for an interpreted language, as the machine compatible code is generated at run time.


> programmers tend to read essays until they getting to the first line they disagree with and then tweet about it, like a compiler

https://twitter.com/tef_ebooks/status/1108101434141298695

I suspect many people didn't get past the first five words.

That said, I agree with you that the point is more legitimate, but I don't think it's a particularly strong argument.


If you think a method called ‘someMethod()’ is a good idea, then it will probably return a type called ‘Something’ and you are back to the same problem.


I can understand what you are getting at. However, a more real-world example should clear this up slightly. Neither the variable name nor the method name conveys meaning in the above example.

  val underperformingEmployee = selectLowestPerformingEmployee();
  underperformingEmployee.terminateEmployment();
which can be reduced to

  selectLowestPerformingEmployee().terminateEmployment();

But there are already a lot of strange state management issues with this code style. I would expect something more like this in the wild

  terminateWorstEmployee(){
    var employee = hrService.lookupEmployeeByLowestPerformanceReview();
    employeeTerminationService.initiateTermination(employee.id);
  }

When things are named and abstracted properly, it should be very clear what is happening. var conventions won't save you here.

Assignment from method calls seems to be the most likely place where this could POSSIBLY cause confusion. Maybe we can compromise on variable assignment directly from constructors? That seems to be the clearest example where the extra text is purely noise. It also causes a lot of line wrapping which indirectly causes even more noise.


Use the right tool for the job.

If you need to barter with your compiler about the most efficient way to store data in memory, going for a strongly typed language is ... advisable.

Building a fintech solution that requires supporting ever changing data models over time alongside business logic that also changes over time: use a Lisp :)


Changes over time is a situation where I'd prefer static types. If I change the shape of some data, why would I ever want a system that didn't tell me what I just broke at compile time?


> ever changing data models over time alongside business logic that also changes over time

This is the exact case where a statically typed language shines. If you update your business logic or refactor, the compiler can let you know what needs to be updated for the new data model.
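
A tiny hypothetical example of what that looks like (names made up):

  // Suppose the data model changes: `email` is renamed to `contactEmail`.
  interface Customer {
    id: string;
    contactEmail: string; // was `email`
  }

  function notify(address: string) {
    console.log(`mail to ${address}`);
  }

  function sendInvoice(c: Customer) {
    // Any code still reading `c.email` now fails to compile:
    // "Property 'email' does not exist on type 'Customer'."
    notify(c.contactEmail);
  }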


Well, the article mentions up front that people will debate the language categories, but Objective-C is definitely a dynamic, runtime-based language. #import <objc/runtime.h>, have fun, go to town. That being said, the compiler does behave rather statically with respect to type-checking, so fair enough I guess.

I feel strongly that dynamic typing leads to poor reliability, and in particular the special case of "dynamic typing" that creeps into "static" languages: implicit nullability of all (or most, e.g. not Java primitives) types. The "billion dollar mistake". This is lost in these discussions: Java is "dynamically typed" insofar as every reference type is either what you think it is, or possibly another type that will blow up your entire application if you handle it in the wrong way.

I'm not an academic at all. I don't even have a CS degree. But, I've worked as a rather senior IC on applications with billions of users and with billions and billions of dollars in revenue, the usual "company everyone has heard of" stuff. Nullability has caused so many bugs for us that we could have caught with a compiler. These studies appear to focus on contrived cases or on smaller teams with small or even temporary codebases. This is very different in practice from a decade-old codebase with hundreds or thousands of contributors, some of whom will be interns hopped up on a bit too much free Red Bull. Putting up a wall "you cannot create this kind of bug" is hugely valuable for us, i.e. the transition from Java to Kotlin in Android applications, or Objective-C to Swift in iOS applications.

The article talks about "bugs" as if all bugs are equal, but they are not. Shipping a crash on startup in a mobile or desktop application is an absolute disaster. All hands on deck. Hopefully there's a way to mitigate it. Breaking one feature with some decently graceful recovery is also a "bug", but it is drastically different in terms of negative impact for the company.

This stood out in particular:

> ...CoffeeScript are rated as having few concurrency bugs

I mean, a language that does not really support threads has few concurrency bugs? Did the authors understand what CoffeeScript is? (I am discounting web workers because AIUI CoffeeScript had largely fallen out of relevance before they were a thing.) This is similar to saying Java has fewer memory safety bugs than C - well, obviously it does! That is one of the things it was specifically designed to do!


Does CoffeeScript support standard async constructs? Then you can have concurrency bugs. They may be more limited than the bugs you’ll see in languages that support parallel execution, but they can still happen: all you need is to read a value at one time and have any possibility that it has changed by the time you depend on it.
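
A minimal TypeScript sketch of such a single-threaded concurrency bug (names made up):

  let balance = 100;

  // Stand-in for a network or disk call.
  const audit = () => new Promise<void>((resolve) => setTimeout(resolve, 10));

  async function withdraw(amount: number) {
    if (balance >= amount) {
      await audit();       // other tasks get to run while we wait...
      balance -= amount;   // ...so the check above may be stale by now
    }
  }

  // Two overlapping withdrawals of 80 can both pass the check and drive the
  // balance negative, even though nothing runs in parallel.
  withdraw(80);
  withdraw(80);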


If CoffeeScript has few concurrency bugs relative to JavaScript, that’s actually quite an accomplishment. Despite completely barring threads, or any mechanism to achieve parallelism, JavaScript tends to deal with concurrency quite a lot.


Fair! They are comparing it to Erlang and Go, which are both languages designed with concurrency as a core principle. It just seems off, I guess. I can't actually read the linked article that talks about this as it's locked behind a login wall.


The strongest opinion I hold about programming (or maybe anything) is about type systems:

I don't want to work with you if you don't understand the benefits of statically typed languages.

Great things can be (and have been) built with dynamically typed languages, but there are only a few reasons to use them today:

* Elite teams who are experienced in the language

* Smaller projects, short scripts, etc.


The best developers I've worked with are all massive fans of static languages for anything important. All of them would use dynamic languages too, but only in situations which allowed it (small tooling, glue-style scripts).

The biggest "dynamic only" have been people who I struggle to call developers. The kind of people who suddenly become massive supporters of something they read in "Smashing Magazine" that week.


I’d go a bit further: I use dynamically typed languages for small things all the time, like shell scripts or build system glue. But I would absolutely, 100% use a statically typed glue language instead, if one were available for me to use.


In these studies advocating for the unyielding rigor of statically-typed languages, it's amusing that the authors choose to express their arguments in English: a language where "read" and "lead" rhyme, but "read" and "lead" don't. Perhaps a more consistent constructed language like Lojban would have better suited the papers' theses. /s


Making a computer understand a different language takes one command to the package manager. In a human being that would take years, and never completely succeed. English is obviously deranged but this audience is stuck with it.


um, they both rhyme - read (i read a book yesterday) and lead (a metal). and then read (what i am doing about what i am typing now) and lead (what you do to take a pig to market).


> and lead (what you do to take a pig to market).

That's a curious example. There's surprisingly little on pigleading out there!

https://www.google.com/search?q=%22lead+a+pig%22


Thank you for the English lesson, but you claim you've never encountered ambiguous overloaded identifiers that you couldn't resolve while developing a parser? "He took the lead"?


of course that's ambiguous - english is not context free


Then you can't claim they both rhyme, because I didn't provide that context...


How are they doing double blinded experiments on typing?


It is faster not to spend the time setting up a proper protocol beforehand.

With dynamic experimentation, you get your results sooner, which means you have more time to fix them if they're wrong.


Where does it say any of the experiments were double blind?


It didn't. Hence my question, as that's the gold standard for science. How is this validated? How do they design the experiments when they do science on programming related stuff?

I don't understand the amount of downvotes for a genuine question based on curiosity. I'm just interested in how science is done in this type of research.


Lots of excellent science is done without double-blind experiments. Not all experiments need double blind. All of the Pluto flyby researchers know their data comes from Pluto.

You'll mostly see the phrase 'gold standard for science' in biomedical and social science fields where subjective evaluation plays, or may play, an important role.

While these studies are definitely in social science, they mostly stuck with objective values which cannot easily be affected by an un-blinded experimenter.

Here's what a quick scan of some of the papers shows, which you could have done yourself.

The paper "Static type systems (sometimes) have a positive impact on the usability of undocumented software" uses randomly assigned test subjects and a measurement - time - which does not have a subjective component that would be affected by blinding the experimenter.

The same with "How Do API Documentation and Static Typing Affect API Usability?", which also looked at other objective parameters.

"An Experiment About Static and Dynamic Type Systems" also used time, as well as the ability to pass a set of test cases which were uniformly applied. They considered using code reviews, but the subjectiveness requires a large number of reviewers to hope to get an interpretable number. Had they gone the code review route, yes, I think the reviewers would also need to be blinded.

"Work In Progress: an Empirical Study of Static Typing in Ruby" was a pilot study which had no experimental numbers, and was mostly meant for hypothesis generation and working out kinks in the study protocol.

"Haskell vs. Ada vs. C++ vs. Awk vs. ... An Experiment in Software Prototyping Productivity" used LoC, development time, and subjective reviewers, noting the issues with subjective reviews, but curiously omitting the low statistical confidence. Blinding wouldn't have made a difference in the reviews as all programs but Haskell were written in a different languages. (Two were in Haskell.)

FWIW, double-blind studies are not immune to p-hacking, for that you need a preregistered study. I guess a pre-registered triple-blind study would be platinum standard?


Double blinds are not the gold standard for science. They are not even the gold standard for medicine, although for drug studies in particular they have great utility. They actually have quite limited applicability outside of a particular set of circumstances.

The double blind study is a special construct created to deal with the confounding effect of placebos, which really isn’t a thing outside of medicine.


So blinding is pretty much impossible for something like typing. The programmer can't be ignorant of whether the language is typed; it's a central aspect of using the language.

The reason blinding works so well in medicine is because you don't consciously interact with the medicine's mechanism. Your body does. If you had to understand the shape of the molecules in a pill, say, for it to work, you wouldn't be able to blind.


Yes, you can. A language may not have explicit type declarations, but still have machine-inferable types.

"Work In Progress: an Empirical Study of Static Typing in Ruby" gives an example.

> In this paper, we present an empirical pilot study of four skilled programmers as they develop programs in Ruby, a popular, dynamically typed, object-oriented scripting language. Our study compares programmer behavior under the standard Ruby interpreter versus using Diamondback Ruby (DRuby), which adds static type inference to Ruby. The aim of our study is to understand whether DRuby’s static typing is beneficial to programmers.

I can imagine a similar experiment where both groups use Python, but one setup uses just Python and the other uses Python+mypy, to report type issues.

(The WIP paper points out how DRuby is a lot slower than Ruby, so users aren't blind to the effect. The Python experiment should probably run Python+mypy for both cases, but only report the mypy output for one case.)


Getting better diagnostics from secretly running a type checker as a "linter" is not the same as writing a program in a language you know is typed and in which you design the types first before constructing the program.

I'm sorry, but this seems uncontroversial to me; I don't think this negates the point I was making at all.


The linked-to literature review considers it a relevant paper. Take up your controversial issue with Dan Luu.

> not the same as writing a program in a language you know is typed and in which you design the types first before constructing the program

You know that's not the only way to work with static types, right?

Furthermore, the same DRuby paper hypothesizes "that programmers have ways of reasoning about types that compensate for the lack of static type information". It suggests that programmers are designing with types first, in their head, even in programming languages which don't support types.

So I think you meant to write something more like "express" or "implement" the types first, not "design."


> I don't understand the amount of downvotes for a genuine question based on curiosity.

It doesn't come across this way. It comes across as "this isn't legitimate because they aren't looking at studies that do double-blind experimentation."


Well, that's an uncharitable way of reading into it imo. Anyhow, I explained my question in a follow-up (can no longer edit). Basically I'm curious about how to do proper science on things like these.


It was the same reading I had.

"How are they doing double blinded experiments on typing?" comes across as assuming double blinded experiments are the only acceptable way to a valid answer.

Why can't single-blinded, or even unblinded, give useful answers?

That is, why do you think any answer to your question would make a difference as to the quality of the results?

Epidemiology is an important field of science, even when based on observational and not experimental science. The answer to "How are they doing double blinded experiments on observational studies" is "they aren't."

Yet it's still good science.


Some areas of science use double blind studies a lot, others don’t. I assume you won’t get mad if I don’t use a double blind study when investigating a black hole, right? We use different types of studies for different types of science.

Double blind studies work for areas where the experimental subject can’t tell the difference between the different courses - for example, whether a drug relieves pain. A double blind study for static/dynamic typing is definitionally impossible, since the experimental subject must know whether the language is static or dynamic, and they come into the experiment with biases about that.

Now, you could do something like a double blind study, where the participants are assigned to program in a language without knowing what they’re testing. I hope that those types of studies are done! But that’s a different thing.



