Aliasing (xania.org)
96 points by ibobev 16 days ago | 35 comments


For a real world example of how this can affect code check out this commit I made in mesa: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20...


When you have done enough C++ you don't need to fire up compiler explorer, you just use local variables to avoid aliasing pessimisations.

I also wrote about this a while ago: https://forwardscattering.org/post/51
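
A minimal sketch of the local-variable trick, written in C for brevity (the same idea applies in C++). The function names are made up for illustration. Because `values` and `total` both point to `int`, the compiler has to allow for them overlapping in the first version, and may reload and store `*total` on every iteration.

```c
#include <stddef.h>

/* Accumulates directly through the pointer. `values` and `total` may
   overlap as far as the compiler knows, so *total usually cannot be
   kept in a register across iterations. */
void sum_through_pointer(const int *values, size_t n, int *total) {
    for (size_t i = 0; i < n; i++) {
        *total += values[i];
    }
}

/* Accumulates in a local whose address is never taken, so nothing can
   alias it. *total is read once and written once. */
void sum_via_local(const int *values, size_t n, int *total) {
    int acc = *total;
    for (size_t i = 0; i < n; i++) {
        acc += values[i];
    }
    *total = acc;
}
```

The two versions only agree when `total` does not point into `values`; that is exactly the assumption the local copy bakes in.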


I think this might not be a shortcoming of MSVC but rather a deliberate design decision. It seems unlikely that MSVC is simply failing to apply strict aliasing; more likely it's deliberately avoiding it, probably for compatibility with code that wasn't/isn't written to spec. And frankly it can be very onerous to write code that is 100% correct per the standard when dealing with e.g. memory-mapped files; I'm struggling to recall seeing a single case of this.


AFAIK MSVC has never implemented TBAA by design.


TBAA = type-based alias analysis
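
For anyone unfamiliar, here is a small sketch of what TBAA permits. The function name is made up. Under the strict aliasing rule an `int *` and a `float *` are assumed to refer to different objects, so a compiler doing TBAA may return the constant 1 without re-reading `*i`; a compiler that does not do TBAA would reload `*i` after the store to `*f`.

```c
/* With strict aliasing, the store through `f` is assumed not to modify
   the int that `i` points to, so the final load of *i can be folded
   to the constant 1. */
int store_both(int *i, float *f) {
    *i = 1;
    *f = 2.0f;
    return *i;
}
```

Called with genuinely distinct objects this returns 1 either way; the difference only shows up in the generated code, or in programs that break the rule by pointing both arguments at the same storage.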


Thank you Rust for having aliasing guarantees on references!


Aliasing is no joke and currently the only reason why some arithmetic intensive code-bases still prefer Fortran even nowadays.

While it is possible to remove most aliasing performance issues in a C or C++ codebase, it is a pain to do it properly.


Aliasing can be a problem in Fortran too.

Decades ago I was a Fortran developer and encountered a very odd bug in which the wrong values were being calculated. After a lot of investigation I tracked it down to a subroutine call in which a hard-coded zero was being passed as an argument. It turned out that in the body of that subroutine the value 4 was being assigned to that parameter for some reason. The side effect was that the value of zero became 4 for the rest of the program execution because Fortran aliases all parameters since it passes by descriptor (or at least DEC FORTRAN IV did so on RSX/11). As you can imagine, hilarity ensued.


How does this bug concern aliasing?


In old school FORTRAN (I only recall WATFOR/F77, my uni's computers were quite ancient) subroutine (aka "subprogram") parameters are call-by-reference. If you passed a literal constant it would be treated as a variable in order to be aliased/passed by reference. Due to "constant pooling", modifications to a variable that aliased a constant could then propagate throughout the rest of the program where that constant[sic] was used.
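
A rough simulation of that mechanism in C, purely as an illustration. This is not how any particular Fortran compiler was implemented; `pooled_zero`, `subroutine` and `demo` are hypothetical names standing in for the pooled storage cell, the callee, and the calling program.

```c
/* Hypothetical stand-in for the single storage cell the compiler
   allocates for every use of the literal 0. */
static int pooled_zero = 0;

/* Fortran-style subroutine: the argument arrives by reference and the
   body assigns to it. */
void subroutine(int *param) {
    *param = 4;
}

/* The source program says CALL SUB(0) followed by 0 + 1, but both uses
   of "0" read the same pooled cell. */
int demo(void) {
    subroutine(&pooled_zero);
    return pooled_zero + 1;
}
```

Here `demo()` yields 5 rather than 1, because the "constant" was overwritten through the reference.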

"Passing constants to a subprogram" https://www.ibiblio.org/pub/languages/fortran/ch1-8.html


It's literally in the description? Because of aliasing, a variable that should've been zero became four.


It wasn't a variable.


It wasn't intended to be a variable, but it did become one. Its value varied, it's in the name.


But this is just Fortran's call-by-reference in action. It's not aliasing.


Is it? You just add "restrict" where needed?

https://godbolt.org/z/jva4shbjs


> Is it? You just add "restrict" where needed?

Yes. That is the main solution and it is not a good one.

1- `restrict` needs to be used carefully. Putting it everywhere in a large codebase can lead to pretty tricky bugs if aliasing does occur under the hood.

2- `restrict` is not an official keyword in C++. C++ has always refused to standardize it because it plays terribly with almost any object model.
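
For reference, a minimal C99 sketch of the kind of loop where `restrict` is usually applied. The function name is made up. The qualifier is a promise from the caller that `x` and `y` do not overlap; the compiler does not check it, and breaking the promise is undefined behavior.

```c
#include <stddef.h>

/* y[i] += a * x[i]. With `restrict` the compiler may assume that
   stores to y never change x, which helps it vectorize the loop
   without emitting runtime overlap checks. */
void scale_add(size_t n, float a,
               const float *restrict x, float *restrict y) {
    for (size_t i = 0; i < n; i++) {
        y[i] += a * x[i];
    }
}
```

Calling this with overlapping arrays is the "aliasing does occur under the hood" case: it compiles fine and may silently produce wrong results.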


Regarding "restrict", I don't think one puts it everywhere; using it just for certain numerical loops which otherwise are not vectorized should be sufficient. FORTRAN seems even more dangerous to me. IMHO a better solution would be to have explicit notation for vectorized operations. Hopefully we will get this in C. Otherwise, I am very happy with C for numerics, especially with variably modified types.

For C++, yes, I agree.


Support for arrays without having to mess with pointers is pretty attractive for number crunchers too.


The whole series is excellent and as a non regular user of assembly I learned a ton.


I wonder how much potential optimisation there is if we entirely drop pointer nonsense.


Are you talking about dropping pointers as a programmer-facing programming language concept (in which case you might find Hylo and similar languages interesting), or dropping pointers from everything - programming languages, their implementations, compilers, etc. (in which case I'm not sure that's even possible)?


Only the first one. Ofc under the hood they will stay, but I think it's time to ditch the random access model and pull fetching and the concept of time closer to the programmer.


This is basically what many functional programming languages do. This always came with plausible-sounding claims that it allows so much better optimizations that these languages would soon surpass imperative programs in performance, but this never materialized (it still hasn't; even though Rust fans have now adopted this claim, it still isn't quite true). Also, control over explicit memory layout is still more important.


Gah, can't believe I forgot about functional programming languages here :(

> even though Rust fans now adopted this claim

Did they? Rust's references seem pretty pointer-like to me on the scale of "has pointers" to "pointers have been entirely removed from the language".

(Obviously Rust has actual pointers as well, but since usefully using them requires unsafe I assume they're out of scope here)


What I meant is that Rust has stricter aliasing rules which make some optimizations possible without extra annotations, but this is balanced out by many other issues.


Sure, but I think the presence/absence of aliasing is different from what GP was wondering/asking about, which was the removal of pointers from the programmer-facing model.


For a system programming language the right solution is to properly track aliasing information in the type system as done in Rust.

Aliasing issues are just yet another instance of C/C++ inferiority holding the industry back. C could've learnt from Fortran, but we ended up with the language we have...


For systems programming the correct way is to have explicit annotations so you can tell the compiler things like:

    void foo(void *a, void *b, int n) {
        assume_aligned(a, 16);
        assume_stride(a, 16);
        assume_distinct(a, b);
        ... go and vectorize!
    }
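
The `assume_*` calls above are hypothetical. As a rough approximation with what exists today, something like the following works on GCC and Clang, assuming the `__builtin_assume_aligned` extension is available (it is not standard C and not available on MSVC). `restrict` stands in for `assume_distinct`; there is no direct counterpart for `assume_stride` here. The function name is made up.

```c
#include <stddef.h>

/* Adds src into dst element by element. The caller promises that both
   arrays are 16-byte aligned and do not overlap; neither promise is
   checked. */
void add_arrays(float *restrict dst, const float *restrict src, size_t n) {
    /* GCC/Clang builtin: returns the same pointer, annotated as
       16-byte aligned for the optimizer. */
    float *d = __builtin_assume_aligned(dst, 16);
    const float *s = __builtin_assume_aligned(src, 16);
    for (size_t i = 0; i < n; i++) {
        d[i] += s[i];
    }
}
```

As with `restrict`, passing a misaligned pointer is undefined behavior, so the caller needs something like `_Alignas(16)` or an aligned allocator to keep the promise.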


LOL, nope. Those annotations must be part of the type system (e.g. `&mut T` in Rust) and must be checked by the compiler (the borrow checker). The language can provide escape hatches like `unsafe`, but they should be rarely used. Without it you get a fragile footgunny mess.

Just look at the utter failure of `restrict`. It was so rarely used in C that it took several years of constant nagging from Rust developers to iron out various bugs in compilers caused by it.


Does make me wonder what restrict-related bugs will be (have been?) uncovered in GCC, if any. Or whether the GCC devs saw what LLVM went through and decided to try to address any issues preemptively.


IIRC at least one of the `restrict` bugs found by Rust was reproduced on both LLVM and GCC.


gcc has had restrict for 25 years I think. I would hope most bugs have been squashed by now.


Possibly? LLVM had been around for a while as well but Rust still ended up running into aliasing-related optimizer bugs.

Now that I think about it some more, perhaps gfortran might be a differentiating factor? Not familiar enough with Fortran to guess as to how much it would exercise aliasing-related optimizations, though.


I think Fortran function arguments are assumed not to alias. I'm not sure if it matches C restrict semantics though.


Yeah, that's why I was wondering whether GCC might have shaken out its aliasing bugs. Sibling seems to recall otherwise, though.


