GCC should warn about 2^16 and 2^32 and 2^64 (gcc.gnu.org)
108 points by Kliment on June 17, 2019 | 135 comments


For those who aren't C/C++ programmers, the issue here is people using ^ to mean power where it means XOR in C/C++.
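
For instance, a minimal illustration of the difference (any C compiler will do):

  #include <stdio.h>

  int main(void) {
      printf("%d\n", 2 ^ 16);    /* XOR: 0b00010 ^ 0b10000 = 18, not 2 to the 16th */
      printf("%d\n", 1 << 16);   /* what was probably intended: 65536 */
      return 0;
  }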


Even if it meant power, it would make sense to warn here, because storing 2^16 in an unsigned short yields 0 :-)


Yeah my own first reaction was “Oh, gcc doesn’t warn about overflow assignment? That’s a definite bug.”

I code in C/C++ too, although not exclusively. I facepalmed a few seconds later, but it goes to show the value of having this be a compiler warning.


these are signed, so you're getting -2^16... so many warnings to emit!

/edit: had that mixed up in my head. Interestingly, clang does warn about the overflow:

    main.c:6:17: warning: implicit conversion from 'int'
      to 'int16_t' (aka 'short') changes value from
      65536 to 0 [-Wconstant-conversion]
    int16_t s16 = 65536;


Oops, you meant -2^15 I guess.


No, they meant -2^16, i.e., -(0)


-2^16 is not -0. The unary minus binds more tightly than the binary xor, so this comes out as

  -2^16 = (-2) ^ 16 = 0b111...1110 ^ 0b10000 = 0b111...101110 = -18
if I'm not mistaken.
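
A quick way to check, including a second pair of values where the grouping actually changes the result:

  #include <stdio.h>

  int main(void) {
      printf("%d\n", -2 ^ 16);    /* (-2) ^ 16 = -18 */
      printf("%d\n", -2 ^ 2);     /* (-2) ^ 2  = -4  */
      printf("%d\n", -(2 ^ 2));   /* -(0) = 0: here the parentheses matter */
      return 0;
  }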


No, we mean the power operator, 2 to the power of 16 would be '0' in a short int, since the value overflows...

But since philosophically a signed int has '1's in every bit from the MSB to infinity towards the left, it should thus be -0 not 0.


In two's complement (the usual binary representation of signed ints), -0 is the same as 0.
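
Sketch (assuming two's complement, which every mainstream target uses):

  #include <stdio.h>
  #include <stdint.h>

  int main(void) {
      int16_t zero = 0;
      int16_t neg  = -zero;   /* negation: flip all bits, add one; 0xFFFF + 1 wraps to 0x0000 */
      printf("%d %d\n", neg, zero == neg);   /* prints: 0 1 */
      return 0;
  }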


I am not a C programmer, but my opinion with any language is: don't try to catch mistakes like this in the compiler. Instead, leave it to configurable, competing linters for people who want to catch possible mistakes like this. Anyone caught by this mistake probably hasn't tested their program adequately. Any mistake induced by a wrong constant should be easy to weed out with unit testing.


Not including a linter in the earlier C compilers was one of the huge mistakes of that era. Compilers absolutely should have linters; they are very valuable tools. Think about all the common security mistakes in your average C application that could have been avoided over the years with a linter.

I'm pretty sure Dennis Ritchie at some point himself said that not including a linter was a mistake, but I cannot find a quote of this with a quick search.


He at least said that it was a requirement for the buggy code he saw being written.

"To encourage people to pay more attention to the official language rules, to detect legal but suspicious constructions, and to help find interface mismatches undetectable with simple mechanisms for separate compilation, Steve Johnson adapted his pcc compiler to produce lint [Johnson 79b], which scanned a set of files and remarked on dubious constructions."

-- https://www.bell-labs.com/usr/dmr/www/chist.html


Back in the day most people actually ignored the few warnings compilers would give them; I do not think a linter would have been used much (and AFAIK there were some).


"Back in the day?" Check out your system's log file for a scary collection of horror shows. Or compile most packages these days.

In defense: configuring a package that is compiled by lots of people on lots of machines with lots of versions of various compilers requires a lot of attention to warning flags and such. If you aren't starting out a package from a blank buffer it's very hard to get this right.

But I am frequently shocked by the number of compiler warnings I get from code downloaded from public repos and compiled in precisely the environments it's been documented to have been tested on.


By the time you write a linter, you've done more than half of the work of a compiler, at which point you might as well just ship it as a compiler. It's a bit different for clang, which is modular and provides libraries that tool writers can use, but gcc is intentionally monolithic.


> By the time you write a linter, you've done more than half of the work of a compiler, at which point you might as well just ship it as a compiler.

Does depend on what kind of lint you want to catch but something like this 'just' needs some lexing and parsing to give a usable AST that's reasonably clean so it's not too hard to write lint rules against it. (Not trivial for C & C++ hence 'just')

Building the full compiler is much more work, especially if you want a decent optimizing compiler.


It's not just nontrivial for C++: you need a full template engine just to properly parse it [1]. Same thing for languages with a Lisp-like macro system, or languages that allow dynamic redefinition of symbols [2].

[1] http://blog.reverberate.org/2013/08/parsing-c-is-literally-u...

[2] https://www.perlmonks.org/?node_id=663393


The back end is a lot of work, but so is the front end in a language like C++.

In fact most of the essential complexity is in the front end; you can make do with a simple back end, but the front end is irreducibly complex if you want it to actually compile the language to spec.

And you need the front end to lint, especially if you want to avoid false positives, the bane of linting tools.


And you'd better hope you didn't put in any bugs that GCC does or doesn't have, or you're left helpless as linter and compiler engage in an epic battle over who is right.


Unit tests will not catch all issues: the person who writes the unit test will probably not see a corner case when implementing the code, so they won't add a test for that corner case either.


Or, if the same person is writing the test as the code it is testing: given this is an error of notation rather than of logic, it is likely that the test will be flawed in the same way.

A test passing just means that the code and the test agree, not necessarily that both are correct.


lint was developed in 1979 (https://www.bell-labs.com/usr/dmr/www/chist.html) and we had to wait until clang before many developers started caring about actually using static analyzers.

Every external tool is a lost opportunity for quality enforcement.


Static analyzers existed long before the clang compiler. SAL from Microsoft is used extensively in every codebase I’ve worked with in Windows and it’s amazing, even if you have to get used to macros on all your functions’ parameters.

https://docs.microsoft.com/en-us/visualstudio/code-quality/u...


Surely, and I appreciate its existence.

Just making the point that they have been largely ignored.


That is not a fair assertion. You might have ignored static analyzers until you've discovered clang, but that observation is not generalizable. As others have already pointed out, lint has been used extensively for decades.


I have been using static analysers since the mid-90's.

Ever heard of PC Lint from Gimpel? It was my first one.

Or do you think I would criticize how many use C without adopting best practices, while not adopting them myself?

One learns a lot about code quality, or lack thereof, when doing enterprise consulting.


> One learns a lot about code quality, or lack thereof, when doing enterprise consulting.

Pure truth, there. One has seen neither really awful code nor really great code until one has seen the wide array of code quality produced by an enterprise with a few decades of software development history. Any enterprise will suffice; it doesn't have to be a huge corporate entity.

There are very few developers who are really good and who really care, very few who really do not care and who are really not good, and most are somewhere in between.


As an addendum to my reply, here are the answers to the JetBrains questionnaire about the C++ ecosystem among their customers.

Check "Which of the following tools do you or your team use for guideline enforcement or other code quality/analysis?" results.

https://www.jetbrains.com/lp/devecosystem-2019/cpp/


GNU's C linter is GCC warnings. For C, making a linter isn't a big deal, but for C++ writing a robust C++ parser is so hard that the linter will need to use the compiler's parser anyways.


By that logic you could get rid of a lot of warnings. TBH, I think getting good warnings is a major feature of modern compilers.


The kind of developers who are likely to make this mistake are probably not the kind of developers who will use advanced tools including linters or likely doing unit testing.

Some may even have released code that still has these time bombs waiting to occur.


Well, that isn't the only bug in the code, just the one for which the warning is being requested. It's been about a decade since I've written significant C/C++ code, but something far more dangerous stood out to me:

Upon first reading, I'd assume they are seeing the maximum unsigned values, but their variables aren't declared as unsigned -- and C/C++ default to signed values. Secondly, representing the nth power of two actually takes n+1 binary digits, so e.g. 2^8 is a one followed by 8 zeros -- overflowing any 8-bit variable, signed or unsigned! This is undefined behavior as-declared (signed values), but would be entered as a zero for explicitly declared unsigned variables.

Personally, if I were to be implementing this, I'd use bitwise operations to generate the correct values, and explicitly acknowledge in a comment something like "the following code knowingly breaks the abstraction of numeric values. It is likely not portable to other hardware, nor should you assume easy extension to other variable types. Das Quellcode ist nicht fur der gefingerpoken!"
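
Something along these lines (a sketch; the names are made up, and it assumes the usual fixed-width types):

  #include <stdint.h>

  /* Build the maxima with shifts on a wider unsigned type, then narrow
     explicitly -- no 2^n spelling and no signed overflow anywhere. */
  static const uint16_t max_u16 = (uint16_t)((1UL  << 16) - 1);   /* 65535 */
  static const uint32_t max_u32 = (uint32_t)((1ULL << 32) - 1);   /* 4294967295 */
  static const uint64_t max_u64 = (uint64_t)~(uint64_t)0;         /* all 64 bits set */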


And Java, Python, Javascript, Ruby. The list goes on. It's more of a general problem for non-programmers than specifically people who are new to C (unlike the j++ in python someone mentioned).


I am a C++ programmer and yet I also failed to catch on that ^ in the ticket was xor not pow until I read the comments. All it took was being a bit sleepy.

A rather unfortunate operator choice on the part of C: it is open to confusion, and it precluded later adding ^ as an exponentiation operator to the language.


This seems like a good idea to me, since there's evidence that programmers are making this mistake, and the proposal is to add a warning only for the expressions that are most likely to be mistakes, that is 2^{decimal integer literal} and maybe also 10^{decimal integer literal}. There are many constructs that are well-defined in C, but where it is helpful to have warnings: use of = in condition expressions, implicit fallthrough in switch statements, misleading indentation, unsafe multiple statement macros, and so on. Programming in C is hard enough that every bit of compiler support is valuable.


The power vs xor mistake is no doubt a common one, but I take issue with the 2^8, 2^16, 2^32 examples that got singled out because they are also common bitmasks. Should FLAG_BIT1 ^ FLAG_BIT3 really be a warning?

XOR may be rarely used, but 2^8 is not only one of the most common accidental XORs, it's also probably one of the more common legitimate uses of it.


Flag bits should be combined with | not ^


You do all your binary arithmetic with "or", not "and" or "xor"? XOR is certainly less common than the other two, but it has its place.
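
For what it's worth, the usual division of labor looks like this (a sketch with invented flag names):

  enum { FLAG_A = 1 << 0, FLAG_B = 1 << 2 };   /* hypothetical flags */

  void demo(void) {
      unsigned flags = 0;
      flags |= FLAG_A | FLAG_B;   /* set: | is idempotent, safe to repeat */
      flags ^= FLAG_B;            /* toggle: ^ flips the bit; repeat it and the flag is unset again */
      flags &= ~FLAG_A;           /* clear */
      if (flags & FLAG_B) { }     /* test */
  }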


> Should FLAG_BIT1 ^ FLAG_BIT3 really be a warning?

Nope, the warning would be for literal numbers in the source. IMO anyone who doesn't define constants for the flag bits deserves the warning.


So 2^8 is a warning, but 2^BITS_IN_BYTE is not? I don't think whether or not the preprocessor helped in making the expression is a good heuristic for whether or not it is a mistake.


A warning heuristic needs to have a low false positive rate; a low false negative rate is nice but is not necessary. The purpose of a warning is to detect some common errors without inconveniencing too many correct programs. If some other errors go undetected then that is a shame but at least it is no worse than the current situation.


How do you do that?

The preprocessor does macro expansion, the compiler compiles the result.

The compiler does not see FLAG_BIT1 ^ FLAG_BIT3, it only sees the result, e.g. 2^32.

Therefore to catch only explicit 2^32 the warning should come from the preprocessor... Nice mess created right there.


Modern compilers do have insight into the source before it was preprocessed. You'll notice that clang and (modern) gcc will show you where the error is in the unexpanded source line, and then show the expansion of the macros. So when it detects "2^32", it can look back to see if it was a product of macro expansion or if the literal was directly written, and warn accordingly.

Interestingly, msvc can also do this, by virtue of not even having a distinct preprocessor phase at all.


I recently found several instances of hand-crafted for loops in our Python code with:

  j = 0
  while condition():
    ++j
In Python, there is no "increment" operator, so `++j` is parsed as two unary plus operators: it's a no-op.


I recently found the following bug in some python code at work

  x = 12 //10
What they were apparently trying to do was change the value of x while commenting out the old value for reference. Except in Python // is integer division, not a comment, so x was set to 1 instead of 12.


This is silly. One of my personal stupid mistakes in Python is forgetting to delete a trailing comma. The resulting code is still valid, because a trailing comma produces a tuple.

    a = b,


Or accidentally forgetting a comma like in

  a = ["long", "list" "of", "strings]
which become

  a = ["long", "listof", "strings"]
due to automatic string concatenation


the bugbear plugin for flake8 can warn about this: https://pypi.org/project/flake8-bugbear/


This is an example of why you shouldn't use blanket warning options like -Wall in combination with -Werror. Otherwise your program might fail to compile just because the dev randomly thought of some new things to warn about.


We use -Wall and -Wextra, and fixed basically all our warnings. Now everyone writes virtually warning-free code (that is, any warning is fixed before committing). And I don't feel limited in what I can do.

We could roll -Werror for most packages (I think we have a few packages, e.g. 3rd party libs, which emit warnings), but only use error=format and error=format-extra-args.

To be fair, we disable these warnings (-Wno-...): unknown-pragmas, long-long, register, unused-command-line-argument.


I just keep my codebase warning free and check it in CI. No need for -Werror on package level, users don't need to deal with random breakage from a false-positive warning.


If you don't control the build env (e.g. FOSS work), then of course relying on -Werror might yield unnecessary trouble.


You never completely control your build env. You are always relying on 3rd parties to some extent.

That may not look relevant over short periods of time, but there are large odds that somebody will be bitten by your choice of using -Werror 5 or 10 years down the road.


I wouldn't endorse that claim as a general truth. It's true under some assumptions, but then, well, just disable -Werror in 5 to 10 years? It's not like THAT is a breaking change.

OTOH 5 to 10 years is a good way to accumulate plenty of warnings (assuming a lack of development discipline), which at some point makes it easy to miss that one critical warning that should really be fixed. Maybe I should mention the worst case for an error in the code I write is "people die" (static analysis of safety critical stuff), so that's a good motivation to have no warnings ;-)


-Wall, despite its name, isn't actually all warnings. Compiler devs only rarely put new warnings into -Wall.


They do it often enough that OS package maintainers deal with build failures caused by Werror quite a lot.

And forget new versions of compilers, how many things are only tested with GCC and fail to build with Werror with clang because of different warnings in Wall?


Is there info on how many of those new compiler errors were true bugs vs spurious?


It doesn't matter to the end user! If you have a bug in your code that can be detected by a new compiler warning, that's cool, but it shouldn't prevent your users from building their own version of your project.


I like the intermediate position: enable -Werror for development builds (which for QEMU means "builds from head-of-git"), leave things as just warnings for (source) releases, and give the user an escape hatch (eg a configure --disable-werror option) in case they need it. Enforcing warning-free compiles for the average developer is really useful for maintaining a minimum quality bar, so it's good to find a way to keep that without annoying people who might be compiling from source but are essentially 'end-users' of it.


You don't need -Werror for enforcing warning-free compilation.


You should always thoroughly check whether your program still works after upgrading the compiler. Adding a -Wno-bad-exponentiation to your makefiles shouldn't be that difficult.


If you are the only one using your program, then that's probably fine. I cannot count how many times I have to patch other people's makefile so their code won't fail to compile because of some -Werror introduced in a newer version of the compiler.


I wonder if a script could be made that: 1. adds pragmas around each error to suppress it; 2. creates a commit; 3. sends it to an upstream branch for review & triage; 4. bonus: if hosted on GitHub, files an issue.
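
The suppression step at least is mechanical; GCC and clang already support per-site pragmas, roughly like this (the warning name is invented, since the proposed warning doesn't exist yet):

  #pragma GCC diagnostic push
  #pragma GCC diagnostic ignored "-Wxor-used-as-pow"   /* hypothetical flag name */
  int mask = 2 ^ 32;   /* deliberate xor of decimal literals */
  #pragma GCC diagnostic pop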


My Perl script to scan C/C++ source code for common programming typos has been able to check for this particular case since 1999.

see https://www.cpan.org/authors/id/T/TY/TYPO/typo-2.46.pl

  // START: Mon Jun 17 14:49:53 2019
  C:\src\test\test.c (1): using ^ on a number 37: short#maxshort=2^16;
  C:\src\test\test.c (2): using ^ on a number 37: int#maxint=2^32;
  C:\src\test\test.c (3): using ^ on a number 37: long#maxlong=2^64;
Link to the 1999 Perl Conference paper for the script:

https://web.archive.org/web/20090729104734/http://www.geocit...


It would be nice, but I can see it's not easy to come up with a warning heuristic that doesn't give a bunch of false positives. Sure, 2^16 is "obviously" wrong, but how about SOMETHING ^ XOR_KEY with #define SOMETHING and #define XOR_KEY 16?


They discuss that in the thread. Consensus seems to be to warn when seeing “2 ^ (certain base-10 literals)”.

It’s uncommon to use base 10 when bit-twiddling with the XOR operator - so this makes sense.

Of course, it would be better if C-family languages didn’t use ^ for XOR in the first place.


>Of course, it would be better if C-family languages didn’t use ^ for XOR in the first place.

I think at the time they were created there simply wasn't an ambiguity in anybody's mind. POW is a relatively complex operation, I don't think anybody expected it as a builtin operator (much like SQRT and friends). Arguably so is DIV but I suppose that's too convenient not to have.

XOR on the other hand is something you really want to be a builtin operator in a low-level, bit-twiddling language like C, so getting a dedicated operator makes sense. I'm not sure if another sigil would've made more sense; after all, ASCII lacks ⊕.


Algol 68 and Fortran both had an exponentiation operator (**). In fact, COBOL 60 even supported exponentiation, so not sure why anyone would not expect it as a builtin for C.


> I think at the time they were created there simply wasn't an ambiguity in anybody's mind.

And why would anyone in their right mind do 2**3 and not 1<<3? At least among the crowd who knows pow is expensive and what xor is.


Do they not currently/do they plan to track macro expansion well enough to distinguish between macros and literals? I feel like they should be able to get the best of both worlds here.


If we're using our time machines to fix C, can we make a special type that's identical to unsigned int but can use all the bit-wise operators, requiring explicit conversion operators between regular int and bitwise int?

bitwise x = 0x1234; x |= 0x1000;

(0x indicates a bitwise value.)


I wouldn't use 0x, as that is commonly used to refer to hexadecimal values. Unless that was your intent and those are hexadecimal?


That was indeed my intention. Remember we're hypothetically reinventing C from scratch so there's no legacy code to worry about.

By making 0x literals of bitwise type, they can only be used with bitwise operators. Decimal literals are regular integers so can only use arithmetic operators.

If you want to use a hexadecimal value with arithmetic operators or decimal values with bitwise operators, use the operator that converts one type to another.
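
Short of the time machine, you can approximate that discipline in today's C with a wrapper struct, at the cost of some ceremony (a sketch; all names invented):

  #include <stdint.h>

  typedef struct { uint32_t bits; } bitwise32;   /* opaque bag of bits */

  static inline bitwise32 bw(uint32_t v)                   { return (bitwise32){ v }; }
  static inline uint32_t  bw_to_int(bitwise32 b)           { return b.bits; }
  static inline bitwise32 bw_or(bitwise32 a, bitwise32 b)  { return (bitwise32){ a.bits | b.bits }; }
  static inline bitwise32 bw_xor(bitwise32 a, bitwise32 b) { return (bitwise32){ a.bits ^ b.bits }; }

  /* usage: bitwise32 x = bw(0x1234); x = bw_or(x, bw(0x1000));
     doing arithmetic on x now requires an explicit bw_to_int() first */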


With clang you can write linters that get the AST of the code as input.

https://clang.llvm.org/extra/clang-tidy/


From the HN submission title it wasn't obvious that the issue is people misunderstanding that the ^ operator isn't the exponential operator but the XOR operator. Clicking through made it clear.

Warning about constant subexpressions where the LHS and RHS of ^ are constant literals makes a lot of sense to me, but if the LHS is a constant expression but not a constant literal, then not so much. Also, warning about this will cause some false positives where macro expansions are involved -- probably not a good thing.
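
Concretely, my reading of the proposed heuristic (not committed behavior, just how the thread describes it):

  void example(int x) {
      int a = 2 ^ 16;       /* decimal literals on both sides: would warn     */
      int b = 0x2 ^ 0x10;   /* hex spelling signals bit-twiddling: no warning */
      int c = x ^ 16;       /* non-literal operand: no warning                */
      (void)a; (void)b; (void)c;
  }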


Dominik Honnef's staticcheck tool for the Go programming language will probably warn about these as well[1].

[1]: https://github.com/dominikh/go-tools/issues/516


What if I try to make a pipe with '|' or redirect the result with '>'? Should there be a warning for that?

I find it amusing that people really do that though.


Some people switch programming languages often. Exponentiation operators are common, both caret and double star are used. It does not seem amusing (surprising?) to me.

On the other hand, piping and redirection are relatively uncommon outside of the shell. There is certainly a clearer mental distinction for me between "I am in shell" and "I am not in shell" than "I should double star".


C doesn't have double star either (in fact double star means double dereferencing).

YMMV of course, but the distinction between a low-level language like C and a higher-level one that can support exponentiation, matrix multiplication, thread spawning, etc. as first-class language constructs is as clear to me as the difference between shell and non-shell.


Someone correct me if I'm wrong, but I believe double star as an infix operator would be interpreted as multiplication of a dereference:

  X ** Y
would be parsed as

  X * (*Y)


...but it's unlikely to be accidentally used as an exponent, since you can't dereference a number.


At least in that case the compiler should produce a warning about trying to dereference the non-pointer.
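
In practice it's rejected outright rather than warned about; gcc, for example, says something like:

  int pow_typo(int x, int y) {
      return x ** y;   /* parsed as x * (*y); fails to compile:
                          "invalid type argument of unary '*' (have 'int')" */
  }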


C has an “exponentiation operator”, used in literals where you’d expect exponentiation in other languages: <<. (Sure, it’s limited to raising things to powers of 2, but when would you ever need to raise something to a non-power-of-2? :P)
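
i.e., with tongue firmly in cheek:

  int sixty_four_k = 1 << 16;   /* "2 raised to the 16th" = 65536 */
  int five_twelve  = 1 << 9;    /* 2 raised to the 9th    = 512   */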


The warning amounts to “only someone who knows what they’re doing would write XXX, and we believe you don’t know what you’re doing.”


GCC should understand the expression grammars of no fewer than a dozen other languages and warn you when something could syntactically be an expression from another language, but with a different meaning.

GCC should also change your diaper, burp you, and issue random diagnostics like

  foo.c:32: warning: did you test this yet?


GCC has gotten too bloated. It could benefit from compile time CONFIG_ options to remove all the fluff, such as superfluous diagnostics.

If for whatever reason I have to include GCC in an embedded project, this sort of "misleading indentation" and "suggest parentheses" nonsense just wastes precious flash space.


I don't think it would be simple or a good idea to warn, because these are perfectly valid expressions, and it would lead to a flood of warnings if compilers started to display "Did you really mean that?" for an ever-growing list of expressions (which is what it would become).

Better to leave this to code analysis tools.

Edit:

2^32 uses the ^ operator exactly as intended, and has perfectly legitimate uses, especially when dealing with registers of HW peripherals.

It's not the same as warning on "if (a=b)" as someone replied.

From a language perspective, 2^32 is exactly the same as 2+32.


It's a warning, not an error. Moreover, it is very hard to come up with a legitimate use case for 2^32 as an expression. People who use it to combine bit flags will use |, and for bit flags, octal or hex notation is probably better.


I don't know if it is a good practice, but I often have statements and expressions that look awfully similar. Flipping bits does happen often enough.

Mostly written as y ^ (1 << x), which could easily resolve to these expressions. Mostly at runtime, sure, but there are exceptions, especially if you like descriptive constants. I would expect it to be quite difficult to separate out the "correct cases" without people starting to just suppress compiler warnings or trick the compiler by writing the same stuff in different words (which might be better).

On the other hand, setting max_short to something like 18 is probably a really good prank in larger programming environments. The compiler would just ruin all the fun here.


> Mostly written as y ^ (1 << x), which could easily resolve to these expressions.

We’re talking about a parse-time check; things that resolve to those values after identifier binding won’t throw the warning. It’s not a warning about what you wrote (semantically), it’s a warning about how you wrote it (syntactically).


The point is very specifically cases where X is a decimal literal (not an expression such as 1 << x, or a hex number).

Things that resolve to this don't trigger the warning, and it is a compile time warning.

And then specifically looking at 2^X and 10^X occurring in source code.


I would agree with the suggestion then. It can happen that you just write it out like that, and the compiler should maybe inform you. But why limit it to 2^ and 10^ and not say literal^literal should always produce a warning? Any common or valid usages that I am missing here?


Certainly; I'd say if either side is a hex or octal literal, it might be someone intentionally bit-twiddling.

As for limiting the base, that is a more fluid matter. I guess you want to avoid warnings for very large numbers, because those are more likely legitimate.


> But why limit it to 2^ and 10^ and not say literal^literal should always produce a warning?

You're suggesting that using the C language as specified and intended should produce a warning... That makes no sense.


Yes, or 1 << 31, for example.


That has plenty of usecases.


> It's a warning not an error.

Many build environments are set to treat all warnings as errors.

The point is that 2^32 is a perfectly compliant C expression that is neither misleading nor ambiguous, and that also won't create any variable overflow. It uses ^ exactly as intended. Why should the compiler complain? Why should I get a warning/error when following the spec to the letter?


Like it or not, there are already tons of lint-like warnings in GCC [1] and clang [2]. Many (uh, most?) of them are not required by the C/C++ standards (which by the way have their own requirements for diagnostic messages).

[1] https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html

[2] https://clang.llvm.org/docs/DiagnosticsReference.html


Plenty of valid expressions generate warnings. A frequent gotcha is `=` in boolean contexts. Warnings are justified, and can normally be avoided with a little bit more lexical work (e.g. an extra set of parentheses). The upside is large and the downside is minuscule.
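
The classic example (gcc with -Wall; the extra parentheses tell the compiler the assignment is intentional):

  void check(int a, int b) {
      if (a = b) { }     /* warning: suggest parentheses around assignment used as truth value */
      if ((a = b)) { }   /* same semantics; the extra parens suppress the warning */
  }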


Isn't that pretty much the definition of a warning? An expression that is technically valid, but probably a mistake. If something wasn't a valid expression, it would be a compiler error, not a warning.


Usually, yes. C is weird because undefined behavior is illegal, can sometimes be detected statically, and yet will not produce an error. So sometimes you get warnings for invalid yet error-free code. It’s a strange language.


Environments where the decision has been made that error-prone, misleading, and ambiguous statements should not be allowed are typically those where warnings are treated as errors.

A new type of misleading and error-prone statement has been found, so it seems entirely reasonable that this is added to that list.

This seems to be the entire intent of compiler warnings (these days anyway)


Heuristics like this are important because humans are not machines. If you run into a warning in one of these situations, they are usually rectified by changing the statement slightly. Worst case scenario you can #pragma push and pop the warning.


The compiler should complain because it's overwhelmingly likely to be a mistake.


You can turn specific warnings off. You can also usually disable them inline. There are very common ways to deal with false positive warnings.


> It's not the same as warning on "if (a=b)" as someone replied.

"if (a=b)" uses the "=" operator exactly as intended with perfectly legitimate uses. It's warned about because it's an error-prone construct, not because it's incorrect.

Seems to me 2^32 is also error-prone, if you wanted to combine flags you'd use 2|32, using xor here is weird.


'If' expects a condition, so finding an assignment may be a red flag, even if an assignment has a return value in the spec, which makes the construct valid.

On the other hand, ^ expects integers, so 2^32 is exactly what is expected.

Most of the replies I saw here try to second-guess or claim that ^ should be used in a specific way. Not so; ^ is just doing XOR of two integers.

Apparently, I am having an incorrect opinion, though, so I will self-censor and remain silent.


You don't have to enable warnings, but most C programmers do.

The problem is: what proportion of 2^32 expressions in C are correct? I'm willing to bet it is as close to 0 as doesn't matter. The gcc devs aren't stupid; before enabling a warning like this, they will run a test compiling a substantial chunk of Debian and see if there are any false positives.


If expects an integer, and = produces an integer when the lhs has integer type.


> I don't think it would not be simple or a good idea to warn because these are perfectly valid expressions and it would lead to a flood of warnings if compilers started to display "Did you really mean that?" for an ever growing list of expressions (which is what it would become).

There are many technically valid expressions and statements that will still generate warnings in most compilers.

if(a=b) {} comes to mind first and foremost.


I'm intrigued at what your opinion is about the following Rust errors:

    error[E0308]: mismatched types
     --> src/main.rs:4:8
      |
    4 |     if x = y { println!("eq") }
      |        ^^^^^
      |        |
      |        expected bool, found ()
      |        help: try comparing for equality: `x == y`
      |
      = note: expected type `bool`
                 found type `()`

    error: `<` is interpreted as a start of generic arguments for `u32`, not a comparison
     --> src/main.rs:4:17
      |
    4 |     if i as u32 < 2 {
      |        -------- ^ --- interpreted as generic arguments
      |        |        |
      |        |        not interpreted as comparison
      |        help: try comparing the cast value: `(i as u32)`


For the same reason that this:

    if (c)
        printf("Hello");
        exit(0);
    exit(1);
Produces a warning in gcc:

    test.c:7:5: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
         if (c)
         ^~
    test.c:9:9: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
             exit(0);
             ^~~~
"2^32" is of course just an xor, but it makes no sense (the "exclusive" part of it is not being exercised). If you really mean 2 bits being set in a constant then "2 | 32" is much clearer. If you're dead set on xor, "32^2" won't trigger the warning.


It's not for you or the compiler to decide that 2^32 makes no sense. It is using ^ exactly as intended and is neither misleading nor ambiguous.

Stating that 32^2 should not trigger a warning while 2^32 should shows that this proposal has not been thought through.

It's not desirable or sensible to raise a warning on the premise that the expression might mean something else in another language, which is what this would do.


I feel perfectly qualified to say that 2^32 makes no sense.

Just because something is specced doesn't mean that it's reasonable. My "if statement" example is perfectly valid C, but it's not reasonable. 2^32 is perfectly valid, but also unreasonable. Frankly, any bitwise operations with >10 decimal constants are either intentionally obfuscating, or done by someone who doesn't know what they are doing. Consider "2^32" vs "0x02^0x20". The latter is much better.

It is entirely sensible to warn on the premise that it might mean something in another language: the construct seems to be an actual point of confusion (the authors cited numerous real code examples where people are accidentally doing this in the wild), and the construct makes no sense at face value. I think you'd be hard pressed to find a real example of 2^32 in code where it isn't a bug. That's enough to make it a reasonable warning.


Terrible idea. If a programmer is so unfamiliar with C/C++ as to believe ^ is pow(), she/he will create a lot of other problems regardless. Not to mention that the kind of bugs created by such incorrect usage would be simple to spot in most cases.


Wait... so your point is that since "bad programmers" do this, there shouldn't be a warning for it? That is among the silliest things I have ever heard. Especially since there's a trivial way to get rid of the warning: just write "18" instead of "2^16", if you REALLY intended the xor.

Also: it's incredibly naive to think that "bad programmers" are the only ones that would make a mistake like this. I'm very familiar with all C/C++ operators and know perfectly well that the caret represents xor, but I could easily imagine myself slipping and writing 2^16 instead of 1 << 16, because that's how the rest of the world writes exponentials.

Let he/she who has never written a typo and had the compiler save them cast the first stone.


> Wait... so your point is that since "bad programmers" do this, there shouldn't be a warning for it? That is among the silliest thing I ever heard.

In all honesty, while I don't agree with this particular instance, the reasoning isn't ridiculous. If we assume it's true that it'd only affect bad programmers, then you probably wouldn't want to add it, since it'd increase the false-positive risk and/or otherwise potentially get in the way of everyone else. (Kind of like how you wouldn't make cars start honking when they're turned on as a means of protecting against bad drivers, even though that might well save lives.) Some tools just need to assume some base level of expertise to be effective.


You snuck in a second premise to the argument: that it will increase the false positive risk. And if we accept that premise, we don’t need the other one.

Bad programmers are the ones who need the most help from our tools. Saying there’s no need for a warning because only bad programmers will benefit is like saying there’s no need for crossing guards because only stupid children get killed crossing the street.


I think you just helped me understand where the objection comes from. The people against this warning seem to feel infantilized by these type of warnings, much in the way you're comparing it to crossing guards for children. My view, and I guess the view of people arguing for these kind of warnings see them like seatbelts: a mild annoyance that is very much worth it.


Sounds about right. I don’t really get it myself. As a professional C and C++ programmer, every mistake is punished severely by the computer and I want all the help I can get in avoiding them. Warnings aren’t annoying, what’s annoying are one-in-a-thousand crashes that only happen on someone else’s computer.


I would say this seems more like airbags: a safety feature which requires no work on the user's part, 99.999% of the time, and which you might not even know exists up until the point when it jumps into action to save you.


If you want to make the argument "this is a bad warning because it will cause many false positives", then that's a perfectly valid argument to make. It doesn't really apply in this case, but it's a valid argument to make against a warning being added.

That's not the reasoning given. The reasoning given is that since only bad programmers would make this mistake, it shouldn't be a warning at all, since those kinds of programmers "will create a lot of other problems regardless". That is nonsensical. You could make the same argument against ANY warning. The whole point of a warning is to alert the developers that, while their code technically follows the rules of the language, they've probably made a mistake.


Let's say that warnings are very important, but they cause noise, so such noise should be reserved for instances that are really common mistakes among the population of programmers that have some clue. There are many errors like the one above that can be made by clueless programmers for which you can't emit a warning at all, like misusing postfix/prefix increment operators or the like.


The vast majority of people writing C that I've seen in the past 20 years are occasional C programmers. We write in higher-level languages, and occasionally drop down into C to troubleshoot a library we're using. I haven't been a full-time C programmer since late last century. I see C code these days and think "huh, so that's valid in C now".

I get that you are a full-time C programmer, but unless you're volunteering to take on all my C troubleshooting work, this seems almost punitive. Like building a balcony with no railing, because nobody would just walk off a ledge -- and yet, sadly, professionals still die from falls.

To say that all such people are "clueless", regardless of skill or training or experience, is to fall back on the old test pilot mentality: the good survive, therefore, if you didn't survive, you (retroactively) must not have had The Right Stuff after all.

Joel Spolsky explained the reasoning more eloquently than I could:

"Now, even without going through with this experiment, I can state with some confidence that some of the users will simply fail to complete the task, or will take an extraordinary amount of time doing it. I don’t mean to say that these users are stupid. Quite the contrary, they are probably highly intelligent, or maybe they are accomplished athletes, but vis-à-vis your program, they are just not applying all of their motor skills and brain cells to the usage of your program. You’re only getting about 30% of their attention, so you have to make do with a user who, from inside the computer, does not appear to be playing with a full deck."

I trust myself and the people I work with, but I still wear a hardhat when they're working overhead, and I tie all my tools to my belt when I'm working at height. It's the clueless people who don't learn from the mistakes of the past.


Let me clarify - you're saying that just because we can't catch all programming mistakes means we shouldn't bother catching any, even though this is a pretty solid example of one that would be worthwhile and easy to do?


Nope. I think that catching trivial errors like that is mostly unhelpful; what bothers me is that it could emit a warning on perfectly OK code, for instance as a result of post-processing that maps symbols to literal values. Given that 3^5 will never be the result of a genuine programming mistake, but rather means not knowing the language, I think it is worse to create issues for people who know what they are doing (by emitting a useless warning) than to try to advise people who don't know the basics of the language. Anyway, such a mistake will be realized very soon by the inexperienced programmer.


At first it was my knee-jerk reaction as well, but it's a bit dumb, frankly. I write some Python semi-regularly, and even though I'm familiar with its syntax I commonly end up putting ; at the end of statements (which generally doesn't do anything, for better or worse) and using && and || instead of 'and' and 'or', mainly out of muscle memory.

I can imagine a casual C coder could easily end up using ^ as "pow" for very much the same reason if they're used to it working that way in other languages. On top of that the proposed warning has extremely low chances of triggering on a false positive on actually legit code. I think it's a good proposal.

>Not to mention that the kind of bugs created by such incorrect usage would be simple to spot in most cases.

That's not obvious to me at all. Bogus bitfield values could be rather tricky to track down, especially since they might work correctly "by chance" in simple cases (for instance even if the value of the constant is completely wrong, as long as it's != 0 you can set it with |= and test it with & and it'll appear to work mostly correctly).


I think you're being unrealistic about development in a team with mixed experience levels. These things slip through. We can stop them automatically, without needing to rely on a manually written test, or on being spotted in a code review. It's all upside from where I'm standing.


Experienced people make mistakes, and that's okay. I've made mistakes like this, or worse. I want my tools to help me write good code, not assume that I'm an infallible machine.

We need to kill the myth that good programmers don't need help from their tools.


Incremental learning and guidance is important. If the compiler can point out something that's technically correct but in practice wrong (because 2^16 is the weirdest way of writing that value), it will help some people.


My take is that those who are pedantic enough to use -Werror flags would have a lot of fun putting up pragmas to remove the false positives.


I agree with you, but this thread is full of people thinking it's a good idea because they or their colleagues are prone to that kind of silliness. For people who write C or C++ every day it's stupid compiler second-guessing, and it can also create issues for generated code, but this thread shows that our tools may need adjustment to better deal with today's level of incompetence.


Why would you write 2 xor 16 instead of just 18? And if it's generated code, wrap it in a pragma to disable the warning.


[flagged]


> Why would you beat up your wife instead of expressing your anger using words?

Not cool. Please don't post like this to HN again.

https://news.ycombinator.com/newsguidelines.html


https://en.wikipedia.org/wiki/Complex_question

Also, the excuse might be the same in the two questions: brute force may be less demanding than particular calculation.



