
> Volatile only affects how the compiler handles memory access

My comment specifically mentions memory access being optimized away. You did read it, right ;-) ? It even has an example of how debug mode made it work. There was nothing in there about telling the processor anything. If anything, it's about wanting to tell the compiler something...

> You need to use the fence and lock instructions to create memory barriers especially on multi core processors.

Yep, definitely. But will that still prevent the compiler from optimizing away accesses to a shared memory block? I remember looking for memory barriers but didn't know whether my ancient system had them exposed in the pthreads library.



Yes, I read your comment. :) What I was saying is that when you're using shared memory to talk to another thread or process (which by definition runs on another thread), you need to use memory barriers, or the processor may do the same reorderings (and more) that your compiler did. On top of that, other cores may not see your reads/writes.

Debug mode tends to always work because it generates code as if everything were effectively volatile. After that it depends on how the threads/processes are scheduled across cores. Without proper synchronization it may work, may work slowly, may fail once in a blue moon, or may fail all the time. The current processor may run it fine and then it may stop working on future processors.

Fortunately, if you use things like mutexes then the barriers are handled for you.

It depends on the compiler whether the barrier intrinsics still need volatile. I'll have to do some research on that.


Yep, any useful compiler support for fence/lock intrinsics will prevent optimizing out those intrinsics.


Are memory barriers transactional, i.e. something like begin_barrier(), do some shm writes, end_barrier()?

I was declaring data structures in shared mmap-ed memory as volatile so the compiler does not optimise them away. How would the compiler know not to optimise away operations between memory fences? A fence, to me, says something about the guarantee of the order in which operations are seen. The order wasn't the problem; a write never making it to the other side was the issue...


There are many types of barriers depending on the processor architecture. They're usually used in pairs, so for example a read barrier would be used when a mutex is acquired and a write barrier when it's released.

If the compiler is smart about its memory barrier intrinsics, it will make sure to reread everything after the read barrier and write everything out before the write barrier. This speeds things up by not having to use volatile: between the barriers it can cache reads in registers and optimize out multiple writes like normal.

If the compiler can't do that, then you need to use volatile. Volatile was meant for I/O, where every read has to hit the bus. So if you're forced to use it, copy the volatile variable into a non-volatile one for use inside your mutex-protected section to get some speed back.


You do need volatile (or equivalent) to instruct the compiler not to optimize away a read/write. Locked read/write intrinsics will already be annotated such that they don't get optimized away.




