Thanks for doing more research. Did you get the sense that there are two separate issues at hand? 1) The compiler optimizes the memory access away because, from its point of view, the value is not written anywhere in that process -- say, it uses a register instead of a memory read. This is the effect I was seeing. 2) The compiler emits the memory read instruction, but depending on the processor, at runtime some of the writes will not show up at all, or will be reordered. In this case running the program on a different CPU might make it appear to work correctly. From what I read back then, I think there is a slight difference between the two. Can the first issue be fixed with a memory barrier?
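
A minimal sketch of issue 1, assuming a plain C flag polled in a loop (the variable and function names are just for illustration):

    #include <stdbool.h>

    bool ready = false;   /* set to true by another thread or an ISR */

    void wait_for_ready(void)
    {
        /* With optimization enabled, the compiler may load `ready` into a
           register once and spin on that register forever, because nothing
           in this translation unit appears to modify it inside the loop. */
        while (!ready)
            ;   /* busy-wait */
    }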


Yes, you are correct that there are two issues. But solving 2 may solve 1 automatically, or may just require compiler barrier intrinsics in the same spots as the processor barrier intrinsics.

And yes, the first issue can be fixed with memory barriers. If you put a compiler+processor read barrier in front of the first line that reads the shared variable, the compiler will re-read it from memory into a register. After that it will keep using the register, so as not to kill performance, until it hits the barrier again.
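
A rough sketch of that pattern, assuming GCC/Clang builtins (__sync_synchronize is a full compiler+processor barrier; a C11 atomic_thread_fence would be the portable equivalent; the names are just for illustration):

    #include <stdbool.h>

    bool ready = false;   /* shared with another thread */

    void wait_for_ready(void)
    {
        for (;;) {
            /* Full barrier: acts as a compiler barrier (forces a fresh
               load of `ready` from memory rather than reusing a cached
               register) and as a processor barrier (orders the read
               against earlier accesses). */
            __sync_synchronize();
            if (ready)
                break;
        }
    }

A compiler-only barrier, by comparison, is just asm volatile("" ::: "memory") -- it stops the compiler from caching or reordering across that point, but emits no fence instruction for the CPU.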


So you think wrapping every single access to that shared memory structure in barriers would do the trick? I should dig out that code and try; I am curious now. But that would be kind of a large change. Right now the volatile modifier is in one place only -- where the data is defined, not where it is accessed.
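
To make that contrast concrete, a sketch of the two shapes (hypothetical names, GCC/Clang builtin assumed):

    #include <stdbool.h>

    /* volatile: declared once where the data is defined; every access
       through this name is then a real load or store. */
    volatile bool ready_v = false;

    /* barrier approach: plain variable, but a barrier is needed at each
       access site rather than at the definition. */
    bool ready = false;

    bool poll_ready(void)
    {
        __sync_synchronize();   /* repeated at every read site */
        return ready;
    }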


If you do it correctly it should work. I'm curious whether that code is running on a multi-core (or even hyperthreaded) CPU.



