LLVMdev Digest, Vol 76, Issue 43

The above logic makes sense when you're talking about non-volatile loads
and stores. To me, it doesn't make sense for volatile loads and stores.

The whole point of volatile is to tell the compiler that its assumptions
about how memory works don't apply to this load or store and that it
should, therefore, leave it alone and not do any optimization.
Informally, it's the programmer's way of telling the compiler, "I know
what I'm doing and you don't, so don't touch that memory operation."

What you and Duncan are saying is that volatile is volatile except when
it isn't. I think that's poor design. At the very least, it is
confusing, and at worst, it prevents LLVM from handling C's "volatile"
keyword correctly.

If it's decided that the current behavior is what LLVM will do, it
should at least be documented in the LLVM Language Reference Manual.
Right now, the current behavior directly contradicts the reference
manual, and that is definitely confusing.

-- John T.

John,
On the one hand, "volatile" informs the compiler either that the "memory" being
referenced isn't really simple memory (for example, it is a device register with side effects),
or that it is simple memory but holds a thread-shared global variable. On the other hand, "volatile"
means don't delete, duplicate, reorder, or otherwise optimize references to this memory
location. But the latter is merely a convenient way to implement the restrictions
required by the former, so who is to say that the implementation must dictate the definition?
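
To make the first sense concrete, here is a minimal sketch in C of polling a device-style status register. The names wait_until_ready and READY_BIT are hypothetical, chosen only for illustration, and the "register" is just a pointer the caller supplies; nothing here comes from a real driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical status bit; not taken from any real device. */
#define READY_BIT 0x1u

/* Poll a status location until READY_BIT is set or max_polls reads
 * have been made. Because *status is volatile-qualified, every loop
 * iteration performs its own read instead of reusing an earlier value.
 * Returns the number of reads performed. */
static unsigned wait_until_ready(volatile uint32_t *status, unsigned max_polls)
{
    unsigned polls = 0;

    assert(status != 0);
    while (polls < max_polls) {
        ++polls;
        if (*status & READY_BIT)
            break;
    }
    return polls;
}
```

Without the qualifier, a compiler would be free to read the location once and spin on the cached value, which is exactly the behavior the restrictions are meant to rule out.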

The documentation is simply telling you what a lazy implementation might do, not what
a system is required to do.

Your objection is like saying that it isn't kosher to ignore the "register" keyword, because
"I know what I am doing and you don't.....".

-Peter Lawrence.

What isn't kosher is making side effects disappear from a C program.

I'm not going to bother quoting C99 5.1.2.3/2,3 verbatim, as we already have Word of God (Duncan Sands) that the C99 (and C90) standard's requirements on volatile are intentionally being violated.

Kenneth

There is no inherent reason llvm "volatile" has to have the same semantics as C "volatile" just because they're spelled the same. For another case, see sqrt.

That said, I'm strongly on the side of those who think removing volatile loads is a bad idea. But the last time this came up, I wasn't able to construct a case where this resulted in externally visible incorrect behavior. Can anyone?

Your objection is like saying that it isn't kosher to ignore the
"register" keyword, because
"I know what I am doing and you don't.....".

What isn't kosher, is making side effects disappear from a C program.

I'm not going to bother quoting C99 5.1.2.3/2,3 verbatim, as we already
have Word of God (Duncan Sands) that the C99 (and C90) standard's
requirements on volatile are intentionally being violated.

There is no inherent reason llvm "volatile" has to have the same semantics as C "volatile" just because they're spelled the same.

Right, the thread strictly is about LLVM IR rather than C or C++.

What I don't see (given what Duncan Sands mentioned), is how LLVM's optimizers *can* be trusted to preserve C or C++ volatile semantics.

....

That said, I'm strongly on the side of those who think removing volatile loads is a bad idea. But the last time this came up, I wasn't able to construct a case where this resulted in externally visible incorrect behavior. Can anyone?

Well...strictly as LLVM IR I find externally visible incorrect behavior unlikely, it's just a "different definition". For C and C++, I'd be looking at more complicated variations of

int main()
{
    volatile int i = 1;
    return 0;
}

It's clear that the LLVM IR representation of i cannot be simply IR-volatile qualified, as that load gets optimized out while C and C++ won't optimize it out. I'd *hope* that DragonEgg and llvm-gcc both leave the load of i in, when in --pedantic mode. [That is, I expect it to take something more intricate than this elementary test case to trigger any bugs here.]

Kenneth

Hi Kenneth,

both dragonegg and llvm-gcc remove the volatile. I don't see why they
shouldn't, since the program behaves exactly the same (as far as anyone
external can tell) as if it had been left there.

Ciao,

Duncan.

What if its address is taken, or it is a global variable?

--Edwin

Hi Torok,

the volatile is only removed if it can be proved that doing so is harmless.
It is not removed in the cases you mention.
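
For reference, the three situations under discussion can be sketched in C as follows. This is only an illustration of the cases, not a statement of what any particular compiler version does with them; the function names and g_flag are made up for the example.

```c
/* Case 3 uses a global: other code could observe it. */
volatile int g_flag = 0;

/* Case 1: a local volatile whose address never escapes, as in the
 * original test program. */
static int local_only(void)
{
    volatile int i = 1;
    return 0;
}

/* Helper that writes through a pointer to a volatile int. */
static void poke(volatile int *p)
{
    *p = 2;
}

/* Case 2: a local volatile whose address is taken and passed out. */
static int address_taken(void)
{
    volatile int i = 1;
    poke(&i);
    return i;
}

/* Case 3: a store to and a load from a global volatile. */
static int global_access(void)
{
    g_flag = 3;
    return g_flag;
}
```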

Ciao,

Duncan.

How do you know that removing it is harmless? Volatile is sometimes used when the compiler's ability to determine what is safe won't work.

Consider an operating system. I write some code that does the following:

a) Load registers into some LLVM IR registers (using inline asm)
b) Modify the registers slightly
c) Use volatile stores to save the registers into a stack-allocated area
d) Context switch
e) Use volatile loads to reload the registers from the stack-allocated area

In this case, the volatiles are necessary; otherwise, the compiler (which knows nothing about the context switch code) may elide the stores to the stack memory (thinking it can keep state in processor registers), resulting in incorrect behavior.
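
A rough sketch of steps (b) through (e) in C, under heavy assumptions: the inline asm of step (a) is replaced by a plain array argument, and fake_context_switch is an empty, hypothetical stand-in for the real switch code. It only shows where the volatile stores and loads sit, not how a kernel would actually do this.

```c
#include <stdint.h>

#define NUM_REGS 4

/* Hypothetical stand-in for the real context switch, which would be
 * assembly code the optimizer knows nothing about. */
static void fake_context_switch(void)
{
}

/* Save modified register values to a stack-allocated area with
 * volatile stores, "switch", then reload them with volatile loads.
 * Returns the sum of the reloaded values so a caller can check them. */
static uint32_t save_switch_restore(const uint32_t regs_in[NUM_REGS])
{
    volatile uint32_t save_area[NUM_REGS]; /* stack-allocated save area */
    uint32_t sum = 0;
    int r;

    /* (b) modify slightly, (c) volatile store to the save area */
    for (r = 0; r < NUM_REGS; ++r)
        save_area[r] = regs_in[r] + 1u;

    /* (d) context switch */
    fake_context_switch();

    /* (e) volatile loads to reload the saved values */
    for (r = 0; r < NUM_REGS; ++r)
        sum += save_area[r];

    return sum;
}
```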

Now, while I've based this off of the Linux 2.4 kernel's context switching code, I'll grant that this is a bit contrived; context switching code is usually written in assembly, and so the volatile loads and stores can be coded in the assembly code portion. However, this example should work, too, but your optimizations might break it (depending on what they do with the assembly code that does the actual context switch).

There may be more realistic examples I can think of. For example, perhaps you are debugging code and just want to mark something volatile temporarily so that it gets written to memory and is easier to find. Perhaps someone is writing some crazy multi-threaded thing that scans the stack of another thread by just knowing where the stacks of each thread are at. Maybe I just want something to be volatile to study the speed difference between optimizing and not optimizing code. The point is that volatile should mean volatile; the loads and stores shouldn't go away. If we could trust the compiler to figure out when the loads/stores could be optimized away, we wouldn't have to mark them volatile.

Regarding someone else's comment that the volatile keyword is like the register keyword, I don't believe they're the same. I believe register is defined as a hint whereas volatile is supposed to be respected by the compiler (language lawyers: feel free to correct me if I'm wrong).

In my opinion, I think you guys are really overthinking this: volatile should be volatile should be volatile. It makes the behavior of volatile easy to understand, it makes it easy to use, it allows LLVM to support the rules for the volatile keyword in C (AFAIK), and it doesn't require you to guess all the different, contorted ways in which volatile could be used.

-- John T.

John Criswell writes:

In my opinion, I think you guys are really overthinking this: volatile
should be volatile should be volatile. It makes the behavior of
volatile easy to understand, it makes it easy to use, it allows LLVM to
support the rules for the volatile keyword in C (AFAIK), and it doesn't
require you to guess all the different, contorted ways in which
volatile could be used.

Absolutely. Another place we see volatile being used on locals is
timing loops:

  { volatile int i;
    for (i = 0; i < 10; ++i) {}
  }

Optimizing away the loop on the basis that nothing observable is
happening so it's "as if" it never happened, is counter to programmers'
intuitions about what 'volatile' means. The savings from optimizing
volatile are negligible. The risk of breaking working code in
difficult-to-debug ways is huge.
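
Wrapped up as a function, the same idiom might look like the sketch below. spin_delay is a hypothetical name, and how long each iteration takes depends entirely on the target and compiler, so this is an illustration of the pattern rather than a calibrated delay.

```c
/* Busy-wait for `iterations` trips around an empty loop. The counter
 * is volatile, so the programmer expects every increment and compare
 * to be performed. Returns the final counter value. */
static unsigned long spin_delay(unsigned long iterations)
{
    volatile unsigned long i;

    for (i = 0; i < iterations; ++i) {
        /* intentionally empty */
    }
    return i;
}
```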

Al

Hi John,

you will be pleased to know that the latest LLVM doesn't remove "volatile" from
your example, so it looks like you are not the only one who thinks this :)

Ciao,

Duncan.