bug or expected behaviour?

Carl_Norum1 · June 4, 2013, 10:55pm

I've been chasing down a bug for a few days in some RTOS code I'm building with clang. In the end it comes down to this code:

  void external(void);
  void test(char *x)
  {
      x--;
      if (!x)
      {
          external();
      }
  }

When compiled for ARM with TOT clang (r183249) at -O1 or greater, no calls to 'external()' appear in the output code at all. I have an example project I can send out if anyone's interested, but it's just the above source file and this makefile:

.PHONY: all optimized unoptimized clean

all: optimized unoptimized
grep external *.s

optimized:
./clang -S -O1 example.c -o example-optimized.s

unoptimized:
./clang -S -O0 example.c -o example-unoptimized.s

clean:
rm -f *.s

When run, I get this output:

  $ make
  ./clang -S -O1 example.c -o example-optimized.s
  ./clang -S -O0 example.c -o example-unoptimized.s
  grep external *.s
  example-unoptimized.s: bl external

As you can see, 'example-optimized.s' contains no references to the 'external()' function. Am I missing something subtle here standard-wise? Or is this an optimizer bug? If it's a bug, what's the right component to file it in?

I've attached the complete assembly output (both example-optimized.s and example-unoptimized.s) in case that's helpful. Thanks for any insight!

Oh - almost forgot - version information from the compiler:

  $ ./clang --version
  clang version 3.4 (trunk 183249)
  Target: arm-none--eabi
  Thread model: posix

example-optimized.s (273 Bytes)

example-unoptimized.s (479 Bytes)

Tyler_Hardin · June 4, 2013, 11:23pm

If this were a problem with an omitted statement involving a normal variable, I’d guess you’re missing a volatile qualifier. I’m not 100% sure volatile is a valid qualifier for functions, but try it.
If RTOS stands for real time OS, then reading up on volatile would be a really good idea.

P.S. Sorry Carl, you’re going to receive this twice. I forget to CC the list.

Carl_Norum1 · June 4, 2013, 11:33pm

If this were a problem with an omitted statement involving a normal variable, I'd guess you're missing a volatile qualifier. I'm not 100% sure volatile is a valid qualifier for functions, but try it.

Well, yes, if I change the signature to:

void test(char * volatile x)

It works, but that's because I'm hamstringing the optimizers. I don't really see how that has anything to do with the question, though. If I change the signature to:

void test(int x)

It works too... what's special about 'char *'?

If RTOS stands for real time OS, then reading up on volatile would be a really good idea.

I'm familiar with 'volatile' semantics, thanks.

P.S. Sorry Carl, you're going to receive this twice. I forget to CC the list.

No problem.

-- Carl

Tyler_Hardin · June 4, 2013, 11:42pm

I was suggesting to add it to the function, like
volatile void func(…);
Theoretically, this would tell the compiler not to omit seemingly superfluous calls to func.

If this were a problem with an omitted statement involving a normal variable, I’d guess you’re missing a volatile qualifier. I’m not 100% sure volatile is a valid qualifier for functions, but try it.

Well, yes, if I change the signature to:

void test(char * volatile x)

It works, but that’s because I’m hamstringing the optimizers. I don’t really see how that has anything to do with the question, though. If I change the signature to:

void test(int x)

It works too… what’s special about ‘char *’?

If RTOS stands for real time OS, then reading up on volatile would be a really good idea.

I’m familiar with ‘volatile’ semantics, thanks.

P.S. Sorry Carl, you’re going to receive this twice. I forget to CC the list.

No problem.

– Carl

Carl_Norum1 · June 4, 2013, 11:50pm

'volatile' can't apply to a function, so I'm not sure what you mean. In your example, 'volatile' modifies the 'void'. So I think "theoretically", it doesn't do anything at all.

-- Carl

Tyler_Hardin · June 5, 2013, 12:20am

I tried the “extern” specifier, which (I guess) you should use if the definition isn’t in the file; and it worked with -O3.

Input code:
extern void external(void);
void test(char *x)
{
x–;
if (!x)
{
external();
}
}

Command:
clang -S -o volatile-test.s -O3 test.c

Output attached.

The only line of interest is probably:
jmp external # TAILCALL

extern-test.s (326 Bytes)

Carl_Norum1 · June 5, 2013, 12:28am

I tried the "extern" specifier, which (I guess) you should use if the definition isn't in the file; and it worked with -O3.

That 'extern' doesn't do anything - it's implicit. Did you try without it and get different results? In my test here with the Mac OS X dev tools version of clang, I got the expected results, both for x86 and for ARM. I guess it's possible it's ARM related. What version of clang/llvm did you use for your test?

-- Carl

David_Mirabito · June 5, 2013, 12:31am

Hi Carl,

I don't know much about the specifics of any given optimisation, but if you do something like the following, you can at least see which optimisation pass is responsible.
At the very least perhaps that can point to the next place to investigate. (also interesting to note Apple's shipped clang doesn't exhibit this behaviour, I had to use my own recent svn tree)

$ clang -arch arm -emit-llvm -S -O0 example.c -o arm.bc
$ opt -print-after-all -O1 arm.bc

I can see where the branch test is whittled down to an 'if true' and then eliminated, but cannot comment as to why this might be.

Cheers,
-DavidM

jcranmer · June 5, 2013, 3:55am

From reading C11, I can posit a potential spec explanation:

Start with x--. Per C11:
If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.

The optimizer can therefore conclude that if this program has well-defined behavior, then x can never point to the null pointer constant (since the null pointer constant is not part of any array object). As a result, the "if (!x)" branch would never trigger, and is dead code.

So this doesn't look like an invalid optimization per C11, but that doesn't mean that it's necessarily desired behavior in the optimizer.

Carl_Norum1 · June 5, 2013, 5:02am

Nope - rebuilding clang/llvm to get intel support shows the same behaviour with both '-arch i386' and '-arch x86_64', too.

-- Carl

Carl_Norum1 · June 5, 2013, 5:10am

Hi Carl,

I don't know much about the specifics of any given optimisation, but if you do something like the following, you can at least see which optimisation pass is responsible.

Thanks for the example; it's going to take me a while to figure out what the output really means, though!

At the very least perhaps that can point to the next place to investigate. (also interesting to note Apple's shipped clang doesn't exhibit this behaviour, I had to use my own recent svn tree)

Yeah, I noticed that as well, so it's definitely a behaviour change.

$ clang -arch arm -emit-llvm -S -O0 example.c -o arm.bc
$ opt -print-after-all -O1 arm.bc

I can see where the branch test is whittled down to an 'if true' and then eliminated, but cannot comment as to why this might be.

I think I see that too. I'll have to stare at that output some more to see if it yields any insight.

-- Carl

Carl_Norum1 · June 5, 2013, 5:19am

Thanks Joshua,

A bit of spec reading made me think something like that was going on. In the land of embedded systems, zero is often a valid pointer, though, so being able to avoid this kind of optimization would be nice. That said, if that's the way clang is going to be, it's certainly possible to work around.

In this specific case, the OS vendor is doing something (IMHO) terrible - aliasing a 'char *' field of a queue structure via a #define statement to act as a recursive mutex counter. Something like:

    #define recursive_mutex_counter some_field_not_used_by_a_mutex
    struct queue
    {
       char *some_field_not_used_by_a_mutex;
    };

    void mutexReleaseRecursive(struct queue *q)
    {
        q->recursive_mutex_counter--;

        if (q->recursive_mutex_counter == 0)
            queue_push_back(q, some_dummy_variable);
    }

Which, as I'm sure we can all agree here is heinous. It's easily worked around without changing the vendor's funny-business overloading:

q->recursive_mutex_counter = (char *)((uint32_t)q->recursive_mutex_counter - 1);

And I can send them a patch to that effect. I guess what I'm looking for is either for clang to make it possible to avoid this kind of optimization, to change the optimizer so it doesn't happen in the first place, or for someone on the list to give me a canonical answer that clang is going to continue behaving this way in the future, so I have something to point to when the OS guys try to push back (I've found other standard-violating behaviour that they've declined to fix).

Any input or suggestions from the list are welcome and appreciated!

regehr · June 5, 2013, 2:50pm

The optimizer can therefore conclude that if this program has well-defined behavior, then x can never point to the null pointer constant (since the null pointer constant is not part of any array object). As a result, the "if (!x)" branch would never trigger, and is dead code.

This is correct: in C you can't create a null pointer by decrementing a valid pointer. The code in question is dangerous and wrong, and needs to be reviewed to look for other similar problems.

John

Carl_Norum1 · June 5, 2013, 4:55pm

OK, cool, thanks.

Why no warning or static analyzer error? Other "this comparison is always true" or "this comparison is always false" warnings exist, right?

-- Carl

zygoloid · June 5, 2013, 7:18pm

There’s no warning in the frontend, because this is not “locally” obvious – we would need (very simple, in this case) reasoning about control flow to see this, and we don’t have many such checks, because they’re significantly more expensive than local checks.

There’s no warning from the middle-end optimizers which remove the check, because we don’t like the optimizer to produce warnings (they have stability issues and generally give a very poor diagnostic experience).

The static analyzer would be a good place to warn on this, but presumably it’s just not been taught to issue this warning yet.

Carl_Norum1 · June 5, 2013, 8:43pm

If anyone cares to back me up, I filed a bug and submitted a patch to the OS in question, found here:

FreeRTOS Real Time Kernel (RTOS) / Bugs / #66 Fix pointer arithmetic misbehaviour in queue.c

-- Carl

Topic		Replies	Views
[Clang][OpenCL] Bug in handling non-extern inline function definition when optimization is disabled? Clang Frontend	1	230	May 16, 2023
Clang -O0 performs optimizations that undermine dynamic bug-finding tools LLVM Dev List Archives	4	60	March 24, 2017
[bug][clang] with optimization -O1 will get wrong result Code Generation clang , llvm	5	359	August 14, 2023
Trying to figure out where a front-end optimization occurs Clang Frontend	1	59	January 24, 2017
Possible bug in optimizer LLVM Dev List Archives	1	66	April 29, 2018

bug or expected behaviour?

Related Topics