core dump, how to best investigate

I have done a lot of core dump analysis, but I wanted to check whether
there's a standard procedure for doing this in the clang and LLVM
context.

Partial backtrace:

#0 0x081ba401 in clang::Expr::getType (this=0x0) at Expr.h:37
#1 0x081d94d7 in clang::Sema::ParseUnaryOp (this=0x831b880, OpLoc=
      {ID = 2148130816}, Op=clang::tok::kw___extension__, input=0x0)
    at SemaExpr.cpp:1455
#2 0x081f7c40 in clang::Parser::ParseCastExpression (this=0x831b800,
    isUnaryExpression=false) at ParseExpr.cpp:546

So ParseUnaryOp() has an input which is empty and as such resultType =
Input->getType(); fails.

ParseCastExpression() is executing:
Res = Actions.ParseUnaryOp(SavedLoc, SavedKind, Res.Val);

which has to mean that Res.Val is empty:

(gdb) print Res.Val
$1 = (void *) 0x0

It seems to come from including unistd.h on FreeBSD, which includes
sys/types.h, which includes machine/endian.h which results in:

[15:37] [asmodai@nexus] (1125) {0} % clang /usr/include/machine/endian.h
/usr/include/machine/endian.h:146:10: warning: expression result unused
        return (__byte_swap_int(_x));
                ^~~~~~~~~~~~~~~
zsh: segmentation fault (core dumped) clang /usr/include/machine/endian.h

Relevant part of the source file:

#define __byte_swap_int_var(x) \
__extension__ ({ register __uint32_t __X = (x); \
   __asm ("bswap %0" : "+r" (__X)); \
   __X; })

And looking at Parse/ParseExpr.cpp line 542 this seems expected:

// FIXME: Extension not handled correctly here!

Would it make sense to add a fail-safe in the form of (excuse my C++, I'm
better at C and C#):

if (Input == NULL) {
  bail_out();
}
resultType = Input->getType();

#1 0x081d94d7 in clang::Sema::ParseUnaryOp (this=0x831b880, OpLoc=
     {ID = 2148130816}, Op=clang::tok::kw___extension__, input=0x0)
   at SemaExpr.cpp:1455
#2 0x081f7c40 in clang::Parser::ParseCastExpression (this=0x831b800,
   isUnaryExpression=false) at ParseExpr.cpp:546

So ParseUnaryOp() has an input which is empty and as such resultType =
Input->getType(); fails.

Yep, this is because the inner expression did not correctly return an AST node.

It seems to come from including unistd.h on FreeBSD, which includes
sys/types.h, which includes machine/endian.h which results in:

Ah, ok. It's unfortunate that this hits you so frequently.

And looking at Parse/ParseExpr.cpp line 542 this seems expected:

// FIXME: Extension not handled correctly here!

__extension__ is actually handled correctly enough for this code. The FIXME refers to the fact that __extension__ doesn't currently silence extension-related diagnostics in subexpressions. For example, if you compile this:

typedef unsigned __uint32_t;

#define __byte_swap_int_var(x) \
__extension__ ({ register __uint32_t __X = (x); \
    __asm ("bswap %0" : "+r" (__X)); \
    __X; })

int test(int _x) {
    return (__byte_swap_int_var(_x));
}

with: clang ~/t.c -parse-ast-print -pedantic

You get:

/Users/sabre/t.c:5:4: warning: extension used
    __asm ("bswap %0" : "+r" (__X)); \
    ^
typedef unsigned int __uint32_t;
/Users/sabre/t.c:9:10: warning: use of GNU statement expression extension
return (__byte_swap_int_var(_x));
          ^
/Users/sabre/t.c:9:10: warning: extension used
return (__byte_swap_int_var(_x));
          ^
/Users/sabre/t.c:9:10: warning: expression result unused
return (__byte_swap_int_var(_x));
          ^~~~~~~~~~~~~~~~~~~
Bus error

Because those extension warnings are inside a __extension__ node, they should not be emitted.

The actual problem here is that we don't currently build an AST node for the GNU statement expression extension, so the parser hands a null operand to ParseUnaryOp. This is the TODO on line 864 of ParseExpr.cpp.
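
For readers unfamiliar with the extension: a GNU statement expression is a parenthesized compound statement whose value is that of its last expression statement. Here is a minimal sketch, assuming a GCC- or Clang-compatible compiler. The names BYTE_SWAP32 and byteSwap are made up for illustration, and portable shifts stand in for the bswap inline asm so the example isn't x86-specific:

```cpp
#include <cstdint>

// Hypothetical macro modeled on __byte_swap_int_var; shifts replace the inline asm.
// The trailing `x_;` statement supplies the value of the whole ({ ... }) expression.
#define BYTE_SWAP32(x)                                        \
    __extension__({                                           \
        std::uint32_t x_ = (x);                               \
        x_ = (x_ >> 24) | ((x_ >> 8) & 0x0000FF00u) |         \
             ((x_ << 8) & 0x00FF0000u) | (x_ << 24);          \
        x_;                                                   \
    })

// Hypothetical wrapper so the macro is used in an ordinary return statement,
// as in the test() function above.
std::uint32_t byteSwap(std::uint32_t v) {
    return BYTE_SWAP32(v);
}
```

The frontend needs an AST node that represents the whole `({ ... })` as an expression with a type (here, the type of `x_`); without one there is nothing to pass along as the operand of `__extension__`.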

In the future, if you run into a problem, please include a self-contained testcase, like the example I provide above. This makes it much easier to understand what problem you're hitting.

In any case, I'll add the AST node and get the example working. Thanks!

-Chris

Yep, generally the idiom would be:

   assert(Input && "Expected an input AST node");

-Chris
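
A minimal sketch of that idiom outside of clang itself. The Expr struct and the resultTypeOf function below are hypothetical stand-ins, not the real clang classes; only the assert line is taken from the message above:

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an AST expression node; not clang's real Expr.
struct Expr {
    std::string type;
    const std::string &getType() const { return type; }
};

// Hypothetical stand-in for the part of ParseUnaryOp that reads the operand type.
std::string resultTypeOf(const Expr *Input) {
    // In builds without NDEBUG this aborts with a readable message at the
    // point where the bad argument arrives, instead of segfaulting in getType().
    assert(Input && "Expected an input AST node");
    return Input->getType();
}
```

Unlike the if/bail_out() form, the assertion is compiled out when NDEBUG is defined, so it documents and checks an invariant rather than handling a condition that is expected at run time. The real fix is still to make the caller produce a valid node.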

typedef unsigned __uint32_t;

#define __byte_swap_int_var(x) \
__extension__ ({ register __uint32_t __X = (x); \
    __asm ("bswap %0" : "+r" (__X)); \
    __X; })

int test(int _x) {
    return (__byte_swap_int_var(_x));
}

In any case, I'll add the AST node and get the example working. Thanks!

Here ya go:
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20070723/001371.html

This implements ast building and semantic analysis, but not codegen support. On the example above, you get:

$ clang test/Sema/stmt_exprs.c -parse-ast-print
typedef unsigned int __uint32_t;
test/Sema/stmt_exprs.c:11:10: warning: expression result unused
return (__byte_swap_int_var(_x));
         ^~~~~~~~~~~~~~~~~~~

int test(int _x) {
   return (__extension__({
     register __uint32_t __X = (_x);
     __X;
   }));
}

1 diagnostic generated.

There are two issues here:
   1. We don't currently build an AST node for GNU inline asm stmts, so the dump above doesn't include it.
   2. We emit an 'expression result unused' warning for the "__X;" statement at the end of the compound stmt, even though it is used by the stmt expr.

If you'd be interested in helping out with either of these, go for it :)

-Chris