Overzealousness of -Wformat causes problems for LLDB, others

If you build LLDB on Linux, you get many many warnings from Clang following this pattern:

/whereever/lldb/llvm/tools/lldb/source/Symbol/ClangASTType.cpp:1012:31: warning:
      conversion specifies type 'long long' but the argument has type 'int64_t'
      (aka 'long') [-Wformat]
                s->Printf("%lli", enum_value);
                           ~~~^ ~~~~~~~~~~
                           %ld

/whereever/lldb/llvm/tools/lldb/source/Target/ThreadPlan.cpp:142:103: warning:
      conversion specifies type 'unsigned long long' but the argument has type
      'uint64_t' (aka 'unsigned long') [-Wformat]
  ...#%u: tid = 0x%4.4llx, pc = 0x%8.8llx, sp = 0x%8.8llx, fp = 0x%8.8llx, "
                                                                  ~~~~~~^
                                                                  %8.8lu

/whereever/lldb/llvm/tools/lldb/tools/driver/Driver.cpp:915:85: warning:
      conversion specifies type 'unsigned long long' but the argument has type
      'lldb::pid_t' (aka 'unsigned long') [-Wformat]
  ...(message, sizeof(message), "Process %llu %s\n", process.GetProcessID(),
                                         ~~~^ ~~~~~~~~~~~~~~~~~~~~~~
                                         %lu

For x86_64 code on OS X and Linux, pid_t, int64_t and uint64_t are typedefs to 64-bit quantities. Exactly what those typedefs are is up to the platform though -- for example, possible definitions for int64_t include:

      typedef long int64_t // Used by Linux
  or typedef long long int64_t // Used by OS X
  or typedef intmax_t int64_t // Would also work on Linux and OS X
  or typedef ssize_t int64_t // Would also work on Linux and OS X
  or typedef ptrdiff_t int64_t // Would also work on Linux and OS X
  or typedef quad_t int64_t // Would also work on Linux and OS X

All these types are basically the same, they're 64-bit signed integers. But each one has its own length modifier for printf (i.e., l, ll, j, z, t, and q, respectively).

Even though they're structurally identical, Clang's format string checks care (somewhat) about which length modifier is chosen; Clang (usually!?!) warns when you pass a long long to a format that wants a long (or vice versa), because although they're structurally identical the types are considered distinct. GCC does this too.

If you know your C standard, you may say that portable code should really use the relevant inttypes.h macro, and thus the first example should have been written:

  s->Printf("%" PRId64, enum_value);

but first, who actually does this, and second, that still leaves the question of what to do about pid_t, since it has no such macro.
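
For reference, here is a minimal sketch of what the macro-based spelling looks like in plain C. The helper names format_i64 and format_u64_hex are hypothetical (they are not LLDB functions), and snprintf stands in for LLDB's own Printf wrappers:

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format a signed 64-bit value portably.
 * PRId64 expands to a string literal such as "ld" or "lld" (without
 * the '%'), whichever matches this platform's int64_t typedef. */
int format_i64(char *buf, size_t n, int64_t value)
{
    return snprintf(buf, n, "%" PRId64, value);
}

/* Hypothetical helper: the "0x%8.8llx" pattern from the ThreadPlan
 * warning, spelled with PRIx64 so that it matches uint64_t on both
 * Linux and OS X. */
int format_u64_hex(char *buf, size_t n, uint64_t value)
{
    return snprintf(buf, n, "0x%8.8" PRIx64, value);
}
```

The macros only cover the fixed-width types from stdint.h, which is why they do not help with a type like pid_t.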

Interestingly, if we use "%zd" as our format, Clang is permissive about it, allowing us to pass in both longs and long longs without a complaint (is this a bug or a feature? is it documented somewhere? GCC isn't permissive in the same way...), but it isn't permissive about the others, including strangeness like allowing "%ld" to be used with intmax_t but not "%lld" (this is a bizarre choice because if long long were actually bigger than long, intmax_t would have to be long long).

It seems to me that we ought to have two kinds of warnings for format strings: one for things that are actually problems, and a -Wpedantic-format for things that are technically wrong but are not actually a problem for the platform we're compiling on, like using "%jd" to print a long long. Pedantic warnings might even include using "%ld" for the intmax_t type, because even if it is typedefed to long on this platform, it might not be on another platform, and for that reason you should really be using "%jd".

Thoughts...? (I'm happy to file a bug, but it seemed like something that deserved some discussion as to what the desired behavior should be.)

    M.E.O.

P.S. For fun, I've enclosed some source annotated with the format warnings it produces on Linux and OS X with GCC and Clang (3.1 RC3).

Enc.

#define _BSD_SOURCE 1
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>
#include <sys/types.h>

int main()
{
    printf("%ld", (long) 0);
    printf("%lld", (long) 0);
    // ^-- clang/OS X: specifies type 'long long' but the argument has type 'long'
    // ^-- gcc46/OS X: '%lld' expects argument of type 'long long int', but argument 2 has type 'long int'
    // ^-- clang/Linux: specifies type 'long long' but the argument has type 'long'
    // ^-- gcc47/Linux: '%lld' expects argument of type 'long long int', but argument 2 has type 'long int'
    printf("%jd", (long) 0);
    printf("%td", (long) 0);
    printf("%zd", (long) 0);
    printf("%qd", (long) 0);
    // ^-- clang/OS X: specifies type 'long long' but the argument has type 'long'
    // ^-- gcc46/OS X: '%qd' expects argument of type 'long long int', but argument 2 has type 'long int'
    // ^-- clang/Linux: specifies type 'long long' but the argument has type 'long'
    // ^-- gcc47/Linux: '%qd' expects argument of type 'long long int', but argument 2 has type 'long int'

    printf("%ld", (long long) 0);
    // ^-- clang/OS X: specifies type 'long' but the argument has type 'long long'
    // ^-- gcc46/OS X: '%ld' expects argument of type 'long int', but argument 2 has type 'long long int'
    // ^-- clang/Linux: specifies type 'long' but the argument has type 'long long'
    // ^-- gcc47/Linux: '%ld' expects argument of type 'long int', but argument 2 has type 'long long int'
    printf("%lld", (long long) 0);
    printf("%jd", (long long) 0);
    // ^-- clang/OS X: specifies type 'intmax_t' (aka 'long') but the argument has type 'long long'
    // ^-- gcc46/OS X: '%jd' expects argument of type 'intmax_t', but argument 2 has type 'long long int'
    // ^-- clang/Linux: specifies type 'intmax_t' (aka 'long') but the argument has type 'long long'
    // ^-- gcc47/Linux: '%jd' expects argument of type 'intmax_t', but argument 2 has type 'long long int'
    printf("%td", (long long) 0);
    // ^-- clang/OS X: specifies type 'ptrdiff_t' (aka 'long') but the argument has type 'long long'
    // ^-- gcc46/OS X: '%td' expects argument of type 'ptrdiff_t', but argument 2 has type 'long long int'
    // ^-- clang/Linux: specifies type 'ptrdiff_t' (aka 'long') but the argument has type 'long long'
    // ^-- gcc47/Linux: '%td' expects argument of type 'ptrdiff_t', but argument 2 has type 'long long int'
    printf("%zd", (long long) 0);
    // ^-- gcc46/OS X: '%zd' expects argument of type 'signed size_t', but argument 2 has type 'long long int'
    // ^-- gcc47/Linux: '%zd' expects argument of type 'signed size_t', but argument 2 has type 'long long int'
    printf("%qd", (long long) 0);

    printf("%ld", (intmax_t) 0);
    printf("%lld", (intmax_t) 0);
    // ^-- clang/OS X: specifies type 'long long' but the argument has type 'intmax_t' (aka 'long')
    // ^-- gcc46/OS X: '%lld' expects argument of type 'long long int', but argument 2 has type 'long int'
    // ^-- clang/Linux: specifies type 'long long' but the argument has type 'intmax_t' (aka 'long')
    // ^-- gcc47/Linux: '%lld' expects argument of type 'long long int', but argument 2 has type 'long int'
    printf("%jd", (intmax_t) 0);
    printf("%td", (intmax_t) 0);
    printf("%zd", (intmax_t) 0);
    printf("%qd", (intmax_t) 0);
    // ^-- clang/OS X: specifies type 'long long' but the argument has type 'intmax_t' (aka 'long')
    // ^-- gcc46/OS X: '%qd' expects argument of type 'long long int', but argument 2 has type 'long int'
    // ^-- clang/Linux: specifies type 'long long' but the argument has type 'intmax_t' (aka 'long')
    // ^-- gcc47/Linux: '%qd' expects argument of type 'long long int', but argument 2 has type 'long int'

    printf("%ld", (ssize_t) 0);
    printf("%lld", (ssize_t) 0);
    // ^-- clang/OS X: specifies type 'long long' but the argument has type 'ssize_t' (aka 'long')
    // ^-- gcc46/OS X: '%lld' expects argument of type 'long long int', but argument 2 has type 'long int'
    // ^-- clang/Linux: specifies type 'long long' but the argument has type 'ssize_t' (aka 'long')
    // ^-- gcc47/Linux: '%lld' expects argument of type 'long long int', but argument 2 has type 'long int'
    printf("%jd", (ssize_t) 0);
    printf("%td", (ssize_t) 0);
    printf("%zd", (ssize_t) 0);
    printf("%qd", (ssize_t) 0);
    // ^-- clang/OS X: specifies type 'long long' but the argument has type 'ssize_t' (aka 'long')
    // ^-- gcc46/OS X: '%qd' expects argument of type 'long long int', but argument 2 has type 'long int'
    // ^-- clang/Linux: specifies type 'long long' but the argument has type 'ssize_t' (aka 'long')
    // ^-- gcc47/Linux: '%qd' expects argument of type 'long long int', but argument 2 has type 'long int'

    printf("%ld", (ptrdiff_t) 0);
    printf("%lld", (ptrdiff_t) 0);
    // ^-- gcc46/OS X: '%lld' expects argument of type 'long long int', but argument 2 has type 'long int'
    // ^-- clang/Linux: specifies type 'long long' but the argument has type 'ptrdiff_t' (aka 'long')
    // ^-- gcc47/Linux: '%lld' expects argument of type 'long long int', but argument 2 has type 'long int'
    printf("%jd", (ptrdiff_t) 0);
    printf("%td", (ptrdiff_t) 0);
    printf("%zd", (ptrdiff_t) 0);
    printf("%qd", (ptrdiff_t) 0);
    // ^-- gcc46/OS X: '%qd' expects argument of type 'long long int', but argument 2 has type 'long int'
    // ^-- clang/Linux: specifies type 'long long' but the argument has type 'ptrdiff_t' (aka 'long')
    // ^-- gcc47/Linux: '%qd' expects argument of type 'long long int', but argument 2 has type 'long int'

    printf("%ld", (quad_t) 0);
    // ^-- clang/OS X: specifies type 'long' but the argument has type 'quad_t' (aka 'long long')
    // ^-- gcc46/OS X: '%ld' expects argument of type 'long int', but argument 2 has type 'long long int'
    printf("%lld", (quad_t) 0);
    // ^-- clang/Linux: specifies type 'long long' but the argument has type 'quad_t' (aka 'long')
    // ^-- gcc47/Linux: '%lld' expects argument of type 'long long int', but argument 2 has type 'long int'
    printf("%jd", (quad_t) 0);
    // ^-- clang/OS X: specifies type 'intmax_t' (aka 'long') but the argument has type 'quad_t' (aka 'long long')
    // ^-- gcc46/OS X: '%jd' expects argument of type 'intmax_t', but argument 2 has type 'long long int'
    printf("%td", (quad_t) 0);
    // ^-- clang/OS X: specifies type 'ptrdiff_t' (aka 'long') but the argument has type 'quad_t' (aka 'long long')
    // ^-- gcc46/OS X: '%td' expects argument of type 'ptrdiff_t', but argument 2 has type 'long long int'
    printf("%zd", (quad_t) 0);
    // ^-- gcc46/OS X: '%zd' expects argument of type 'signed size_t', but argument 2 has type 'long long int'
    printf("%qd", (quad_t) 0);
    // ^-- clang/Linux: specifies type 'long long' but the argument has type 'quad_t' (aka 'long')
    // ^-- gcc47/Linux: '%qd' expects argument of type 'long long int', but argument 2 has type 'long int'

    return 0;
}

If you build LLDB on Linux, you get many many warnings from Clang following this pattern:

/whereever/lldb/llvm/tools/lldb/source/Symbol/ClangASTType.cpp:1012:31: warning:
      conversion specifies type 'long long' but the argument has type 'int64_t'
      (aka 'long') [-Wformat]
                s->Printf("%lli", enum_value);
                           ~~~^ ~~~~~~~~~~
                           %ld

This was actually brought up on the #llvm IRC channel the other week.
I agree, it is a pain.

/whereever/lldb/llvm/tools/lldb/source/Target/ThreadPlan.cpp:142:103: warning:
      conversion specifies type 'unsigned long long' but the argument has type
      'uint64_t' (aka 'unsigned long') [-Wformat]
  ...#%u: tid = 0x%4.4llx, pc = 0x%8.8llx, sp = 0x%8.8llx, fp = 0x%8.8llx, "
                                                                  ~~~~~~^
                                                                  %8.8lu

With a more recent Clang version, it will at least suggest "%8.8lx".

/whereever/lldb/llvm/tools/lldb/tools/driver/Driver.cpp:915:85: warning:
      conversion specifies type 'unsigned long long' but the argument has type
      'lldb::pid_t' (aka 'unsigned long') [-Wformat]
  ...(message, sizeof(message), "Process %llu %s\n", process.GetProcessID(),
                                         ~~~^ ~~~~~~~~~~~~~~~~~~~~~~
                                         %lu

For x86_64 code on OS X and Linux, pid_t, int64_t and uint64_t are typedefs to 64-bit quantities. Exactly what those typedefs are is up to the platform though -- for example, possible definitions for int64_t include:

      typedef long int64_t // Used by Linux
  or typedef long long int64_t // Used by OS X
  or typedef intmax_t int64_t // Would also work on Linux and OS X
  or typedef ssize_t int64_t // Would also work on Linux and OS X
  or typedef ptrdiff_t int64_t // Would also work on Linux and OS X
  or typedef quad_t int64_t // Would also work on Linux and OS X

All these types are basically the same, they're 64-bit signed integers. But each one has its own length modifier for printf (i.e., l, ll, j, z, t, and q, respectively).

Even though they're structurally identical, Clang's format string checks care (somewhat) about which length modifier is chosen; Clang (usually!?!) warns when you pass a long long to a format that wants a long (or vice versa), because although they're structurally identical the types are considered distinct. GCC does this too.

If you know your C standard, you may say that portable code should really use the relevant inttypes.h macro, and thus the first example should have been written:

  s->Printf("%" PRId64, enum_value);

but first, who actually does this, and second, that still leaves the question of what to do about pid_t, since it has no such macro.

For pid_t, I guess a portable solution would be to cast it to intmax_t
and print that with "%jd".
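
A minimal sketch of that cast, assuming a plain snprintf call rather than LLDB's own wrappers. The helper names format_pid and format_unsigned_id are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: POSIX pid_t is a signed integer type of
 * unspecified width, so widen it to intmax_t and print with %jd. */
int format_pid(char *buf, size_t n, pid_t pid)
{
    return snprintf(buf, n, "Process %jd", (intmax_t)pid);
}

/* An unsigned ID type (lldb::pid_t shows up as 'unsigned long' in the
 * warning quoted above) would take the unsigned counterpart instead:
 * a cast to uintmax_t printed with %ju. */
int format_unsigned_id(char *buf, size_t n, unsigned long id)
{
    return snprintf(buf, n, "Process %ju", (uintmax_t)id);
}
```

Because the cast makes the argument type exactly what the conversion expects, the format check has nothing to complain about on either platform.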

Interestingly, if we use "%zd" as our format, Clang is permissive about it, allowing us to pass in both longs and long longs without a complaint (is this a bug or a feature? is it documented somewhere? GCC isn't permissive in the same way...),

What happens here is that for "%zu", Clang will expect the type that
size_t is typedefed to, which on my system is unsigned long. In C,
there is no built-in distinct type for size_t, but Clang does keep
track of which integer type it uses for sizes, i.e. the result of
sizeof(), etc.

We want the result of sizeof to be printable with "%zu", and therefore
don't strictly enforce that the type of the argument is a typedef with
name "size_t".

From what I can tell, GCC does this too, i.e. "printf("%zu\n",
sizeof(int))" doesn't yield a warning, even though the type of the
return value from sizeof isn't actually size_t, but unsigned long.
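
As a small illustration of the case being protected here (the helper names are hypothetical, and this is only a sketch of the usage, not of Clang's implementation):

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: sizeof yields a value of whatever integer type
 * size_t names on this platform, so %zu matches it without a cast. */
int format_array_size(char *buf, size_t n)
{
    return snprintf(buf, n, "%zu", sizeof(char[16]));
}

/* A value declared as size_t takes the same conversion. */
int format_size(char *buf, size_t n, size_t value)
{
    return snprintf(buf, n, "%zu", value);
}
```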

but it isn't permissive about the others, including strangeness like allowing "%ld" to be used with intmax_t but not "%lld" (this is a bizarre choice because if long long were actually bigger than long, intmax_t would have to be long long).

We're not pedantic enough to warn about using "%ld" with intmax_t on a
system where intmax_t is typedefed to long. We will warn when using
"%lld", because "long" and "long long" are distinct types even if
they're the same size.
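
One way to see that distinction from within the language, assuming a C11 compiler, is a _Generic selection. The macro TYPE_NAME and the function int64_type_name below are hypothetical names for this sketch:

```c
#include <stdint.h>

/* _Generic selects on the static type of its operand. Listing both
 * 'long' and 'long long' only compiles because they are distinct
 * types; two compatible types in one association list would be
 * rejected by the compiler. */
#define TYPE_NAME(x) _Generic((x),  \
    long:      "long",              \
    long long: "long long",         \
    default:   "other")

/* Hypothetical helper: report which underlying type this platform's
 * int64_t typedef resolves to. Going by the typedefs listed earlier in
 * the thread, that would be "long" on x86_64 Linux and "long long" on
 * OS X. */
const char *int64_type_name(void)
{
    return TYPE_NAME((int64_t)0);
}
```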

It seems to me that we ought to have two kinds of warnings for format strings: one for things that are actually problems, and a -Wpedantic-format for things that are technically wrong but are not actually a problem for the platform we're compiling on, like using "%jd" to print a long long. Pedantic warnings might even include using "%ld" for the intmax_t type, because even if it is typedefed to long on this platform, it might not be on another platform, and for that reason you should really be using "%jd".

We already have some format warnings under -pedantic that warn about
using non-ISO C features.

I agree that it would be nice to separate warnings that are concerned
with portability from warnings about code that is broken on the target
machine. But I also think this could be pretty tricky.

Not sure how much this helps, but hopefully it at least explains the
situation a little.

Thanks,
Hans