[RFC] NoBuiltin Attribute

Hi LLVMites!

This patch adds the 'nobuiltin' attribute to to LLVM. This is needed during LTO, which right now ignores this attribute and command line flag. I want to make this an IR-level attribute instead of a target-dependent attribute because it's used during IR modification and not code generation.

-bw

clang.nobuiltin.diff (587 Bytes)

nobuiltin.diff (7.31 KB)

Awesome work on the Attributes Bill!

I see them now in the bitcode files.

I can't wait continue my work that needs to compile functions differently as mips16 or mips32 depending on the attributes.

Thank you! Let me know if you would like some extra functionality or changes in them. :slight_smile:

-bw

Hi Bill,

I think the concept of this patch makes sense, but the implementation does not.

I have:

void foo() {
  printf("hello\n");
}

and I build with -fno-builtin-puts. If I understand correctly, *foo* will be marked with "nobuiltin", but this code in simplifylibcalls looks at the printf:

Value *LibCallSimplifier::optimizeCall(CallInst *CI) {
+ Function *F = CI->getCalledFunction();
+ if (F->hasFnAttribute(Attribute::NoBuiltin)) return 0;
   return Impl->optimizeCall(CI);
}

In the context of LTO, it makes sense for the attribute to be on function bodies, not on prototypes.

-Chris

Yeah, I noticed that after sending this patch. I modified it to check the function CI is in for that attribute. Once we have support for the `-fno-builtin-FUNCTION' flag, I expect the attribute to look something like this:

  "no-builtin-functions" = "puts,foo,bar"

-bw

I wonder... why not attach the no-builtin attribute to the call
instruction? It seems like that would avoid the need of a string attribute
altogether, having the frontend match up calls to specific functions which
should not be considered calls to builtin functions?

For the general case of just '-ffreestanding' or whichever, you might do
something different if annotating every call instruction is prohibitively
expensive.

But I've not thought about this deeply, I just find the idea of having the
backend be responsible for matching up the function names somewhat
distasteful.

What if the body is defined in some external library?

-Krzysztof

How does this work? If I build with -fno-builtin-fputs, why would an attribute be added to a call to printf?

-Chris

That code is presumably compiled by someone. If whoever compiles it specifies -fno-builtin, the attribute would be added to it. It doesn't affect its clients.

-Chris

Hi Bill,

I think the concept of this patch makes sense, but the implementation does not.

I have:

void foo() {
printf("hello\n");
}

and I build with -fno-builtin-puts. If I understand correctly, *foo* will be marked with "nobuiltin", but this code in simplifylibcalls looks at the printf:

Value *LibCallSimplifier::optimizeCall(CallInst *CI) {
+ Function *F = CI->getCalledFunction();
+ if (F->hasFnAttribute(Attribute::NoBuiltin)) return 0;
return Impl->optimizeCall(CI);
}

In the context of LTO, it makes sense for the attribute to be on function bodies, not on prototypes.

Yeah, I noticed that after sending this patch. I modified it to check the function CI is in for that attribute.

Was that in the patch you committed? What specific patch are you looking for?

Once we have support for the `-fno-builtin-FUNCTION' flag, I expect the attribute to look something like this:

  "no-builtin-functions" = "puts,foo,bar"

I guess this could work, this means that simplifylibcalls (and others) would check this instead of TargetLibraryInfo?

Out of curiosity, how important is -fno-builtin-foo? Isn't it enough to just have -fno-builtin-foo disable *all* builtin optimizations for the unit of code being compiled? This is overly conservative - is that actually a problem for something?

-Chris

I thought that no-builtin attached to foo means that foo should not be considered a builtin function. For example, if someone wrote their own printf, they may want to mark it as no-builtin so that the compiler doesn't assume it's the printf from libc.

What does it mean in LLVM?

-Krzysztof

Out of curiosity, how important is -fno-builtin-foo?

Depends, does that include alloca and the object size stuff?

Isn't it enough to just have -fno-builtin-foo disable *all* builtin
optimizations for the unit of code being compiled? This is overly
conservative - is that actually a problem for something?

Making -fno-builtin-foo imply -fno-builtin seems overly aggressively. I
can imagine wanting to use the former when implementing libc to get more
predictable behavior, but still wanting to get e.g. the string
optimisations.

As I mentioned on IRC, there is also the related issue of explicitly
requesting a builtin, i.e. a positive __attribute__((__builtin__)) for
freestanding code. I don't think that -fno-builtin-puts should imply
-fno-builtin-printf. It should apply to function calls in this module,
but may be merged by adding the nobuiltin attribute to the prototypes in
all other modules during LTO.

Joerg

Hi Bill,

I think the concept of this patch makes sense, but the implementation does not.

I have:

void foo() {
printf("hello\n");
}

and I build with -fno-builtin-puts. If I understand correctly, *foo* will be marked with "nobuiltin", but this code in simplifylibcalls looks at the printf:

Value *LibCallSimplifier::optimizeCall(CallInst *CI) {
+ Function *F = CI->getCalledFunction();
+ if (F->hasFnAttribute(Attribute::NoBuiltin)) return 0;
return Impl->optimizeCall(CI);
}

In the context of LTO, it makes sense for the attribute to be on function bodies, not on prototypes.

Yeah, I noticed that after sending this patch. I modified it to check the function CI is in for that attribute.

Was that in the patch you committed? What specific patch are you looking for?

Yeah, that was in the one that I committed. I basically want something like this:

  define void @foo() "no-builtin" {
    call void @printf()
  }

And then the `call' to printf would check for the "no-builtin" attribute on `@foo' to determine if it can convert it to `puts'. Having the attribute on the declaration of `printf' doesn't help us here. And I'm not sure if having it on the call would be beneficial either.

Once we have support for the `-fno-builtin-FUNCTION' flag, I expect the attribute to look something like this:

  "no-builtin-functions" = "puts,foo,bar"

I guess this could work, this means that simplifylibcalls (and others) would check this instead of TargetLibraryInfo?

Yes.

Out of curiosity, how important is -fno-builtin-foo? Isn't it enough to just have -fno-builtin-foo disable *all* builtin optimizations for the unit of code being compiled? This is overly conservative - is that actually a problem for something?

Being overly conservative will of course keep the correct semantics for the program. I suppose that there could be some people who have their own special version of a builtin that they want to use, but they don't want all builtin optimizations to be disabled.

-bw

That code is presumably compiled by someone. If whoever compiles it specifies -fno-builtin, the attribute would be added to it. It doesn't affect its clients.

I thought that no-builtin attached to foo means that foo should not be considered a builtin function. For example, if someone wrote their own printf, they may want to mark it as no-builtin so that the compiler doesn't assume it's the printf from libc.

This is the GCC definition of the `-fno-builtin' flags:

`-fno-builtin'
`-fno-builtin-FUNCTION'
     Don't recognize built-in functions that do not begin with
     `__builtin_' as prefix. *Note Other built-in functions provided
     by GCC: Other Builtins, for details of the functions affected,
     including those which are not built-in functions when `-ansi' or
     `-std' options for strict ISO C conformance are used because they
     do not have an ISO standard meaning.

     GCC normally generates special code to handle certain built-in
     functions more efficiently; for instance, calls to `alloca' may
     become single instructions that adjust the stack directly, and
     calls to `memcpy' may become inline copy loops. The resulting
     code is often both smaller and faster, but since the function
     calls no longer appear as such, you cannot set a breakpoint on
     those calls, nor can you change the behavior of the functions by
     linking with a different library. In addition, when a function is
     recognized as a built-in function, GCC may use information about
     that function to warn about problems with calls to that function,
     or to generate more efficient code, even if the resulting code
     still contains calls to that function. For example, warnings are
     given with `-Wformat' for bad calls to `printf', when `printf' is
     built in, and `strlen' is known not to modify global memory.

     With the `-fno-builtin-FUNCTION' option only the built-in function
     FUNCTION is disabled. FUNCTION must not begin with `__builtin_'.
     If a function is named this is not built-in in this version of
     GCC, this option is ignored. There is no corresponding
     `-fbuiltin-FUNCTION' option; if you wish to enable built-in
     functions selectively when using `-fno-builtin' or
     `-ffreestanding', you may define macros such as:

          #define abs(n) __builtin_abs ((n))
          #define strcpy(d, s) __builtin_strcpy ((d), (s))

What does it mean in LLVM?

I believe that it should have an analogous meaning.

-bw

Hi Chris,

And what I'm wondering is if you could could annotate two cases in the
frontend:

a) Mark a call to a function 'foo' as no-builtin to prevent exactly *that*
callsite from being transformed based on special knowledge of the function
being called.

b) Mark a function definition of 'foo' as not being subject to synthesized
calls to a library function.

With (a), you need an attribute on the call site, but you don't need to
know *which* builtin to suppress.

With (b), you need an attribute on the function definition, but I think
what Chris was suggesting was that *this* half of implementing -fno-builtin
could in fact be implemented without a specific list of non-builtins, which
I actually mostly agree with... I'm fairly OK turning off essentially all
of the libcall-synthesizing for any TU where we have calls to functions
that can't be reasoned about as builtin functions.

However, now that I think about it, I think there is an alternative to (b)
that I like better, and actually goes particularly well with (a): for any
'foo' in '-fno-builtin-foo', *introduce* a declaration of the function and
attach the 'no-builtin' attribute to it. If code goes on to define that
function, great, attach the 'no-builtin' attribute to that. Attach it to
all of the calls, etc.

Then, when we hit (2) above, the libcall simplification can look for an
existing declaration, and if that declaration has the attribute, we don't
synthesize it.

(I feel like this may have been Bill's original proposal, and we're
circling around now... apologies if so...)

After reading the description quoted by Bill, I'm not sure what you mean by attaching the attribute to the function body. Suppose I have my own version of strlen, written in whatever language and present in an .o file, and a function foo in C which calls strlen, which is compiled with clang. It makes sense to put the "nobuiltin" attribute on the prototype of "strlen", since this way all the callers of the strlen would know what they are dealing with.

-Krzysztof

Having the attribute on the prototype of printf would tell us that calls to printf cannot be converted to calls to anything else. Isn't that what you want?

If you want to prevent the optimizations on foo that take advantage of the knowledge about certain known functions, then we should use a different attribute for that, or else things can get really confusing.

-Krzysztof

In the context of LTO, it makes sense for the attribute to be on function bodies, not on prototypes.

Yeah, I noticed that after sending this patch. I modified it to check the function CI is in for that attribute.

Was that in the patch you committed? What specific patch are you looking for?

Yeah, that was in the one that I committed. I basically want something like this:

define void @foo() "no-builtin" {
   call void @printf()
}

And then the `call' to printf would check for the "no-builtin" attribute on `@foo' to determine if it can convert it to `puts'. Having the attribute on the declaration of `printf' doesn't help us here. And I'm not sure if having it on the call would be beneficial either.

Ok, this makes a lot of sense to me.

Once we have support for the `-fno-builtin-FUNCTION' flag, I expect the attribute to look something like this:

  "no-builtin-functions" = "puts,foo,bar"

I guess this could work, this means that simplifylibcalls (and others) would check this instead of TargetLibraryInfo?

Yes.

Sounds good, much of TLI can just disappear when this happens, woot.

Out of curiosity, how important is -fno-builtin-foo? Isn't it enough to just have -fno-builtin-foo disable *all* builtin optimizations for the unit of code being compiled? This is overly conservative - is that actually a problem for something?

Being overly conservative will of course keep the correct semantics for the program. I suppose that there could be some people who have their own special version of a builtin that they want to use, but they don't want all builtin optimizations to be disabled.

I guess I'm just asking "how much do we care" about that use case? Is it worth uglifying IR dumps?

-Chris

I think that there is general confusion here about what -fno-builtin does. It is *not* an attribute that affects the definition of builtin functions, it is a statement that the code being compiled in the current translation unit should change behavior.

Consider two translation units, A.c and B.c:

--- A.c, compiled with -fno-builtin-puts
void foo() {
  printf("hello world\n");
}

--- B.c, compiled *without* -fno-builtin.

void bar() {
  printf("goodbye world\n");
}

After LTO, we need to know that it is safe to optimize goodbye world into a call to puts, even though: 1) there is no prototype at all in the IR for puts, 2) two different functions (foo and bar) have different builtin-optimization-allowedness.

There are only two correct ways to model this: 1) an IR attribute on the function *bodies* being compiled in a translation unit, or 2) as an IR attribute on each *call* in the code being compiled in a translation unit.

If we want to be pedantically correct, #2 is really the right way to go, because then differences in this attribute would not affect inlining. For example, if we had:

--- C.c
void x() {
  foo();
  bar();
}

Then we should be able to inline both calls to printf into x(), then know that only one could be optimized.

-Chris