The discussion of malloc optimization from two (three?) weeks ago
prompts me to be concerned about implications for kernel compilation.
Basically, I have two questions:
1. The particular optimization that was done there was based on the
compiler substituting an alternate implementation of malloc(). This may
not be appropriate in kernel or deeply embedded systems. Is there a way
for someone who is building that sort of system to enable/disable the
builtin library magic selectively?
I assume that LLVM can support compiling for a "freestanding" C library,
and that this should disable everything of this sort, but it would be
useful to be able to then selectively re-enable things like memcpy.
2. The particular optimization in that case was also based on knowledge
that main() is not re-entrant, main() called f() once, f() called
malloc() once, and therefore that call to malloc() could be partially
evaluated at compile time. In kernels and such, main() is not the only
entry point and this particular optimization can't be done safely
without a whole-program view of all entry points.
What options do we need to provide to deal with this sort of thing?
shap
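To make question 2 concrete, here is a minimal sketch of the call pattern it describes. The names f and entry_once are illustrative only; entry_once is a hypothetical stand-in for main(), so the fragment can be compiled next to a separate driver. Because f has internal linkage and a single caller that itself runs once, a compiler that can see every entry point could in principle fold the allocation away. The sketch only shows the shape of the code, not what any particular LLVM pass does with it.

```c
#include <stdlib.h>

/* Internal linkage: no other translation unit can call f(). */
static int *f(void) {
    int *p = malloc(sizeof *p);   /* the single malloc() call */
    if (p != NULL) {
        *p = 42;
    }
    return p;
}

/* Hypothetical stand-in for main(): the only caller of f(). */
int entry_once(void) {
    int *p = f();
    if (p == NULL) {
        return -1;                /* allocation failed */
    }
    int value = *p;
    free(p);
    return value;
}
```

If a second, externally visible entry point also called f(), the "called exactly once" reasoning would no longer hold, which is the kernel case the question is about.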
> 1. The particular optimization that was done there was based on the
> compiler substituting an alternate implementation of malloc(). This may
> not be appropriate in kernel or deeply embedded systems. Is there a way
> for someone who is building that sort of system to enable/disable the
> builtin library magic selectively?
-ffreestanding
> I assume that LLVM can support compiling for a "freestanding" C library,
> and that this should disable everything of this sort, but it would be
> useful to be able to then selectively re-enable things like memcpy.
> 2. The particular optimization in that case was also based on knowledge
> that main() is not re-entrant, main() called f() once, f() called
> malloc() once, and therefore that call to malloc() could be partially
> evaluated at compile time. In kernels and such, main() is not the only
> entry point and this particular optimization can't be done safely
> without a whole-program view of all entry points.
> What options do we need to provide to deal with this sort of thing?
I don't think the logic is based on "main", but on whether functions
are marked "internal" or not. That said, GlobalOpt.cpp seems to reason
based on the name "main" (most likely bogus, because of constructors/
destructors running before main).
Ciao,
Duncan.
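The concern about code running before main can be illustrated with a small sketch, assuming GCC or Clang and the GNU constructor attribute. The names early_init and ran_before_main are made up for illustration. It shows only that a global can already have been modified by the time main() starts; it makes no claim about what GlobalOpt.cpp actually does with that fact.

```c
/* Starts at 0; changed by a constructor before main() is entered. */
static int initialized_before_main = 0;

/* GNU extension: the runtime calls this before main(). */
__attribute__((constructor))
static void early_init(void) {
    initialized_before_main = 1;
}

/* Accessor so that code running from main() can observe the flag. */
int ran_before_main(void) {
    return initialized_before_main;
}
```

Any reasoning of the form "this global still holds its static initializer when main() begins" would have to account for functions like early_init.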
Umm. That sounds like a test case that really wants to get written.
Does the LLVM C front-end support the "init" attribute for C functions
(which has similar effect)?
shap
>> 1. The particular optimization that was done there was based on the
>> compiler substituting an alternate implementation of malloc(). This may
>> not be appropriate in kernel or deeply embedded systems. Is there a way
>> for someone who is building that sort of system to enable/disable the
>> builtin library magic selectively?
> -ffreestanding
The flag to disable assumptions about library functions selectively is
-fno-builtin, if that's actually what you want. -ffreestanding is not quite the same;
it also disables special semantics for "main" and changes the predefinition of
__STDC_HOSTED__.
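A small sketch of the distinction, assuming GCC or Clang. __STDC_HOSTED__ is the standard macro mentioned above. __builtin_memcpy is a compiler builtin that stays available under its __builtin_ name even when -fno-builtin is in effect, which is one way to re-enable a single routine selectively. The function names built_hosted and copy_bytes are hypothetical.

```c
#include <stddef.h>

/* Returns 1 for a hosted build, 0 when built with -ffreestanding. */
int built_hosted(void) {
#if defined(__STDC_HOSTED__) && __STDC_HOSTED__ == 1
    return 1;
#else
    return 0;
#endif
}

/* Explicitly requests the builtin; -fno-builtin only stops the compiler
 * from treating the plain name memcpy as special. */
void *copy_bytes(void *dst, const void *src, size_t n) {
    return __builtin_memcpy(dst, src, n);
}
```

Note that the compiler may still lower __builtin_memcpy to a call to an external memcpy, so a freestanding environment typically has to supply one.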
After compiling a few kernels with llvm (with LTO), I haven't seen one
yet that names any of its allocators 'malloc', so in that case it's
kind of a moot point.
As for the matter of entry points, this isn't a problem either: lots of
things in a kernel (if properly marked 'used') don't get internalized,
so you don't wind up with a bytecode that has a single entry point
named 'main'. Oh, and main isn't a popular name for kernel entry
anyway.
Andrew
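As a rough illustration of marking something 'used', here is a sketch assuming GCC or Clang. The handler name timer_irq_handler is hypothetical; in a real kernel such a function would typically be referenced only from assembly or a vector table, which the optimizer cannot see.

```c
/* The 'used' attribute asks the compiler to keep this definition even
 * though no C code in the program appears to reference it. */
__attribute__((used))
int timer_irq_handler(int ticks) {
    return ticks + 1;   /* placeholder body for the sketch */
}
```

Per the message above, symbols marked this way are not internalized during LTO, so the resulting module keeps more entry points than just 'main'.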
>> I don't think the logic is based on "main", but on whether functions
>> are marked "internal" or not. That said, GlobalOpt.cpp seems to reason
>> based on the name "main" (most likely bogus, because of constructors/
>> destructors running before main).
> Umm. That sounds like a test case that really wants to get written.
huh?
> Does the LLVM C front-end support the "init" attribute for C functions
> (which has similar effect)?
Yes, we fully support it. I'm not sure why you think there is a bug here.
As several people have told you, LLVM is doing the right thing, and if you
want to build a kernel or something without libc, use -ffreestanding.
There are several existing examples of kernels built with LLVM.
-Chris
>> I don't think the logic is based on "main", but on whether functions
>> are marked "internal" or not. That said, GlobalOpt.cpp seems to reason
>> based on the name "main" (most likely bogus, because of constructors/
>> destructors running before main).
>
> Umm. That sounds like a test case that really wants to get written.
> huh?
Duncan asserted that there is a behavior in GlobalOpt.cpp that relies on
main being the single entry point. If he is mistaken, great. If he is
correct, a regression test case for this seems like a useful thing to
build. I'm not saying that somebody else should do it. In fact, it seems
like a fine way to start getting my hands dirty.
> Does the LLVM C front-end support the "init" attribute for C functions
> (which has similar effect)?
> Yes, we fully support it. I'm not sure why you think there is a bug here.
> As several people have told you, LLVM is doing the right thing, and if you
> want to build a kernel or something without libc, use -ffreestanding.
> There are several existing examples of kernels built with LLVM.
I did not say that I thought there was a bug here. That was Duncan. I
*asked* if LLVM assumes that main is the only entry point because of the
earlier malloc() discussion, which seemed to rely on knowing the control
flow from all possible entry points.
Since you seem to find my participation in the list excessive, and since
I would actually like to comply with the intended list policy, perhaps
you would be good enough to publicly answer the following question:
Are questions that are intended to help users develop an understanding
of the LLVM internals and assumptions considered "within bounds" on
this list or not?
If not, I will be happy to stop asking them.
shap
Yes, globalopt does that.
You're asking two questions:
1. Does LLVM incorrectly assume things that break C++ constructors and the GNU extensions?
2. Does LLVM assume things that break kernels?
The answer to #1 is no. The answer to #2 is "not if you use -ffreestanding". Why do you care about globalopt.cpp specifically?
-Chris
Only because Duncan mentioned it.
>> The answer to #1 is no. The answer to #2 is "not if you use
>> -ffreestanding". Why do you care about globalopt.cpp specifically?
> Only because Duncan mentioned it.
And I only mentioned it because it is the only transform in
Transforms/IPO which looks explicitly for "main". I didn't
take the time to see what it is doing exactly - perhaps the
reliance on special properties of "main" can be removed (which
would be nice).
Ciao,
Duncan.