Chris Lattner wrote:
> And why isn't it possible to just make those functions known to LLVM?
> After all, *I think*, if this function is to be called, it should be
> declared in assembler, and so you have to pass some information about
> those function to the code printer. (Of course, it's possible to just
> directly print the declarations, but that's scary).
If you wanted to do that, it would be fine. Be aware that the code
generators are set up as function passes though, so you would have to
insert all function prototypes in the doInitialization(...) method of the
function pass: you can't just do it on the fly from runOn*Function.
Yes, I understand that.
The real reason that we aren't doing this currently is that we don't want
code generators to be hacking on the LLVM module. This greatly interferes
with JIT-style multi-pass optimization and other things. Unfortunately,
we are a long way from this though, as the lowering passes hack on the
LLVM and other stuff does as well. Unless you have a good reason to do
so, I would suggest trying to use MO_ExternalFunction just to make future
refactoring easier.
I think I more or less understand this motivation.
> There's another issue I don't understand. The module consists of
> functions and constants. I'd expect that external function declarations
> are also constants, with appropriate type. However, it seems they are not
> included in [Module::gbegin(), Module::gend()]; instead, they are Function
> objects with isExternal set to true.
Module::gbegin/gend iterate over the global variables, and ::begin/end
iterate over the functions, some of which may be prototypes. Function
prototypes aren't really any more "constant" than other functions are.
I disagree. Say there's a declaration of an external function "printf". Then
it's just a constant global address. In assembler it will be
extern printf: label;
which is not that different from assembler for other constants. For example,
for external data reference I have to produce the same assembler.
BTW, there's an inconsistency in how the X86 backend handles constants and
functions.
Consider:
%.str_1 = constant [11 x sbyte] c"'%c' '%c'\0A\00"
implementation ; Functions:
declare int %printf(sbyte*, ...)
int %main() {
entry:
%tmp.0.i = call int (sbyte*, ...)*
%printf( sbyte* getelementptr ([11 x sbyte]* %.str_1, long 0, l
ret int 0
}
The assembler produced by the X86 backend is:
call printf
........
.globl _2E_str_1
.data
.align 1
.type _2E_str_1,@object
.size _2E_str_1,11
_2E_str_1:
That is, the name ".str_1" is mangled, but the name of the function is not. I
don't see the reason for handling those two kinds of names differently.
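
To make the mangling in the listing above concrete, here is a minimal sketch
of an escaping rule that is consistent with that output: characters that are
not legal in an assembler identifier are replaced by their hex code between
underscores. The function name mangleName and the exact rule are assumptions
for illustration; this is not the X86 backend's actual mangler.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper, not LLVM's real mangler: keep letters, digits and
// underscores, and escape every other character as _XX_, where XX is the
// character's code in uppercase hex.
std::string mangleName(const std::string &name) {
    static const char hex[] = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : name) {
        if (std::isalnum(c) || c == '_') {
            out += static_cast<char>(c);
        } else {
            out += '_';
            out += hex[c >> 4];   // high nibble
            out += hex[c & 0xF];  // low nibble
            out += '_';
        }
    }
    return out;
}
```

Under this rule ".str_1" becomes "_2E_str_1" ('.' is 0x2E), while "printf"
contains nothing to escape and comes out unchanged, so applying the same rule
to both kinds of names would still leave "call printf" as it is.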
> To me this seems a bit confusing -- it would be clearer if there were plain
> functions with bodies and everything else were GlobalValue.
The reason that we don't want to do this is that it makes it more
difficult to create a function and then fill in its body. Currently when
you create a function, you get a prototype. When you fill in its body,
you now have a defined function. In your scheme, the function prototype
and defined function objects would be different: to go from one to the
other, you would have to delete the object and reallocate it.
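
The point about prototypes and definitions sharing one object can be sketched
with a toy model. The class below is only an illustration under assumed names
(ToyFunction, appendBlock); it is not the LLVM Function class. A function is
"external" exactly while it has no basic blocks, so filling in the body never
requires replacing the object.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a basic block; only the label matters here.
struct ToyBasicBlock {
    std::string label;
};

// Toy stand-in for a function: a prototype is simply a function with an
// empty body, so the same object serves as both declaration and definition.
class ToyFunction {
    std::string name_;
    std::vector<ToyBasicBlock> blocks_;

public:
    explicit ToyFunction(std::string name) : name_(std::move(name)) {}

    const std::string &name() const { return name_; }

    // True while no body has been supplied.
    bool isExternal() const { return blocks_.empty(); }

    // Adding the first block turns the prototype into a definition in place.
    void appendBlock(const std::string &label) {
        blocks_.push_back(ToyBasicBlock{label});
    }
};
```

Any pointer to the ToyFunction taken while it was still a prototype remains
valid after appendBlock, which is the property the delete-and-reallocate
scheme would lose.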
Can't you store all functions in the list of global values? That would be
quite clear: all top-level module elements are global values, and are present
in the global list.
The functions list can contain either functions both with and without bodies,
or only functions with bodies. In the latter case, when you create a function,
it's added only to the global values list. When you add the first basic block,
it's also added to the list of functions.
> Another question is about SymbolTable. Is it true that it's a mapping
> from name to objects in Module, and that all objects accessible via
> SymbolTable are either in the list of functions or in the list of global
> values?
Yup. There are also function-local symbol tables as well.
I wouldn't recommend depending too much on the names, because LLVM has an
unusual mechanism where it allows objects with different types to have
the same name. This means you can have:
int %foo(int %X) { ret int %X }
float %foo(float %X) { ret float %X }
In the context of a code generator, you should use the NameMangler
interface to make everything just work.
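
As a rough illustration of why a name alone is not a unique key here, the
sketch below models a table indexed by (type, name) pairs. ToySymbolTable and
its methods are hypothetical, and types are represented as plain strings; the
real SymbolTable class is organized differently.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Toy table keyed by (type, name): two entries may share a name as long as
// their types differ.
class ToySymbolTable {
    std::map<std::pair<std::string, std::string>, int> table_;

public:
    // Returns false if an entry with the same type AND name already exists.
    bool insert(const std::string &type, const std::string &name, int id) {
        return table_.emplace(std::make_pair(type, name), id).second;
    }

    // Number of entries that share this name, regardless of type.
    std::size_t countByName(const std::string &name) const {
        std::size_t n = 0;
        for (const auto &entry : table_)
            if (entry.first.second == name)
                ++n;
        return n;
    }
};
```

With "int (int)" and "float (float)" as the two types, both "foo" entries
are accepted, so a lookup by name alone is ambiguous and a code generator
needs some extra disambiguation before emitting assembler labels.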
If you're doing something else and think you need the symbol table, please
let me know. Clients of the SymbolTable class are extremely rare (by
design). The SymbolTable class is mostly an internal class that is
automagically used by the system to provide naming invariants and allow
efficient lookup for the rare clients that need it.
Thanks for the explanation. I don't have a use for SymbolTable yet; I was just
wondering if I have to use it for something 