Where do llvm.memcpy.i64 and friends get lowered?

Hi,

I am working on the COFF backend and am wondering where llvm.memcpy gets lowered to memcpy?

It seems to be done by the assembler backends. I would have thought it would be done either by a lowering pass or by the mangler, but it does not seem to be either.

Many thanks in advance,

Aaron

Hi,

I am working on the COFF backend and am wondering where llvm.memcpy gets lowered to memcpy?

The instruction selector does this, it turns it into "ExternalGlobal" operands.

It seems to be done by the assembler backends

What code are you referring to?

-Chris

It's done by ISel. See SelectionDAG::getMemcpy.

-Eli
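As a rough illustration of why 'memcpy' appears in the output without ever being in the Module: the external symbol name is chosen during lowering. The sketch below is a toy model only. The function name libcallSymbolFor is hypothetical, and the real instruction selector works on DAG nodes rather than on name strings; it may also expand small copies inline instead of emitting a call.

```cpp
#include <string>

// Illustrative only: real selection happens on DAG nodes, not on name strings.
// Returns the C library symbol that a memory intrinsic may end up calling,
// or an empty string when the name is not one of the three modeled here.
std::string libcallSymbolFor(const std::string &intrinsicName) {
    static const char *const kBases[] = {"memcpy", "memmove", "memset"};
    for (const char *base : kBases) {
        // Intrinsic names carry a type suffix, e.g. "llvm.memcpy.i64".
        const std::string prefix = std::string("llvm.") + base + ".";
        if (intrinsicName.compare(0, prefix.size(), prefix) == 0)
            return base;
    }
    return std::string();
}
```

Under this model, "llvm.memcpy.i64" maps to "memcpy", while an ordinary function such as "my_memcpy" maps to nothing.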

I am iterating through the Module's symbols for 'test/CodeGen/X86/memcpy.bc'.

I get:

---------- Functions ----------
llvm.memcpy.i64
Mangled name = llvm.memcpy.i64
DefaultVisibility
ExternalLinkage - Externally visible.
my_memcpy
Mangled name = my_memcpy
DefaultVisibility
ExternalLinkage - Externally visible.
my_memcpy2
Mangled name = my_memcpy2
DefaultVisibility
ExternalLinkage - Externally visible.
abort
Mangled name = abort
DefaultVisibility
ExternalLinkage - Externally visible.
----------- Globals -----------
----------- Aliases -----------

I am using this to define my symbols, but the relocations contain ‘memcpy’.

Aaron

2009/7/18 Eli Friedman <eli.friedman@gmail.com>

I am iterating through the Module's symbols for 'test/CodeGen/X86/memcpy.bc'.

If you’re iterating over functions, just ignore all intrinsics.

-Chris
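A minimal sketch of the "ignore all intrinsics" advice, assuming only that intrinsic names are reserved under the "llvm." prefix. FunctionEntry, isIntrinsicName and nonIntrinsicNames are hypothetical stand-ins so the example is self-contained; in LLVM itself the check would be made on the Function object (for example via Function::isIntrinsic()) rather than on a bare string.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a function entry seen while walking a module.
struct FunctionEntry {
    std::string name;
};

// Intrinsic names are reserved under the "llvm." prefix.
bool isIntrinsicName(const std::string &name) {
    return name.compare(0, 5, "llvm.") == 0;
}

// Return the names that are not intrinsics, in their original order.
std::vector<std::string> nonIntrinsicNames(const std::vector<FunctionEntry> &functions) {
    std::vector<std::string> result;
    for (const FunctionEntry &f : functions)
        if (!isIntrinsicName(f.name))
            result.push_back(f.name);
    return result;
}
```

Applied to the four functions listed above, this would keep my_memcpy, my_memcpy2 and abort and drop llvm.memcpy.i64.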

2009/7/18 Chris Lattner <clattner@apple.com>

I am iterating through the Module's symbols for 'test/CodeGen/X86/memcpy.bc'.

If you’re iterating over functions, just ignore all intrinsics.

Okay, but it would be nice if the Module object reflected the lowered symbol names like ‘memcpy’ too. It seems a bit of an inconsistency for the Module not to reference this, so that it has to be picked up from the relocations.

Also, there is the matter of ‘abort’, which seems to be defined by most if not all modules even if it is not being used. I am using GlobalValue::getUses() to filter it. What’s going on with ‘abort’?

Aaron

2009/7/18 Aaron Gray <aaronngray.lists@googlemail.com>


In fact, I cannot use getUses(), as other “exported” symbols also have zero uses.

My DiagWriter program is giving me:

---------- Functions ----------
llvm.memcpy.i64
Mangled name = llvm.memcpy.i64
DefaultVisibility
ExternalLinkage - Externally visible.
Uses = 2
my_memcpy
Mangled name = _my_memcpy
DefaultVisibility
ExternalLinkage - Externally visible.
Uses = 0
my_memcpy2
Mangled name = _my_memcpy2
DefaultVisibility
ExternalLinkage - Externally visible.
Uses = 0
abort
Mangled name = _abort
DefaultVisibility
ExternalLinkage - Externally visible.
Uses = 0
----------- Globals -----------
----------- Aliases -----------

What is going on with ‘abort’? Why is it nearly always there, but not always (if I am correct)?

2009/7/18 Chris Lattner <clattner@apple.com>

I am iterating through the Module's symbols for 'test/CodeGen/X86/memcpy.bc'.

If you’re iterating over functions, just ignore all intrinsics.

Okay, but it would be nice if the Module object reflected the lowered symbol names like ‘memcpy’ too. It seems a bit of an inconsistency for the Module not to reference this, so that it has to be picked up from the relocations.

If you’re iterating over the Module, you’re not dealing with the codegen level. We want the code generator to change as little as possible when doing codegen.

Also, there is the matter of ‘abort’, which seems to be defined by most if not all modules even if it is not being used. I am using GlobalValue::getUses() to filter it. What’s going on with ‘abort’?

I have no idea what you’re talking about.

-Chris

2009/7/19 Chris Lattner <clattner@apple.com>

2009/7/18 Chris Lattner <clattner@apple.com>

I am iterating through the Module's symbols for 'test/CodeGen/X86/memcpy.bc'.

If you’re iterating over functions, just ignore all intrinsics.

Okay, but it would be nice if the Module object reflected the lowered symbol names like ‘memcpy’ too. It seems a bit of an inconsistency for the Module not to reference this, so that it has to be picked up from the relocations.

If you’re iterating over the Module, you’re not dealing with the codegen level. We want the code generator to change as little as possible when doing codegen.

No, I’m working on the COFFWriter. It’s just that symbols are being introduced via relocations, but AFAICT they are not defined in the Module. This seems to be happening with memcpy and friends.

Also, there is the matter of ‘abort’, which seems to be defined by most if not all modules even if it is not being used. I am using GlobalValue::getUses() to filter it. What’s going on with ‘abort’?

I have no idea what you’re talking about.

There is a function symbol called ‘abort’ that is being defined, whether it is used or not. It’s not defined in the ‘.ll’ file and its usage count is zero, yet it is nearly always defined. I think there were a few test cases where it was not present; I will have to double-check this.

Aaron

I am getting a similar problem with ___main appearing if ‘@main’ is used, but there is no instance of it in the Module iterators, only in the relocations.

Is it possible to do something about these ‘ghost symbols’ and have them defined in the Module, rather than having to check for them and pick them up when processing relocations?

Aaron

2009/7/19 Aaron Gray <aaronngray.lists@googlemail.com>

I am getting a similar problem with ___main appearing if '@main' is used, but
there is no instance of it in the Module iterators, only in the relocations.
Is it possible to do something about these 'ghost symbols' and have them
defined in the Module, rather than having to check for them and pick them up
when processing relocations?

Unfortunately, no. Such symbols are emitted during code generation only
and are target-specific. There is no tight connection with the Module at
this time.

You should not be walking the module.

-Chris

2009/7/19 Chris Lattner <clattner@apple.com>

I am getting a similar problem with ___main appearing if ‘@main’ is used, but there is no instance of it in the Module iterators, only in the relocations.

Is it possible to do something about these ‘ghost symbols’ and have them defined in the Module, rather than having to check for them and pick them up when processing relocations?

You should not be walking the module.

How else do you get the right symbol info?

But as I have said, lowering functions do not generate their substitute value symbols in the Module. They just magically appear in the relocations without being in the module. This is a design flaw in LLVM.

I have done a hack in the COFF writer’s relocation code to get round this for now. But it does not work very well with symbol indexes.

But I am thinking of doing a cleanup pass, or walking the module, to get this info.

What do you mean by “You should not be walking the module”?

Thanks,

Aaron

2009/7/19 Chris Lattner <clattner@apple.com>

I am getting a similar problem with ___main appearing if ‘@main’ is used, but there is no instance of it in the Module iterators, only in the relocations.

Is it possible to do something about these ‘ghost symbols’ and have them defined in the Module, rather than having to check for them and pick them up when processing relocations?

You should not be walking the module.

How else do you get the right symbol info?

You see symbol definitions and references come through the code generator.

But as I have said, lowering functions do not generate their substitute value symbols in the Module. They just magically appear in the relocations without being in the module. This is a design flaw in LLVM.

You do not understand how the code generator works.

I have done a hack in the COFF writer’s relocation code to get round this for now. But it does not work very well with symbol indexes.

But I am thinking of doing a cleanup pass, or walking the module, to get this info.

What do you mean by “You should not be walking the module”?

You should not walk the function list of the module. You should not use Module* for anything other than handling file scope inline asm and walking the global variables.

-Chris
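One way to read this advice is that an object writer can build its symbol table from what the code generator hands it, rather than from the Module's function list. The sketch below is a self-contained model under that assumption. SymbolTable, define and reference are hypothetical names and not part of any LLVM writer interface. A symbol gets its index the first time it is seen, whether through a definition or through a relocation, so names such as memcpy or ___main need no special case.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical symbol table for an object writer. An index is assigned the
// first time a name is seen, through either a definition or a relocation.
class SymbolTable {
public:
    struct Symbol {
        std::string name;
        bool defined;
    };

    // Record a definition emitted by the code generator.
    std::size_t define(const std::string &name) {
        std::size_t index = lookupOrCreate(name);
        symbols_[index].defined = true;
        return index;
    }

    // Record a reference made by a relocation. A symbol that is only ever
    // referenced stays undefined, i.e. external.
    std::size_t reference(const std::string &name) {
        return lookupOrCreate(name);
    }

    const std::vector<Symbol> &symbols() const { return symbols_; }

private:
    std::size_t lookupOrCreate(const std::string &name) {
        auto it = indexByName_.find(name);
        if (it != indexByName_.end())
            return it->second;
        std::size_t index = symbols_.size();
        symbols_.push_back(Symbol{name, false});
        indexByName_.emplace(name, index);
        return index;
    }

    std::vector<Symbol> symbols_;
    std::unordered_map<std::string, std::size_t> indexByName_;
};
```

With this shape, defining _my_memcpy and then referencing _memcpy twice from relocations yields two entries, and both relocations share the same index for the undefined _memcpy entry.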
