MachineInstr: external symbols problem

Hello,
I just wrote the code like this:

BuildMI(BB, NM::CALL, 1)
                    .addExternalSymbol(("_lvksda_control_marker_"
                                       + lexical_cast<string>(bb)).c_str());

and got some unexpected string in the assembler output. The problem is that
when external symbol is added to MachineInstruction, MachineOperand is
created with the char* argument, but the argument is not copied, it's just
stored in MachineOperand::symbolName. So, all long as string temporary is
destroyed, that pointer points to garbage.

I suppose I could handle this by adding function to LLVM module and
using .addGlobalValue, but still, is the non-copying behaviour of constructor
by design?

- Volodya

Yes, this is by design. MachineOperand is supposed to be a very light-weight union. We actually used to have it carry an std::string around, but this required conditional code to make sure to free it and copy it when neccesary.

If you'd like to do this, there are a couple of options: first, you could build the table in the asmprinter, and have the operands point into the string table. Second, you could add the functions to the Module, though that is less appealing (ideally the code generator would not hack on the LLVM module at all... we're not there yet, but slowly getting there).

Another thing that might be interesting is the new llvm.pcmarker intrinsic that Andrew recently added. I have no idea if it would be useful to you or not, but... it's documented here:
http://llvm.cs.uiuc.edu/docs/LangRef.html#i_pcmarker

-Chris

> Hello,
> I just wrote the code like this:
>
> BuildMI(BB, NM::CALL, 1)
> .addExternalSymbol(("_lvksda_control_marker_"
> +
> lexical_cast<string>(bb)).c_str());
>
> and got some unexpected string in the assembler output. The problem is
> that when external symbol is added to MachineInstruction, MachineOperand
> is created with the char* argument, but the argument is not copied, it's
> just stored in MachineOperand::symbolName. So, all long as string
> temporary is destroyed, that pointer points to garbage. I suppose I
> could handle this by adding function to LLVM module and using
> .addGlobalValue, but still, is the non-copying behaviour of constructor
> by design?

Yes, this is by design. MachineOperand is supposed to be a very
light-weight union. We actually used to have it carry an std::string
around, but this required conditional code to make sure to free it and
copy it when neccesary.

Yea, could be tricky with union.

If you'd like to do this, there are a couple of options: first, you could
build the table in the asmprinter, and have the operands point into the
string table.

Looks reasonable.

Second, you could add the functions to the Module, though
that is less appealing (ideally the code generator would not hack on the
LLVM module at all... we're not there yet, but slowly getting there).

Yea, I understand that. That's why I've asked.

Another thing that might be interesting is the new llvm.pcmarker intrinsic
that Andrew recently added. I have no idea if it would be useful to you
or not, but... it's documented here:
http://llvm.cs.uiuc.edu/docs/LangRef.html#i_pcmarker

In fact, I'm having problems exactly while lowering the pcmarker intrinsic :wink:
Need to get in to assembler somehow, and I already have a code that parses
assembler and recognised calls to functions with certain name patterns as
markers.

- Volodya

It looks like the Alpha backend treats these as pseudo instructions:

def PCLABEL : PseudoInstAlpha<(ops s64imm:$num), "PCMARKER_$num:\n">;

maybe you could do something similar? If not, I think the string table is the way to go...

-Chris

Interesting trick. It looks like '$num' inside the string is interpreted
automatically. Is it matches with "$num" in the first argument? If the first
argument is DAG, will it work if my backend does not use pattern matching?

Thanks,
Volodya

Yup, exactly. This is provided by the asm-printer-generator, so if you're using it, you should be good. You don't need to be using the new dag isel stuff to use this. If you look at the targets in mainline, the printInstruction method is automatically generated for them, which takes care of this.

-Chris

Note that due to the non-existance type checking in the asm-printer,
$num could just as well be a label added with .addGlobalSymbol(...) and
it would do the substitution with that. The s64imm is just a hint
really (probably difference in the JIT, but if you are not worried about
that...).

Andrew