Strange character around function declared with "asm"

Hi everyone,

I am trying to instruct clang to generate llvm intrinsics in place of
certain function calls, and to do so I am using the "asm" tag. For
example:

void MyFunc(int a, int b) asm("llvm.myfunc");

void DoSomething() {
  int x = 1;
  int y = 2;

  MyFunc(x, y);
}

However, if I try to compile the above function with "clang -emit-llvm
-S test.c -o test.s" I obtain:

define void @DoSomething() nounwind {
  %x = alloca i32, align 4
  %y = alloca i32, align 4
  store i32 1, i32* %x, align 4
  store i32 2, i32* %y, align 4
  %1 = load i32* %x, align 4
  %2 = load i32* %y, align 4
  call void @"\01llvm.myfunc"(i32 %1, i32 %2)
  ret void
}

declare void @"\01llvm.myfunc"(i32, i32)

Is there a reason for the quotation marks (") and the \01 before the
intrinsic call? I tried to get the same assembly from llvm-gcc, and in
that case there are no quotation marks or other artefacts.

Thanks,
Lorenzo

Apparently clang is mangling it this way on purpose as far as I can
tell... I'm not really sure why though.

John?

-eric

The gnu "asm renaming" extension says that USER_LABEL_PREFIX is not prefixed onto the symbol name. The \1 prefix on the LLVM name tells the backend (through the 'Mangler' class) not to add it.

-Chris
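
To illustrate the rule Chris describes, here is a minimal sketch in C. It is not LLVM's actual Mangler code: emit_symbol_name and its parameters are hypothetical names chosen for this example. The assumption is that the platform prefix is "_" on Darwin and empty on Linux, and that a leading \01 byte means "emit the rest of the name verbatim".

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of LLVM): writes into `out` the symbol
   name that would reach the assembler for a given IR-level name. */
void emit_symbol_name(const char *ir_name, const char *user_label_prefix,
                      char *out, size_t out_size) {
    assert(ir_name != NULL && user_label_prefix != NULL && out != NULL);
    if (ir_name[0] == '\1') {
        /* Leading \01 escape: drop the marker, add no platform prefix. */
        snprintf(out, out_size, "%s", ir_name + 1);
    } else {
        /* Ordinary name: prepend the platform's user label prefix. */
        snprintf(out, out_size, "%s%s", user_label_prefix, ir_name);
    }
}
```

Under these assumptions, "\01llvm.myfunc" with a "_" prefix comes out as llvm.myfunc, while an ordinary name such as DoSomething comes out as _DoSomething. With an empty prefix the two cases are indistinguishable, which is why the escape only matters on targets that have a prefix.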

Interesting. llvm-gcc bug then?

-eric

What's the bug?

-Chris

There's no \01 prefix in the llvm-gcc version of the module.

-eric

"asm" means actual assembly, not IR. That prefix is an escape which tells LLVM to emit the symbol with exactly that name.

AFAIK, there's no way to get clang to emit a user symbol with the name @llvm.myfunc. We could probably special-case support for this if we see an asm label with a prefix of "llvm." or "clang.", but I'm not sure it's a particularly good idea.

John.

On linux or darwin? IIRC, on linux there is no USER_LABEL_PREFIX so it doesn't matter if there is a \1 or not.

-Chris

darwin.

Just checked out of curiosity :)

-eric

Ah, I didn't notice that the name was "llvm.foo". It is entirely possible that llvm-gcc has a special hack for symbols that start with "llvm.", but I don't see it with a quick look. This is dangerous though, because the intrinsics change over time, and not getting the right argument types would cause the middle-end to blow up.

-Chris

OK. If you file it I'll fix it :)

-eric

Thanks everyone. I guess the question then becomes how can I instruct
CLANG to emit calls to user-defined intrinsics when it generates LLVM
bytecode.

Does anyone have a suggestion? Or if someone can point me to the code
that generates the asm code, I can do some dirty hack (without
submitting it!) to handle the case where the symbol starts with
"llvm.". As was said, it is probably not a good idea in general,
but it would be good enough for my purpose :)

Thanks,
Lorenzo

Which intrinsic is it? Is this something you're adding to LLVM, or something standard?

-Chris

Chris,

It is a bunch of intrinsics that I defined myself, and it only makes
sense with the code I am compiling.

Lorenzo

If you're running a post-processing step on the IR anyway, just give your intrinsics ordinary function names.

John.

Uhm... what do you mean? Post-processing the IR code with an external
script? I personally think it would be a much cleaner solution to
generate the intrinsics within CLANG. I googled for a while but the
only thing I found was the "asm" solution, which as I said seems to
work in llvm-gcc but not in CLANG. So I was wondering if there is an
alternative way which does not involve going outside CLANG/LLVM.

Lorenzo

Well, I assume by "intrinsic" you mean that you can't actually *run* compiled
code containing these calls without further processing, because these functions
don't actually exist in your runtime. So somewhere downstream you're going
to examine or modify the IR, and in that case, there's no reason that code can't
look for a function called "LorenzosMagicFunction" instead of a function called
"llvm.lorenzos.magic.function". There's no rule saying that special functions
have to have names starting with "llvm."; the existing intrinsics are only special
because some code somewhere knows to look for functions with those names.

John.
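
As a toy illustration of John's point that downstream code can key on any ordinary function name, the sketch below counts call sites of a named function in a textual IR listing. count_calls_to is a hypothetical helper written for this example. A real post-processing step would walk the module through the LLVM API rather than scan text. The sketch assumes one instruction per line and a function name shorter than the internal buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: counts lines in a textual IR listing that call
   @name. Declarations and definitions of @name are not counted. */
size_t count_calls_to(const char *ir_text, const char *name) {
    assert(ir_text != NULL && name != NULL);
    char needle[128];
    int n = snprintf(needle, sizeof needle, "@%s(", name);
    assert(n > 0 && (size_t)n < sizeof needle);

    size_t count = 0;
    const char *p = ir_text;
    while ((p = strstr(p, needle)) != NULL) {
        /* Walk back to the start of the line containing this match. */
        const char *line = p;
        while (line > ir_text && line[-1] != '\n')
            line--;
        /* Count the match only if "call " appears earlier on the line. */
        for (const char *q = line; q + 5 <= p; q++) {
            if (strncmp(q, "call ", 5) == 0) {
                count++;
                break;
            }
        }
        p += (size_t)n;
    }
    return count;
}
```

Nothing here depends on the name starting with "llvm."; the same lookup works for LorenzosMagicFunction or any other ordinary identifier.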

John,

This may be a solution, but I feel it is somewhat limited. If someone
could point me to the portion of code that generates the operands for
call, I could do a quick hack to avoid prefixing symbols when they
start with "llvm." (as is probably done in llvm-gcc). I acknowledge
that it is an ugly hack, but as I said it would be good enough for my
purpose (and I would not submit it upstream anyway).

Thanks,
Lorenzo

It's not specific to calls; what you want is the code that sets up the
modified function name, which is lib/CodeGen/Mangle.cpp:322.

John.