#define LLVM_ASM_PREFIX_CHAR '\1' to make code more readable?

Hi,

llvm, clang and llvm-gcc contain numerous references to 1, '\1',
'\01', '\001', "\01" and "\01foo". This makes it difficult for one to
find all occurences of this prefix character and lacks a central place
where the prefix could be documented. (See for example
5480 – lli -force-interpreter tries to lookup symbols that still have \x01 in their name )

How about adding a #define for this prefix? The attached three patches
contain occurences of this prefix in llvm, clang and llvm-gcc that I
was able to identify but I don't know what would be a proper place for
the #define itself. Also should it maybe be LLVM_ASM_PREFIX or just
ASM_PREFIX? (I think both character and string versions are
needed. Afaik characters can not be concatenated to strings with
the preprocessor).

I am also not quite sure how the prefix should be documented. Does
"\01foo" mean "foo is a name that can be given to assembler or dlsym()
and should not ever be mangled in any way"?

llvm-asm-prefix1.patch (5.08 KB)

llvm-gcc-asm-prefix1.patch (2.76 KB)

clang-asm-prefix1.patch (30.4 KB)

Hi,

llvm, clang and llvm-gcc contain numerous references to 1, '\1',
'\01', '\001', "\01" and "\01foo". This makes it difficult for one to
find all occurences of this prefix character and lacks a central place
where the prefix could be documented. (See for example
5480 – lli -force-interpreter tries to lookup symbols that still have \x01 in their name )

How about adding a #define for this prefix? The attached three patches
contain occurences of this prefix in llvm, clang and llvm-gcc that I
was able to identify but I don't know what would be a proper place for
the #define itself. Also should it maybe be LLVM_ASM_PREFIX or just
ASM_PREFIX? (I think both character and string versions are
needed. Afaik characters can not be concatenated to strings with
the preprocessor).

Changing this from being a magic number to a 'static const char' is fine with me, please don't use a macro though. The declaration of this should go in Mangler.h

I am also not quite sure how the prefix should be documented. Does
"\01foo" mean "foo is a name that can be given to assembler or dlsym()
and should not ever be mangled in any way"?

\01 means 'do not add the "USER LABEL PREFIX" for the target', names definitely do need to be mangled for some targets even without this. For example, elf systems can't handle symbols with spaces in them.

-Chris

Chris Lattner <clattner@apple.com> writes:

Changing this from being a magic number to a 'static const char' is
fine with me, please don't use a macro though.

I do not like macros either. However, how should I remove magic
numbers from

} else if (Name == "\1stat64" ||
           Name == "\1lstat64" ||
           Name == "\1statvfs64" ||
           Name == "\1__isoc99_sscanf") {

without using a macro? Something like

} else if (Name[0] == llvm_asm_prefix && Name.slice(1, 0) == "stat64" ||
           Name[0] == llvm_asm_prefix && Name.slice(1, 0) == "lstat64" ||
           Name[0] == llvm_asm_prefix && Name.slice(1, 0) == "statvfs64" ||
           Name[0] == llvm_asm_prefix && Name.slice(1, 0) == "__isoc99_sscanf") {

does not look very nice.

The declaration of this should go in Mangler.h

Ok.

\01 means 'do not add the "USER LABEL PREFIX" for the target', names
definitely do need to be mangled for some targets even without this.
For example, elf systems can't handle symbols with spaces in them.

Thanks, with this description I was able to find

/* User label prefix in effect for this compilation. */
extern const char *user_label_prefix;

from llvm-gcc's output.h. Is llvm_asm_prefix still a good name for the
"\1" prefix?

Chris Lattner <clattner@apple.com> writes:

Changing this from being a magic number to a 'static const char' is
fine with me, please don't use a macro though.

I do not like macros either. However, how should I remove magic
numbers from

> } else if (Name == "\1stat64" ||
> Name == "\1lstat64" ||
> Name == "\1statvfs64" ||
> Name == "\1__isoc99_sscanf") {

without using a macro? Something like

> } else if (Name[0] == llvm_asm_prefix && Name.slice(1, 0) == "stat64" ||
> Name[0] == llvm_asm_prefix && Name.slice(1, 0) == "lstat64" ||
> Name[0] == llvm_asm_prefix && Name.slice(1, 0) == "statvfs64" ||
> Name[0] == llvm_asm_prefix && Name.slice(1, 0) == "__isoc99_sscanf") {

does not look very nice.

How about doing something like this above that code?

// Strip off asm prefix if present.
if (!Name.empty() && Name[0] == llvm_asm_prefix)
  Name = Name.substr(1);

I don't know if the empty check is necessary in context.

Thanks, with this description I was able to find

/* User label prefix in effect for this compilation. */
extern const char *user_label_prefix;

from llvm-gcc's output.h. Is llvm_asm_prefix still a good name for the
"\1" prefix?

How about llvm_disable_asm_prefix ?

-Chris