Using '__attribute__((section("name")))' for inline assembly injection

I recently examined a bug in a program, and it turned out that the customer was using the section attribute as a form of inline-assembly mechanism, with something like:

__attribute__((section("sectionName\nasm\nasm\nasm")))

This was really ugly, and it was not at all obvious where the problem originated. Is there any way of getting LLVM or Clang to validate the name used in the section attribute? The GCC definition says that only alphanumeric characters are permitted, but I know that it is also very common for people to use ‘_’ and ‘.’ in section names, so alphanumeric checking is probably a bit over-zealous. Embedded newlines, however, should definitely be considered suspicious. Or perhaps there is an existing check in ‘clang-tidy’?

The reason they were doing this in the first place is that the ‘naked’ attribute does not do what they want, which is to let them build custom prologue and epilogue code for a function.

Thanks,

MartinO

With clang, this does exactly what is requested. Note that GNU as will
give you a warning when using -save-temps or -S, but that's a separate
question.

Joerg

People also try to use __attribute__((section(".mysection,\"rwx\",@progbits"))) to get executable code sections for toy experiments in self-modifying code, but we don’t support that either. Basically, we take the section name and always quote it, unlike GCC, and I don’t think we want to change that. Normally LLVM doesn’t round-trip through assembly, so it would be infeasible to attempt to interpret strings in section names in all the ways that GCC allows through to gas.

My advice for users in this situation would be to try the naked attribute again, since we’ve fixed several bugs around it in the last two years. If that doesn’t work, adding a standalone assembly file to the project has always been and will continue to be the most reliable way to implement a function with a custom prologue.
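
For illustration, a naked function with a hand-written body might look like the sketch below. It assumes an x86-64 target and a compiler that accepts the naked attribute there; the function name `answer` is made up for this example, and the `#else` branch exists only so the sketch still builds on other targets.

```c
#if defined(__x86_64__)
/* Assumption: x86-64 target. The compiler emits no prologue or epilogue
   for a naked function, so the body is pure assembly and the return
   instruction is written by hand. */
__attribute__((naked)) int answer(void)
{
    __asm__("movl $42, %eax\n\t"
            "ret");
}
#else
/* Fallback for other targets: an ordinary C function. */
int answer(void)
{
    return 42;
}
#endif
```

Anything more elaborate than this (touching the stack, calling other functions) is probably better kept in a standalone assembly file, as suggested above.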

Would it be useful for Clang to warn about section names with unusual characters?

-Hal

It’s tricky to define ‘unusual’ though. I’d prefer warnings for newline characters only, to be safe.

We have customers embedding the file name in section names… on Windows this gets you lovely names like “.bla.D:\Some\Path\xyz.cpp”, which are unusual but perfectly fine.

Tobias

I don't think it is common enough and you can always check the output
easily with readelf/objdump.

Joerg

I would think that it's very uncommon, however, it is also terribly
difficult to detect, and I'd argue that it's unlikely that someone actually
wishes to have newlines or other "non-printable"/"control" characters in
the section name. And if it's a warning that is enabled by default but
possible to turn off with "-Wno-weird-sectionname", I'd say it would be
little harm - and not a huge maintenance burden.
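
A minimal sketch of the kind of check being discussed, written as a standalone C helper. This is not Clang code; the function name is hypothetical, and the choice to flag every control character (rather than newlines only) is an assumption made for the example. The length is passed explicitly so that an embedded NUL is inspected instead of ending the scan.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of Clang: returns true when the section
   name contains a control character such as newline, tab or NUL. */
bool section_name_is_suspicious(const char *name, size_t len)
{
    for (size_t i = 0; i < len; ++i) {
        /* Cast first: passing a negative char to iscntrl is undefined. */
        if (iscntrl((unsigned char)name[i]))
            return true;
    }
    return false;
}
```

With this rule, a name with an embedded newline is flagged, while a name containing a Windows path such as ".bla.D:\Some\Path\xyz.cpp" passes, since backslashes, colons and dots are ordinary printable characters.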

Knowing to check that the section names contain weirdness or otherwise
debug "why the heck doesn't this code work in Clang, when it compiles
without warning, and is fine when compiled in gcc or whatever" is really
not at all easy.

I do realise that "every warning and such is a maintenance burden", but
although I have not encountered this problem, I'd definitely prefer a
warning to having to figure out what went wrong...

My own preference is for a warning; that way the issue is visible and the programmer can choose to ignore it, disable it or make it an error as preferred.

I tried it with embedded NUL characters, and this turned out not to be a good idea. The example:

__attribute__((section("grok\nfubar\0poisoned\tsnafu")))
int foo() { return 42; }

With clang v3.8.0 for X86, running ‘clang -Wall -S section.c’, I get the following output and no warnings:

    .text
    .def foo;
    .scl 2;
    .type 32;
    .endef
    .section grok
fubar,"xr"
    .globl foo
    .align 16, 0x90
foo: # @foo
.Ltmp0:
    .seh_proc foo
# BB#0: # %entry
.Ltmp1:
    .seh_endprologue
    movl $42, %eax
    retq
    .seh_handlerdata
    .section grok
fubar,"xr"
.Ltmp2:
    .seh_endproc
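
Note that the emitted section name stops at "fubar": everything after the embedded NUL is missing. That is consistent with the name being handled as a NUL-terminated string somewhere along the way, although this is an assumption and has not been checked against the Clang source. The sketch below only shows the plain C side of the effect; the helper names are made up for the example.

```c
#include <stddef.h>
#include <string.h>

/* The section name from the example, kept as an array so that sizeof
   sees every byte of the literal. */
static const char section_name[] = "grok\nfubar\0poisoned\tsnafu";

/* Bytes written in the source, excluding the implicit terminator. */
size_t written_length(void)
{
    return sizeof section_name - 1;
}

/* Bytes seen by anything that treats the name as a C string. */
size_t c_string_length(void)
{
    return strlen(section_name);
}
```

Here written_length() is 25 and c_string_length() is 10, the latter covering exactly "grok", the newline and "fubar".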

Since we do not have an integrated assembler, we have to emit assembly and then invoke the assembler separately, and such constructs fail at that next stage.

I don’t know, perhaps when the compiler is writing object code directly it can somehow make the whole string meaningful as a section name, though I’m not sure how I could write an LD script for such sections.

I don’t advocate the programmer doing this kind of trick to force assembly code into the emitted assembly code; it seems very dangerous. The ‘__attribute__((naked))’ was changed a few versions ago so that a function defined with this attribute could not contain C source code. I haven’t experimented with it since then.

Thanks,

MartinO