clang++: std::is_aggregate unusable with clang-5.0/libstdc++-7

Dear developers,

As of r307243, clang++ seems to have the __is_aggregate() builtin
macro, but __has_builtin(__is_aggregate) returns false. This is
rendering std::is_aggregate from libstdc++-7 unusable with clang++.

I have tested the code below with clang-5.0=5.0~svn307243-1 package from
apt.llvm.org on a debian unstable box.

#include <iostream>
#include <string>

int main(void){
  std::cout << "__has_builtin(__is_aggregate) = " << __has_builtin(__is_aggregate) << std::endl;
  std::cout << "__is_aggregate(int[42]) = " << __is_aggregate(int[42]) << std::endl;
  std::cout << "__is_aggregate(std::string) = " << __is_aggregate(std::string) << std::endl;
  return 0;
}

result:
__has_builtin(__is_aggregate) = 0
__is_aggregate(int[42]) = 1
__is_aggregate(std::string) = 0

Is this a regression? Also, could you file a bug at bugs.llvm.org and
add me as a cc: tstellar@gmail.com?

Thanks,
Tom

__has_builtin detects builtin functions. __is_aggregate is not a builtin
function -- it's not a function at all, since it takes a type, not a value.
But if libstdc++7 is assuming that __has_builtin can be used to detect type
trait keywords, perhaps we should make it so; it's not unreasonable to
expect it to be usable for this purpose.

Would you be interested in providing a patch to Clang to implement this?

I see. Sorry to have misunderstood.

I don't know much about Clang's codebase, so I think it's better to send
a patch to libstdc++. But how can I detect whether the __is_aggregate
keyword is usable? I've tried an #ifdef directive, but with no luck.

I think !__is_identifier(__is_aggregate) is the only way we have to detect
this right now.
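
As a rough illustration of that suggestion, here is a minimal sketch of how a header might probe for the keyword. The macro SKETCH_HAS_IS_AGGREGATE and the helper is_aggregate_sketch are hypothetical names made up for this example, and the array-only fallback is just a conservative stand-in for compilers where the check cannot be made. It is not what libstdc++ actually does.

```cpp
#include <string>
#include <type_traits>

// __is_identifier is a Clang extension. Other compilers do not define it,
// so guard its use and fall back to "keyword not detected".
#if defined(__is_identifier)
#  if !__is_identifier(__is_aggregate)
#    define SKETCH_HAS_IS_AGGREGATE 1  // __is_aggregate is a reserved keyword
#  endif
#endif
#ifndef SKETCH_HAS_IS_AGGREGATE
#  define SKETCH_HAS_IS_AGGREGATE 0
#endif

// Hypothetical helper: uses the keyword when it was detected; otherwise it
// only recognises arrays, which is a deliberately conservative stand-in.
template <typename T>
constexpr bool is_aggregate_sketch() {
#if SKETCH_HAS_IS_AGGREGATE
  return __is_aggregate(T);
#else
  return std::is_array<T>::value;
#endif
}

static_assert(is_aggregate_sketch<int[42]>(), "arrays are aggregates");
static_assert(!is_aggregate_sketch<std::string>(),
              "std::string is not an aggregate");
```

The point of the guard is that the inner #if is never evaluated on compilers without __is_identifier, so the file still compiles there.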

I’ve passed this information to Jonathan Wakely so that he can correct it.

/Eric

Dear friendly Clang-World,

I am currently experimenting with a kind of debugger. I have one process which starts another process and gains access to its memory. Like a debugger, the first process should now read the value of a global variable directly from the other process, but I don’t know its address. So I asked the clang-cl compiler to generate a DWARF debug file for me. In the end, however, I only have DWARF files for the .obj files, while the .exe has a PDB file. Using the llvm pdb-dump didn’t help me find the address, because the application crashes while dumping. The single DWARF file for my object file didn’t help either. So I wanted to ask:

1.) Is there a way to obtain the address of the global variable from any debug format under Windows using clang-cl?
2.) Can I use a single DWARF file together with its .obj file to obtain an address? This would be interesting for JITing this .obj file.

Kind regards
Björn Gaier
Registered as a GmbH in the commercial register of Bad Homburg v.d.H., HRB 9816, VAT ID no. DE 114 165 789
Managing directors: Hiroshi Kawamura, Dr Hiroshi Nakamura, Markus Bode, Heiko Lampert, Takashi Nagano, Takeshi Fukushima.

  1. With a PDB, you can use the DbgHelp library on Windows to enumerate the executable’s symbols and their relative addresses within the executable (see http://bit.ly/2vD5aG1).
  2. No. Assembling object files into an executable is not a one-to-one thing. Functions are moved around, duplicates and unused code are deleted, and so forth. Since the linker doesn’t store this mapping (except in the form of the final debug data for the executable), mapping the individual objects’ debug data onto the executable is generally neither feasible nor recommended.
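
To make the "relative address" idea in point 1 concrete, here is a small sketch of the arithmetic involved. It assumes you already have a symbol's address together with the module base it was reported against (DbgHelp's SYMBOL_INFO carries both, as Address and ModBase), plus the base at which the module is really loaded in the target process. The helper names and the numbers in the comments are made up for illustration.

```cpp
#include <cstdint>

// Offset of a symbol from the start of its module (relative address).
// symbol_address and reported_base come from the same symbol lookup.
constexpr std::uint64_t relative_address(std::uint64_t symbol_address,
                                         std::uint64_t reported_base) {
  return symbol_address - reported_base;
}

// Address of the same symbol inside the running target process, given the
// base at which the module was actually loaded there.
constexpr std::uint64_t address_in_target(std::uint64_t loaded_base,
                                          std::uint64_t rva) {
  return loaded_base + rva;
}

// Illustrative numbers only: a global at 0x140001234 in a module reported
// at base 0x140000000 has the relative address 0x1234.
static_assert(relative_address(0x140001234ULL, 0x140000000ULL) == 0x1234ULL,
              "relative address is symbol minus base");
```

The resulting address is what the reading process would then pass to whatever mechanism it uses to read the other process's memory.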

Hello friendly Clang-World,

I was experimenting with Clang and the JIT capabilities of LLVM. Most of my attempts were successful, but I still fail miserably at exceptions. While researching I found the function “registerEHFrames()”, which should help me support exceptions, but sadly the documentation I found wasn’t helpful.
I looked into the “notifyObjectLoaded” function and discovered that some symbol names start with “$”; I expected them to be connected to my try and catch blocks. But what now? As usual, at this point I have their names but can’t get their addresses to register them with the “registerEHFrames()” function. Also, the JIT still wants an address for “??_7type_info@@6B@”, which is the virtual table of the type_info struct.

Confusing! So friendly Clang-World, could you please help?

Not so important, but does the dragon which decorates Clang and LLVM have a name?

Kind regards
Björn

Hello friendly LLVM-World,

Because I don’t know whether I sent my problem to the correct mailing list, I am sending it to this address too. I’m not subscribed to this list, so please add me in CC if you respond.

Kind regards
Björn

Hi Björn

To first answer your question in the subject: For x86 registerEHFrames() is only a stub. For x86_64 registerEHFrames() is implemented properly in RuntimeDyldCOFFX86_64, calling MemMgr.registerEHFrames() for each EH frame section. It should be called and work out of the box without your involvement, but unfortunately it won’t solve your issue. All the essential information is there in the comments, just check the base classes.

This thread from last year helps with your unresolved symbol:

Back then I tried to solve a related issue: SEH exceptions thrown from code loaded with RuntimeDyld had to be caught in statically compiled code. It turned out Windows explicitly prohibits this. I got in touch with Microsoft people and IIRC it’s due to security concerns.

Depending on your specific case, you may want to fall back to

If you are willing to do research, compare implementations and behavior with the MachO and ELF versions. At least one of them works, just not on Windows :wink:

Also check the LLILC project: I heard about some solution that uses trampolines to push exceptions back to dynamically loaded code and handle them there.

And disclaimer: I did not follow recent developments in this area. If there’s news please let me know!

Cheers & Good Luck!
Stefan

Hello Stefan,

I’m happy someone replied to my problem! Many thanks! To be honest… I didn’t understand much of your mail. I’m a beginner with the JIT, so I will explain what I’ve done.

To manage the memory and resolve symbols, I’m using my own resolver class, which overrides the allocation and findSymbol functions. I noticed today that the “registerEHFrames” function of my class gets called automatically, with correct values. I’m remapping my code and the addresses are still correct. Great! But what should I do with them? I pass the values to the original function, but my exception won’t be caught! It’s an exception raised inside the JITed code and it should also be caught there.

I tried loading “msvcrt.lib” as an archive. That was… a bad idea! I get an exception while loading:
Assertion failed: ((int64_t)Result <= INT32_MAX) && “Relocation overflow”, file \lib\executionengine\runtimedyld\Targets/RuntimeDyldCOFFX86_64.h, line 81

Research didn’t help me! My code was compiled with /MD, but that didn’t change anything. So I’m still stupid D:
The JITed code must be loaded into shared memory later, where there aren’t any libraries, so even if this worked, it wouldn’t help me. I tried compiling my code with SjLj exceptions. That didn’t work either…

Is there no hope left?

Kind regards
Björn

The “Relocation overflow” assertion you get when loading “msvcrt.lib” is a limitation of the COFF/PE format and unrelated to exceptions. This patch explains it and shows a workaround:

Well, at least I am not aware of a solution.

Hi Bjoern,

I’m trying to make exceptions run. I have an object file with a function throwing a 1 and a second function which should catch the 1. Normal JITing under Windows showed me that I have an unresolved reference to the virtual table of type_info. Some experiments later I was able to load “msvcrt.lib” as an archive and could resolve the reference. Nice, but then “??_Etype_info@@UEAPEAXI@Z” was missing too. Ufff! I decided to ignore this, because when I try to load every .lib and .obj provided by Visual Studio, I get an assertion failure with “Relocation type not implemented yet!”.

RuntimeDyldCOFF is missing a lot of relocation support. We need a COFF expert to fix that and unfortunately I’m not one.

I decided to have a look at “_CxxThrowException”. I inserted my own function for the JIT and had a look at the parameters. I got two of them. The first was the address of the exception object, which was correct. The second is the address of the “_ThrowInfo”. This address was valid too, but all its members, except for attributes, are null. So I can’t throw this exception. I tried to pass the address of typeid(1) to it, or to modify the call. Nothing helped.

I don’t know Windows exception handling well, but if it’s anything like DWARF EH then I’d be inclined to blame the missing relocations/fixups: the _ThrowInfo struct (or whatever data source ultimately populates it) probably isn’t being fixed up.

I have no clue and no idea anymore. So… Do you have an idea?

I’m afraid I don’t personally. We need some Windows linker / system experts to take an interest in the JIT.

Back then I tried to solve a related issue: SEH exceptions thrown from code loaded with RuntimeDyld had to be caught in statically compiled code. It turned out Windows explicitly prohibits this. I got in touch with Microsoft people and IIRC it’s due to security concerns.

Stefan – That’s an interesting restriction. :confused:
Does it prohibit exceptions thrown in JIT’d code from being caught also in JIT’d code, or does it only apply if an exception crosses the boundary back into statically compiled code?
Do you know how they were enforcing that?

Cheers,
Lang.

Hi, I checked last year’s mails. Back then Timur and I faced the same issue Björn reports (nulled memory in _ThrowInfo).

Igor Minin explained this was due to Data Execution Prevention (DEP) and gave pointers to the involved functions in the OS runtime:

The whole thing gets hairy quickly. It seems like a combination of multiple problems. So let’s first collect evidence and clarify the goal.

Joseph Tremoulet from Microsoft partially confirmed Igor’s statements:

In theory both JITed and static catch handlers may be possible here. My next guess was an incompatibility between Clang (which we used for JITing) and MSVC (which we used for static compilation). Joseph:

I'm catching up on this. Does this mean LLVM x64 JITTed code is not
exception friendly, or that you can't catch exceptions inside LLVM
JITTed code? The first one seems to indicate that the code is not ABI
friendly or that not enough information is present to notify Windows
of unwind tables.

I'll ask the question another way: Does LLVM emit enough information
so that RtlAddFunctionTable can work? Has anyone successfully done this
in the wild?

In partial answer to your question, I can state that JuliaLang has been successfully registering and handling exceptions in JIT code since LLVM 3.3, when I implemented support (with varying levels of hackiness over time to support all of the JIT and other changes to LLVM since then). However note that to hook it all up we use the ELF format and have our own exception format + personality routine and a custom memory manager.

That's encouraging.

Assuming that all access to the JITted code is going to be done
through a function pointer, and all JIT code is ephemeral, why is the
object container format important? In fact, why is it even needed? I
found it somewhat odd that MCJIT generates an object file for even the
JIT case.

To answer my own question, could it be that advanced JITs may
need/want to use things available inside the object code format?

So I guess my real question is what are the practical limitations of
using the ELF object code format to generate JIT code when all
observable access to the code will be through its entry point.

P.S. - I'm also assuming that the ELF object container format does not
drive the code-generation triple, i.e. we generate Windows ABI
compatible code even when using RuntimeDyldELF.

I read through Julia code that Jameson has written and it seems like
the unwind info is hand generated i.e. it is not coming from LLVM
directly because the generated prolog/epilog is the same always? I'm
confused.

So then I read through this bug report:
https://bugs.llvm.org/show_bug.cgi?id=24233

Alexey Zasenko mentions that relocation support is incomplete. Then
Andy Ayers from Microsoft says that the PE file format, and therefore
the corresponding RUNTIME_FUNCTION data structure only supports
code-loaded within a 4GB extent. He doesn't mention it explicitly but
it would follow that Microsoft compilers do not generate images that
are > 4GB in size because Windows cannot load them. It would then not
be a stretch that Microsoft's JIT compilers are also not capable of
producing code more than 4GB apart. Would it also be safe to say that
they can't generate more than 4GB of code?
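
To make the 4GB point concrete for myself, here is a minimal sketch. The struct mirrors the layout of the Windows x64 RUNTIME_FUNCTION entry (three 32-bit offsets from a common base), but the type and helper names are my own and the addresses in the checks are invented.

```cpp
#include <cstdint>

// Same shape as a Windows x64 RUNTIME_FUNCTION entry: every field is a
// 32-bit offset from one base address, not a full 64-bit pointer.
struct RuntimeFunctionSketch {
  std::uint32_t begin_rva;   // start of the function
  std::uint32_t end_rva;     // end of the function
  std::uint32_t unwind_rva;  // location of the unwind data (XDATA)
};

// True if addr can be expressed as a 32-bit offset from base.
constexpr bool fits_rva(std::uint64_t base, std::uint64_t addr) {
  return addr >= base && addr - base <= 0xFFFFFFFFULL;
}

// True if a function and its unwind data can share one table entry.
constexpr bool can_describe(std::uint64_t base, std::uint64_t fn_begin,
                            std::uint64_t fn_end, std::uint64_t unwind) {
  return fits_rva(base, fn_begin) && fits_rva(base, fn_end) &&
         fits_rva(base, unwind);
}

// Builds the entry; only meaningful when can_describe() returned true.
constexpr RuntimeFunctionSketch make_entry(std::uint64_t base,
                                           std::uint64_t fn_begin,
                                           std::uint64_t fn_end,
                                           std::uint64_t unwind) {
  return RuntimeFunctionSketch{static_cast<std::uint32_t>(fn_begin - base),
                               static_cast<std::uint32_t>(fn_end - base),
                               static_cast<std::uint32_t>(unwind - base)};
}

static_assert(can_describe(0x10000000ULL, 0x10001000ULL, 0x10001040ULL,
                           0x10002000ULL),
              "code and unwind data close together can be described");
static_assert(!can_describe(0x10000000ULL, 0x10001000ULL, 0x10001040ULL,
                            0x10000000ULL + 0x100000010ULL),
              "unwind data more than 4GB from the base cannot");
```

So as long as the code and its unwind data sit within 4GB of whatever base is registered, the entry can be formed; otherwise the offsets simply do not fit.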

From the bug report Andy's next set of comments are around how to work
around this problem. It seems they (Microsoft JIT compiler) don't need
to care because they don't ever generate code of such volumes or that
far apart?

But then Stefan Gränitz suggests a solution that somehow accommodates
this > 4GB situation. It would seem that this is accomplished by
emitting a relocation of type: IMAGE_REL_AMD64_ADDR64

What is curious is that according to

Microsoft compilers rarely generate this type of relocation but they
can do it.

After reading all of this, I still don't have a clear picture, so I'm
writing a summary here and maybe somebody can refute what I'm saying
or point in the right direction

Options to make progress for at least some JIT users:

(1) Use RuntimeDyldELF and borrow code from JuliaLang/Julia where
Jameson has seemingly figured out what the UNWIND_INFO is for some set
of prolog/epilog -- unclear if this can break or not.
(2) Use small code model and have the application embedding the jit
ask the OS for some committed space range. This hopefully makes it so
that everything is within 32-bits.
(3) Figure out why IMAGE_REL_AMD64_ADDR32NB is ever emitted and why
Microsoft's JIT Compiler doesn't seem to be needing it. Alexey Zasenko
has a patch, does that need to be upstreamed?
(4) Stefan Gränitz has a patch that can solve the 64-bit problem a
different way.

I'd also like to know whether my reasoning for (5) is sound. I can't
think of a reason why it wouldn't work:

(5) Generate code function-by-function (i.e. 1 function per LLVM
module) and each time ask the memory manager to return a memory chunk
so that XDATA is within 32-bit of the function.

I suppose what is puzzling to me most is why (5) is not enough. I mean,
assuming you're writing a JIT compiler with lazy compilation, how is it
that you'll ever generate code whose XDATA is > 4GB away from the
code? And if you have direct references to full 64-bit addresses, i.e.
you're calling some previously jit compiled function or referencing some
data structure ... they don't need to be relocated at all. Or is this
where I'm wrong, and in fact the direct references do need to be
relocated? But that wouldn't make sense, because how does anyone else
know what to relocate them to?

Appreciate any help or direction!

Thanks, H.

Hi Hayden, on a few of your points:

But then Stefan Gränitz suggests a solution that somehow accommodates
this > 4GB situation. It would seem that this is accomplished by
emitting a relocation of type: IMAGE_REL_AMD64_ADDR64

There are IMAGE_REL_AMD64_REL32 relocations in msvcrt.lib - they cause
relocation overflows whenever it's loaded with RuntimeDyld.
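
For illustration, a minimal sketch of the range check behind that overflow. It assumes the usual REL32 rule that the stored value is the distance from the end of the 4-byte fixup field to the target, plus the addend. The function names are invented here, and this is a simplification rather than the RuntimeDyld code itself.

```cpp
#include <cstdint>

// Value that would be written for a REL32 fixup: distance from the end of
// the 4-byte field at fixup_address to the target, plus the addend.
constexpr std::int64_t rel32_value(std::uint64_t target,
                                   std::uint64_t fixup_address,
                                   std::int64_t addend) {
  return static_cast<std::int64_t>(target - (fixup_address + 4)) + addend;
}

// The value has to fit in a signed 32-bit field, otherwise it overflows.
constexpr bool fits_int32(std::int64_t value) {
  return value >= INT32_MIN && value <= INT32_MAX;
}

// Invented addresses: a nearby target fits, a target that was mapped far
// away from the JITed code (as a system DLL typically is) does not.
static_assert(fits_int32(rel32_value(0x20000ULL, 0x10000ULL, 0)),
              "near target fits");
static_assert(!fits_int32(rel32_value(0x7FF000000000ULL, 0x10000ULL, 0)),
              "far target overflows");
```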

(4) Stefan Gränitz has a patch that can solve the 64-bit problem a
different way.

It forwards to stub functions within 32-Bit distance, which emit
IMAGE_REL_AMD64_ADDR64 relocations themselves.
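
To show the general shape of such a stub, here is a sketch that builds the 12 bytes of an x86-64 absolute jump (movabs rax, imm64 followed by jmp rax). This is only the generic pattern, not the code from the patch. The type and function names are made up, and the stub clobbers rax.

```cpp
#include <cstdint>

// 12-byte x86-64 stub: movabs rax, <target> ; jmp rax
struct JumpStub {
  std::uint8_t bytes[12];
};

// Extracts byte number index (0 = least significant) of value.
constexpr std::uint8_t byte_of(std::uint64_t value, int index) {
  return static_cast<std::uint8_t>((value >> (8 * index)) & 0xFF);
}

// The full 64-bit target is embedded little-endian inside the stub, so the
// caller only needs a 32-bit relative branch to reach the stub itself.
constexpr JumpStub make_jump_stub(std::uint64_t target) {
  return JumpStub{{0x48, 0xB8,  // movabs rax, imm64
                   byte_of(target, 0), byte_of(target, 1),
                   byte_of(target, 2), byte_of(target, 3),
                   byte_of(target, 4), byte_of(target, 5),
                   byte_of(target, 6), byte_of(target, 7),
                   0xFF, 0xE0}};  // jmp rax
}

static_assert(make_jump_stub(0x1122334455667788ULL).bytes[2] == 0x88,
              "lowest byte of the target comes first");
```

The stub has to be placed in executable memory within 32-bit reach of the code that branches to it; the 64-bit address inside it is the part that corresponds to an absolute 64-bit fixup.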

Does this mean LLVM x64 JITTed code is not exception friendly or you can't catch exceptions inside LLVM JITTed code.

Double-checked on basis of LLI-5.0 on the weekend: on WINDOWS x64 you
can't catch exceptions from JITed code at all.