Hi all,

We have multiple DataLayout objects in flight during a compilation: at least the one owned by the Module and the one owned by the TargetMachine.
There are two issues:
1) What if they differ? I guess we could assert at the beginning of CodeGen.
2) The DataLayout has internal mutable state (a cache of StructLayout).

The latter is my current concern: the cache in DataLayout is keyed on Type pointers, which means that it has to be Context-specific.
This is fine with the DataLayout attached to the Module but not with the TargetMachine.
The cache could live in the Context, but it is also DataLayout specific so it wouldn’t be a good fit either.
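
To make the problem concrete, here is a minimal sketch using toy stand-ins. ToyContext, ToyType and ToyDataLayout are hypothetical names for illustration only, not the real LLVM classes. It assumes, as in LLVM, that each context owns its own type objects, so the "same" struct in two contexts has two different addresses:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>
#include <string>

// Toy stand-in for a type: identity is the object's address.
struct ToyType {
  std::string Name;
  uint64_t SizeInBytes;
};

// Toy stand-in for a context: it owns and uniques its types by name.
class ToyContext {
  std::map<std::string, std::unique_ptr<ToyType>> Types;

public:
  const ToyType *getStructType(const std::string &Name, uint64_t Size) {
    std::unique_ptr<ToyType> &Slot = Types[Name];
    if (!Slot)
      Slot.reset(new ToyType{Name, Size});
    return Slot.get();
  }
};

// Toy stand-in for a DataLayout with a mutable cache keyed on type pointers.
class ToyDataLayout {
  mutable std::map<const ToyType *, uint64_t> LayoutCache;

public:
  uint64_t getStructSize(const ToyType *Ty) const {
    auto It = LayoutCache.find(Ty);
    if (It != LayoutCache.end())
      return It->second;
    // "Compute" the layout and remember it under the pointer key.
    return LayoutCache[Ty] = Ty->SizeInBytes;
  }

  std::size_t cachedEntries() const { return LayoutCache.size(); }
};
```

Querying one ToyDataLayout with the struct "S" from two different ToyContext objects leaves two cache entries, one per pointer. If one of the contexts is then destroyed, its entry stays behind as a dangling key, which is the situation a DataLayout shared through the TargetMachine would be in.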

I considered modifying the cache in the DataLayout to invalidate itself when queried with a new Context; however, that wouldn’t work in a multi-threaded environment (or the query would have to be behind a mutex).
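
A rough sketch of what that self-invalidating variant might look like, again with toy stand-ins (ToyContext, ToyType and SelfInvalidatingLayoutCache are hypothetical names, not LLVM API). Every query takes the lock, and queries that alternate between two contexts flush the cache each time:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>

struct ToyContext {}; // Only its address matters here.

struct ToyType {
  const ToyContext *Ctx; // The context that owns this type.
  uint64_t SizeInBytes;
};

class SelfInvalidatingLayoutCache {
  mutable std::mutex Lock;
  mutable const ToyContext *CurrentCtx = nullptr;
  mutable std::map<const ToyType *, uint64_t> Cache;
  mutable std::size_t Flushes = 0;

public:
  uint64_t getStructSize(const ToyType &Ty) const {
    // Every query, even a cache hit, is serialized behind the mutex.
    std::lock_guard<std::mutex> Guard(Lock);
    if (Ty.Ctx != CurrentCtx) {
      // A different context showed up: drop everything cached so far.
      if (CurrentCtx)
        ++Flushes;
      Cache.clear();
      CurrentCtx = Ty.Ctx;
    }
    auto It = Cache.find(&Ty);
    if (It == Cache.end())
      It = Cache.emplace(&Ty, Ty.SizeInBytes).first;
    return It->second;
  }

  std::size_t flushCount() const {
    std::lock_guard<std::mutex> Guard(Lock);
    return Flushes;
  }
};
```

With two threads compiling in two different contexts, the cache in this sketch would be flushed on nearly every switch between them, so it would mostly add lock contention without caching much.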

Another option is to remove the DataLayout from the TargetMachine. I tried it and it seems possible, but it is awkward in some places where the DataLayout has to be supplied externally because TargetLoweringXX would not have it.

Finally, the TargetMachine could be deemed to be “context-specific”, i.e. the CodeGen infrastructure would need to be reinitialized for each new Context. That would be a strong limitation though.

There may be other options?

Long term I think we should keep only the one in the module.

Agreed. I’ll reply to the rest of it soon, but if we have a required one in the module then that’s what we should use.


There are actually three of them. The third DataLayout is allocated in CodeGeneratorImpl::Initialize() and used in CodeGenModule::getDataLayout(). It may be possible to simply redirect it to the module DataLayout as well, though I have not tested it. The duplication wastes CPU cycles, as StructLayouts are recomputed in each DataLayout for its different clients.
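
As a rough illustration of the duplicated work, here is a toy model of a struct layout computation with a per-object cache. ToyField, ToyStructLayout, computeLayout and ToyLayoutCache are hypothetical names; the layout rule (align each field, then pad the total to the largest alignment) is a simplification and assumes power-of-two alignments:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

struct ToyField {
  uint64_t Size;
  uint64_t Align; // Assumed to be a non-zero power of two.
};
using ToyStruct = std::vector<ToyField>;

struct ToyStructLayout {
  std::vector<uint64_t> Offsets;
  uint64_t Size = 0;
};

// Round Offset up to the next multiple of Align.
inline uint64_t alignTo(uint64_t Offset, uint64_t Align) {
  return (Offset + Align - 1) / Align * Align;
}

inline ToyStructLayout computeLayout(const ToyStruct &Fields) {
  ToyStructLayout L;
  uint64_t Offset = 0, MaxAlign = 1;
  for (const ToyField &F : Fields) {
    Offset = alignTo(Offset, F.Align);
    L.Offsets.push_back(Offset);
    Offset += F.Size;
    MaxAlign = std::max(MaxAlign, F.Align);
  }
  L.Size = alignTo(Offset, MaxAlign); // Tail padding.
  return L;
}

// One cache per "DataLayout" object; it counts how often it had to compute.
class ToyLayoutCache {
  std::map<const ToyStruct *, ToyStructLayout> Cache;
  std::size_t Computations = 0;

public:
  const ToyStructLayout &get(const ToyStruct &S) {
    auto It = Cache.find(&S);
    if (It == Cache.end()) {
      ++Computations;
      It = Cache.emplace(&S, computeLayout(S)).first;
    }
    return It->second;
  }

  std::size_t computations() const { return Computations; }
};
```

With three separate ToyLayoutCache objects (standing in for the Module, TargetMachine and frontend DataLayouts), the same struct is laid out three times. With a single shared cache it is laid out once, however many clients query it.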

In the meantime, I put together a crude “quick” patch that removes access to the DataLayout owned by the TM in favor of the one held by the Module: