Avoiding duplications within an interpreter switch

Hi All,

I’m a happy clang user across several platforms for Smalltalk VM development.

One version of the VM is an interpreter that supports two bytecode sets. Its main dispatch loop combines dispatches for two bytecode sets, one offset by 256 from the other. So the range of cases in the switch is from 0 to 511. Some of the entries in the switch are common to both bytecode sets. These have (generated) C source looking like:

while (1) {
VM_LABEL(Dispatch);
switch (currentBytecode) {
CASE(0)
CASE(256)
/* pushReceiverVariableBytecode */
{
sqInt object;

VM_LABEL(pushReceiverVariable);
/* begin fetchNextBytecode */
currentBytecode = (byteAtPointer(++localIP)) + bytecodeSetSelector;
/* begin pushReceiverVariable: */
object = longAt(((longAt(localFP + FoxReceiver)) + BaseHeaderSize));
longAtPointerput((localSP -= BytesPerOop), object);
}
BREAK;

CASE(1)
CASE(257)
/* pushReceiverVariableBytecode */
{
sqInt object;

VM_LABEL(pushReceiverVariable1);
/* begin fetchNextBytecode */
currentBytecode = (byteAtPointer(++localIP)) + bytecodeSetSelector;
/* begin pushReceiverVariable: */
object = longAt(((longAt(localFP + FoxReceiver)) + BaseHeaderSize) + 8 /* (currentBytecode bitAnd: 15) << self shiftForWord */);
longAtPointerput((localSP -= BytesPerOop), object);
}
BREAK;
etc

CASE and BREAK are macros which allow for gcc’s first-class labels to be used to speed up dispatch. VM_LABEL is a macro which is used to insert a global label so that in a profiler one can see each bytecode separately, rather than have all of interpret as one big lump.
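For readers who have not seen this style of dispatch, here is a minimal sketch of how such macros can work. Everything in it is an assumption made for illustration: the macro bodies, the FETCH helper, the toy bytecodes and the interpretSketch function are hypothetical and are not the VM's actual definitions. It only shows the shape: two bytecode sets offset by 256, cases shared between them, and BREAK jumping through a table of label addresses when GNU C is available.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical macros; the real VM's definitions are not shown in this post. */
#if defined(__GNUC__)
/* Threaded dispatch: every CASE also defines a C label, and BREAK jumps
 * straight to the next bytecode's label through a table of label addresses. */
# define CASE(n) case n: L##n:
# define BREAK   do { assert(jumpTable[currentBytecode] != NULL); \
                      goto *jumpTable[currentBytecode]; } while (0)
#else
/* Portable fallback: a plain switch inside the loop. */
# define CASE(n) case n:
# define BREAK   break
#endif

/* Fetch the next byte and select the active bytecode set (0 or 256). */
#define FETCH() (currentBytecode = *++ip + bytecodeSetSelector)

/* Toy interpreter: returns the top of stack when it reaches a halt bytecode. */
long interpretSketch(const unsigned char *code, int bytecodeSetSelector)
{
#if defined(__GNUC__)
    /* Entries left NULL are bytecodes this toy does not define. */
    static void *jumpTable[512] = {
        [0] = &&L0, [256] = &&L256,
        [1] = &&L1, [257] = &&L257,
        [2] = &&L2, [258] = &&L258,
    };
#endif
    long stack[16];
    long *sp = stack;
    const unsigned char *ip = code;
    int currentBytecode = *ip + bytecodeSetSelector;

    while (1) {
        switch (currentBytecode) {
        CASE(0)
        CASE(256)               /* shared by both sets: push the constant 1 */
            *sp++ = 1;
            FETCH();
            BREAK;

        CASE(1)                 /* set 0 only: add the top two elements */
            sp[-2] += sp[-1];
            sp--;
            FETCH();
            BREAK;

        CASE(257)               /* set 1 only: double the top element */
            sp[-1] *= 2;
            FETCH();
            BREAK;

        CASE(2)
        CASE(258)               /* shared by both sets: halt */
            return sp[-1];

        default:                /* unknown bytecode on first dispatch */
            return -1;
        }
    }
}
```

Calling interpretSketch with the bytes {0, 0, 1, 2} and a selector of 0 runs push, push, add, halt in the first set; the bytes {0, 1, 1, 2} with a selector of 256 run push, double, double, halt in the second. The shared cases are written once in the source, which is exactly the property the optimizer is free to undo.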

Alas, clang on macOS (Apple LLVM version 10.0.0 (clang-1000.11.45.5)) with -Os is choosing to split the shared code in the switch: in the above it emits a case for 0 and a separate case for 256, a case for 1 and a separate case for 257, etc. This breaks the use of VM_LABEL, because the labels _LpushReceiverVariable, _LpushReceiverVariable1, et al. that the VM_LABEL macro inserts each get duplicated when the shared code in the case is duplicated. So if VM_LABEL is defined to insert a label, compilation fails with many duplicate-label errors.
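To make the failure mode concrete, and to show one avenue that may be worth trying, here is a small sketch. It is not the VM's macro and has not been verified against the VM or this clang version. The GCC manual notes that inline asm which defines labels can be duplicated by the optimizer, and suggests the "%=" format specifier, which expands to a number unique to each emitted copy of the asm statement. The macro name VM_LABEL_UNIQUE and the function below are hypothetical. The obvious cost is that the label name gains a numeric suffix and is no longer a single global symbol, so a profiler would see one symbol per copy.

```c
#include <assert.h>

/* A fixed-name label macro might look like this (an assumption; the VM's real
 * VM_LABEL is not shown in this post):
 *
 *   #define VM_LABEL(name) __asm__("\n.globl _L" #name "\n_L" #name ":")
 *
 * If the optimizer emits the enclosing code twice, the assembler then sees
 * _Lname defined twice and rejects the file.
 *
 * Hypothetical variant: extended asm (note the trailing colon) with "%=",
 * so each emitted copy of the statement gets its own numeric suffix. */
#define VM_LABEL_UNIQUE(name) __asm__ volatile("\n_L" #name "_%=:" : )

/* Toy routine standing in for one bytecode body. It may be inlined at
 * several call sites; each inlined copy receives a distinct label. */
long pushReceiverVariableSketch(long receiverField)
{
    VM_LABEL_UNIQUE(pushReceiverVariable);
    return receiverField + 1;
}
```

Whether a profiler copes with suffixed, non-global labels is a separate question; the sketch only shows that duplicated code need not produce duplicate label definitions.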

Is there a way to turn off the duplication? I’ve tried a few switches and looked at the voluminous -foptimization-record-file=optimizations output, but can’t find any clues. -fno-reroll-loops, for example, had no effect.

I note that, given the use of -Os, which asks the compiler to optimize for space as well as speed, duplicating code in the switch is exactly the opposite of what a developer wants from the flag. I know that keeping these cases together will give me better icache density and improved performance, etc.

Is there a way to generate assembly before attempting to generate code? This would give me the ability to remove the duplicated labels by editing the generated assembler. Currently, asking for -S generates nothing because the code is compiled to machine code before any assembler is generated. This is inconvenient for those of us used to being able to abuse the output of the compiler for nefarious means (which led to the invention of gcc’s first-class labels, but that’s an old and long story).

And thank you for an otherwise glorious compiler.