MachineSink and EFLAGS

Hello.

I am not sure this is the right place to ask but here is my question. About a year ago there was a fix of some obscure bug (rdar://problem/8030636 which is located on the internal Apple bugtracker I believe and so not available to the general public J)

Some discussion can be found here: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20100531/102160.html. Unfortunately, no real testcase is provided, just abstract scenario description.

Basically a EFLAGS-clobbering instruction is not sunk if EFLAGS might be live out of the current block and since a conditional branch doesn’t mark its EFLAGS use as a kill, that effectively means it is never sunk. But how can that happen? I think the only way for it is when EFLAGS def and use are located in different blocks but that is impossible with current instruction selector. Or am I wrong?

A little later another fix has been made (http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20100614/102554.html) which restricts the problem only to cases when the blocks are produced by lowering a SELECT instruction. It is even more confusing. How is it different from the other scenario wrt. EFLAGS use? Moreover, pseudo cmov is marked as clobbering EFLAGS so how can it be used further in the code?

Are there plans to implement a less conservative fix? I believe proper phys regs liveness information is required for that.

Thanks,

Sergey

Hi Sergey,

I was the one who created the patch. Unfortunately, there wasn't a testcase that I could use that was small and wouldn't require modification after every compiler change.

Looking at the bug report, I've attached the testcase below. You can compile it like this:

$ llvm-gcc -Os -fmessage-length=0 -pipe -std=gnu99 -fpascal-strings -fasm-blocks -pedantic -msse3 test.c -o test.s -S

See the attached test.s for comments on what is wrong with the output.

You're correct that proper physical register liveness information is required for a less conservative fix. However, I don't know if that will ever happen. If it does, then we can make this pass more liberal.

-bw

test.c (2.48 KB)

test.s (8.43 KB)

Hi Bill.

Thank you very much! Now I see my understanding was incorrect :slight_smile: A dependence from a single physreg-defining instruction (like CMP or TEST) is allowed to be shared in several instructions unless that register is not clobbered (and this is what we have with CMOV_FR64). Wouldn't it be safe then to not set the live-in flag in EmitLoweredSelect for instructions which are marked as defining EFLAGS (like the integer pseudo cmovs)?

Thanks,
Sergey

Hi Sergey,

I'm not certain what that would buy us. If I understand your suggestion correctly, not setting the EFLAGS as live-in to a basic block would mean that we'd have to execute the instruction to reset the EFLAGS register, right? That would mean more code, longer live ranges, etc.

-bw

To clarify, a physical register may be defined in one basic block and used in another. In that case, it must be marked as live-in in all basic blocks entered on any path between the def and the use.

This is effectively manual register allocation, and it should be avoided. It is necessary sometimes, though. Like in select lowering.

The machine code verifier has good checks for this. If it complains about missing live-in registers, miscompilations are possible.

/jakob

Hi, Bill and Jakob.

I don't quite understand. I am talking about CMOV_GR* instructions which are conservatively marked as clobbering EFLAGS in X86InstrCompiler.td. Doesn't that mean there cannot be any use of EFLAGS in subsequent instructions before it is defined by some other instruction?

I also don't understand the remark about resetting EFLAGS. What kind of reset is meant? In case of normal branch instructions there is no specific reset instruction and EFLAGS is not marked live-in in successor blocks. How is that different from our case?

Hi, Bill and Jakob.

I don't quite understand. I am talking about CMOV_GR* instructions which are conservatively marked as clobbering EFLAGS in X86InstrCompiler.td. Doesn't that mean there cannot be any use of EFLAGS in subsequent instructions before it is defined by some other instruction?

I also don't understand the remark about resetting EFLAGS. What kind of reset is meant? In case of normal branch instructions there is no specific reset instruction and EFLAGS is not marked live-in in successor blocks. How is that different from our case?

Perhaps it was a misunderstanding. :slight_smile: Anyway, if an instruction (CMOV_GR* in this case) clobbers the value in EFLAGS, this means that it is setting that value. In which case there can be instructions further down the line which use EFLAGS (they use the EFLAGS from the CMOV_GR*, right?).

To avoid further confusion, could you give an example of where you think it shouldn't be marked as live-in? (You can use 'llc -print-machineinstrs ...' to get the machine instruction dump.)

-bw

Well, the point is CMOV_GR* are marked clobbering EFLAGS conservatively just in case they turn out to be lowered into a sequence containing XOR %reg,%reg which indeed clobbers EFLAGS. This means there might not be any instruction which actually uses this EFLAGS value.

For an example we can look no further than the actual test which has been disabled after the fix (llvm/test/Codegen/X86/sink-hoist.ll, function zzz).

BB#0: derived from LLVM BB %entry
  %vreg0<def> = MOV8rm <fi#-2>, 1, %noreg, 0, %noreg; mem:LD1[FixedStack-2](align=4) GR8:%vreg0
  %vreg1<def> = AND8ri %vreg0, 127, %EFLAGS<imp-def,dead>; GR8:%vreg1,%vreg0
  %vreg2<def> = OR8ri %vreg0, -128, %EFLAGS<imp-def,dead>; GR8:%vreg2,%vreg0
  CMP8mi <fi#-1>, 1, %noreg, 0, %noreg, 0, %EFLAGS<imp-def>; mem:LD1[FixedStack-1](align=8)
  %vreg3<def> = CMOV_GR8 %vreg2<kill>, %vreg1<kill>, 4, %EFLAGS<imp-def,dead>, %EFLAGS<imp-use>; GR8:%vreg3,%vreg2,%vreg1
  %vreg4<def> = MOVZX32rr8 %vreg3<kill>; GR32:%vreg4 GR8:%vreg3
  %EAX<def> = COPY %vreg4; GR32:%vreg4
  RET

CMOV_GR8 instruction has EFLAGS use which is not marked kill, and that's why EFLAGS is marked live-in in the new blocks created when lowering the pseudo cmov. But besides that it also has an imp-def of EFLAGS marked dead. How can an instruction use and define the same register with def being dead and use not being killed? I believe this is happening just because proper liveness information is not calculated at that point, but anyway we can treat EFLAGS as if it was marked killed in this case, can't we?

So this is precisely my proposal: don't mark EFLAGS live-in in the new blocks in EmitLoweredSelect if EFLAGS is a dead def of the pseudo cmov.

Thanks for spelling it out, now I understand.

Well, the point is CMOV_GR* are marked clobbering EFLAGS conservatively just in case they turn out to be lowered into a sequence containing XOR %reg,%reg which indeed clobbers EFLAGS. This means there might not be any instruction which actually uses this EFLAGS value.

This actually looks like a leftover from some older code. EmitLoweredSelect() will never insert an xor, it just creates a branch+phi instruction. It is true that a 0 phi argument could eventually be rematerialized as an xor instruction, but X86InstrInfo::reMaterialize() checks checks for this and uses a mov instead if EFLAGS is live.

For an example we can look no further than the actual test which has been disabled after the fix (llvm/test/Codegen/X86/sink-hoist.ll, function zzz).

BB#0: derived from LLVM BB %entry
  %vreg0<def> = MOV8rm <fi#-2>, 1, %noreg, 0, %noreg; mem:LD1[FixedStack-2](align=4) GR8:%vreg0
  %vreg1<def> = AND8ri %vreg0, 127, %EFLAGS<imp-def,dead>; GR8:%vreg1,%vreg0
  %vreg2<def> = OR8ri %vreg0, -128, %EFLAGS<imp-def,dead>; GR8:%vreg2,%vreg0
  CMP8mi <fi#-1>, 1, %noreg, 0, %noreg, 0, %EFLAGS<imp-def>; mem:LD1[FixedStack-1](align=8)
  %vreg3<def> = CMOV_GR8 %vreg2<kill>, %vreg1<kill>, 4, %EFLAGS<imp-def,dead>, %EFLAGS<imp-use>; GR8:%vreg3,%vreg2,%vreg1
  %vreg4<def> = MOVZX32rr8 %vreg3<kill>; GR32:%vreg4 GR8:%vreg3
  %EAX<def> = COPY %vreg4; GR32:%vreg4
  RET

CMOV_GR8 instruction has EFLAGS use which is not marked kill, and that's why EFLAGS is marked live-in in the new blocks created when lowering the pseudo cmov. But besides that it also has an imp-def of EFLAGS marked dead. How can an instruction use and define the same register with def being dead and use not being killed? I believe this is happening just because proper liveness information is not calculated at that point, but anyway we can treat EFLAGS as if it was marked killed in this case, can't we?

You are exactly right.

So this is precisely my proposal: don't mark EFLAGS live-in in the new blocks in EmitLoweredSelect if EFLAGS is a dead def of the pseudo cmov.

That would work. Alternatively, don't mark the CMOV pseudos as clobbering EFLAGS, because they really don't. This would give the scheduler more freedom when the flags have multiple users.

We only have primitive liveness information before register allocation, so sometimes kill flags are missing. If you remove the EFLAGS clobber, you should check if the kill flag appears. If it doesn't, try to find out why it is missing.

Thanks for looking into this,
/jakob

Thank you very much for the explanation! That's what I wanted to know :slight_smile: