Trying to optimize small snippet

JesseJohnson · December 6, 2023, 6:39pm

I’ve got this snippet:

  %bool.fast.i.i = select i1 %cmp.i.i, { i64, ptr } { i64 2, ptr null }, { i64, ptr } { i64 1, ptr null }
  %0 = extractvalue { i64, ptr } %bool.fast.i.i, 0
  %null.i22.i = icmp eq i64 %0, 0
  br i1 %null.i22.i, label %call.next.i, label %call.exit.i

It seems like LLVM should be able to optimize away the icmp and the conditional br, but I can’t seem to get it to happen. I’ve thrown -O3 at it, as well as SROA and a bunch of other passes. Is there something here I’m missing, or do I need a specific analysis / transform?

efriedma-quic · December 6, 2023, 8:39pm

I wouldn’t expect a select with a struct result type to optimize well, in general; it’s legal, but nothing in clang/LLVM normally generates that construct, so optimizations don’t really know how to handle it. If this is coming out of your frontend, I’d suggest changing the way you generate code.

Maybe we could teach instcombine to split selects like this (so instead of one select with an { i64, ptr } result, we have two selects: one with an i64 result, and one with a ptr result).
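A hand-written sketch of what that split could look like for the snippet above. This is not the output of any existing pass; the function name `@split_example` and the value names `%tag`, `%p`, and `%isnull` are made up for illustration:

```llvm
; Illustrative only: one scalar select per struct field,
; instead of a single select producing { i64, ptr }.
define i64 @split_example(i1 %cmp) {
entry:
  %tag = select i1 %cmp, i64 2, i64 1          ; field 0 of the original struct
  %p = select i1 %cmp, ptr null, ptr null      ; field 1 of the original struct
  %isnull = icmp eq i64 %tag, 0                ; compares a select of two nonzero constants
  br i1 %isnull, label %call.next, label %call.exit

call.next:
  ret i64 0

call.exit:
  ret i64 %tag
}
```

In this form the compare reads a scalar select whose arms are both nonzero constants, which is the kind of pattern the usual scalar folds should be able to reduce to `false`, making the branch unconditional.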

JesseJohnson · December 6, 2023, 10:44pm

Of course. I would expect a pass to transform that into 2 selects. I would think SROA would do this, but does SROA not work when an alloca isn’t present? Or on constants?

JesseJohnson · December 6, 2023, 10:57pm

The source was a hand-written LLVM function containing a select on a struct, which had then been inlined. I’m now going through the code looking for that pattern (select i1 struct) to replace. Still, it was unexpected.

efriedma-quic · December 7, 2023, 4:14am

SROA, at its core, does just two things: splitting allocas into multiple pieces, and SSA construction on allocas. It doesn’t handle anything that isn’t an alloca.

I think instcombine has some transforms to split structs passed as values, but maybe just for load/store? I don’t recall exactly.

JesseJohnson · December 7, 2023, 11:50pm

Thanks. That is useful. It would be nice if SROA could work on selects without allocas. The replacement I used turns 1 line of IR into 4 lines per struct field, so 8 for a 2-field struct. That’s a lot of extra IR, though in many cases it can be optimized away.
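The exact replacement was not posted. One expansion that matches the 4-instructions-per-field count, assuming two extractvalues, one select, and one insertvalue per field, might look like this (the function name `@select_by_field` and all value names are hypothetical):

```llvm
; Assumed shape of the per-field rewrite of:
;   %r = select i1 %c, { i64, ptr } %a, { i64, ptr } %b
define { i64, ptr } @select_by_field(i1 %c, { i64, ptr } %a, { i64, ptr } %b) {
entry:
  ; field 0: extract from both operands, select, re-insert
  %a0 = extractvalue { i64, ptr } %a, 0
  %b0 = extractvalue { i64, ptr } %b, 0
  %s0 = select i1 %c, i64 %a0, i64 %b0
  %r0 = insertvalue { i64, ptr } poison, i64 %s0, 0
  ; field 1: same four steps
  %a1 = extractvalue { i64, ptr } %a, 1
  %b1 = extractvalue { i64, ptr } %b, 1
  %s1 = select i1 %c, ptr %a1, ptr %b1
  %r1 = insertvalue { i64, ptr } %r0, ptr %s1, 1
  ret { i64, ptr } %r1
}
```

When the operands are constants, as in the original snippet, the extractvalues fold away immediately, and any later extractvalue of `%r1` can be forwarded to `%s0` or `%s1` directly, which is presumably why much of the extra IR disappears after optimization.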
