[SPIR-V] Uses outside the selection region

Currently when deserializing a SPIR-V binary into MLIR the entire basic block associated with the selection op ends up being sunk into a spirv.mlir.selection, and it is treated as a header block of the region. However, this approach is too aggressive as it may result in variables that are used outside the selection region being sunk. As a result the rest of the code loses visibility of that variable and it causes an error, for example:

<unknown>:0: error: 'spirv.Variable' op failed control flow structurization: it has uses outside of the enclosing selection/loop construct
<unknown>:0: note: see current operation: %0 = "spirv.Variable"() <{storage_class = #spirv.storage_class<Function>}> : () -> !spirv.ptr<vec
tor<3xf32>, Function>

Example of such shader here.

Now I’m wondering what’s the best way to handle this. Personally, I came up with an idea to split conditional basic blocks (with some restrictions) to move a conditional branch into an individual block. Then, the sinking logic inside the deserializer will only sink the conditional check and leave the rest of the basic block outside the selection region. For example (in pseudo code):

label_0:
    inst_0
    inst_1
    inst_2
    branch_conditional ...

Becomes:

label_0:
    inst_0
    inst_1
    inst_2
    branch_always label_1
label_1:
    branch_conditional ...

I cannot see any immediate issues with such approach, and it is fairly high-level – modifying blocks instead of individual instructions – and it doesn’t not modify the sinking logic directly. The splitting may create some superfluous block when roundtripping mlir through mlir-translate, but that only impacts tests. I have implemented it in here.

Then, I came across a draft PR addressing the same issue, but differently here. If I understand correctly, the author pushes VariableOp outside the selection region after the block is sunk to avoid the problem. I can also see it being a good solution.

So, I am wondering whether anyone has any thoughts on what the best approach is here. I don’t want to create another PR solving the same problem, so I thought having a discussion here first will be a good idea, and then the best solution can be worked on until completion.