getting code ranges of multiple blocks and prevent reordering?

Jay_K1 · May 29, 2018, 7:52pm

Hi. I'm very new to LLVM.

For reasons to do with custom exception handling, we have a need to check IP/PC at runtime against code ranges. This can encompass multiple logically adjacent blocks.

How to do this?

I'm guessing:
Insert a label at the end of every block, take its address, and store that somewhere in our data, preferably as an offset from the module or function start, though a full address with a relocation would also work.
Take the address of the start of every block, similarly.
As I understand it, that is not allowed for the function-entry block, so add an extra dummy block as the entry that branches to the actual start.

And then, somehow, ask LLVM not to reorder anything in functions that have such labels?

Does this make sense and can anyone fill in details?

Also, of course, some reordering would be OK, as long as the start of the first block comes first and the end of the last block comes last. The order of the blocks in between doesn't matter.
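
The runtime check described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `CodeRange`, `findScope`, the scope ids, and the use of half-open offsets from the function start are hypothetical names and conventions, not anything LLVM provides, and the table is assumed to be sorted by start offset with no overlapping entries.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical table entry: a half-open range [begin, end) of code offsets
// measured from the function start, tagged with a scope id of our choosing.
struct CodeRange {
    std::uint32_t begin;
    std::uint32_t end;
    int scope;
};

// Return the scope id whose range contains pcOffset, or -1 if none does.
// Assumes the table is sorted by begin and the ranges do not overlap.
int findScope(const std::vector<CodeRange>& table, std::uint32_t pcOffset) {
    assert(std::is_sorted(table.begin(), table.end(),
                          [](const CodeRange& a, const CodeRange& b) {
                              return a.begin < b.begin;
                          }));
    // Find the first range that starts strictly after pcOffset ...
    auto it = std::upper_bound(
        table.begin(), table.end(), pcOffset,
        [](std::uint32_t pc, const CodeRange& r) { return pc < r.begin; });
    if (it == table.begin()) {
        return -1;  // pcOffset lies before every range
    }
    // ... so the only candidate is the range just before it.
    --it;
    return pcOffset < it->end ? it->scope : -1;
}
```

A lookup like this only gives the right answer if the recorded offsets describe the code as it was finally laid out, which is exactly the property in question here.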

I realize there is another approach: assign our own numbers to scopes and keep a local volatile integer that we update as we enter and leave scopes, i.e. "NT/x86 EH style". But that is a bigger and less efficient change compared to what we have (a non-LLVM, less-optimizing codegen that lets us do what I describe).
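
For comparison, the scope-number approach might look something like the sketch below. It is a simplified illustration, not how NT/x86 EH is actually implemented: `Frame`, `ScopeMarker`, and the convention that -1 means "no active scope" are all hypothetical, and using an RAII helper to restore the previous value is my own assumption.

```cpp
#include <cassert>

// Hypothetical per-function state: the id of the innermost active scope,
// or -1 when no scope is active. It is volatile so that every update is
// actually stored to memory where a handler could read it.
struct Frame {
    volatile int scope = -1;
};

// Hypothetical helper that records a scope id on entry and restores the
// enclosing scope's id on exit.
class ScopeMarker {
public:
    ScopeMarker(Frame& frame, int id) : frame_(frame), saved_(frame.scope) {
        assert(id >= 0);
        frame_.scope = id;
    }
    ~ScopeMarker() { frame_.scope = saved_; }

    ScopeMarker(const ScopeMarker&) = delete;
    ScopeMarker& operator=(const ScopeMarker&) = delete;

private:
    Frame& frame_;
    int saved_;
};
```

Each scope entry and exit costs a store here, which is the extra overhead compared with a purely table-driven lookup on the PC.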

Thank you,
- Jay

rnk · May 30, 2018, 6:00pm

The short answer is no, there is no way to just ask LLVM not to reorder things. There isn’t a good principled way to do this, because the approach relies on the assumption that LLVM will lay the code out the same way that you gave it to it.

At source level, the user usually writes something like this establishing a scoped try region:

    try {
      *p = 42;
      if (p)
        a = fp_x / fp_y;
      if (a)
        might_throw(p, a);
    } catch (...) {
    }

In LLVM, the backend could easily decide to move that conditional code out of line to the end of the function if its heuristics identify it as cold code. That means that it’s not sufficient to do things like emitting labels in inline assembly at the try region start and try region end, or the equivalent with new intrinsics.

To do this reliably, you need to insert labels around every potentially throwing operation, which is what CodeGen does for the invoke LLVM IR instruction. Then, after code layout is done, you can walk over the code to find the labels you inserted before, and coalesce consecutive regions of potentially throwing instructions. Typically, you build a table called an LSDA, which contains relocations against your labels.

You can look at the code in llvm/lib/CodeGen/AsmPrinter/EHStreamer.cpp, WinException.cpp, and DwarfCFIException.cpp to find code that implements this logic.
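
As a rough illustration of the coalescing step described above, here is a minimal sketch. It is not the code from those files: `LabelledRegion`, `coalesce`, and the integer handler ids are hypothetical, and it assumes that every potentially throwing operation appears in the input, sorted by its offset in the final layout, so that any gap between two entries contains only code that cannot throw.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record for one labelled, potentially throwing operation.
// Offsets are measured from the function start after final code layout.
struct LabelledRegion {
    std::uint32_t begin;  // label emitted just before the operation
    std::uint32_t end;    // label emitted just after the operation
    int handler;          // id of the handler that covers the operation
};

// Merge entries that are consecutive in layout order and share a handler.
// Assumes the input is sorted by begin, does not overlap, and lists every
// potentially throwing operation in the function.
std::vector<LabelledRegion> coalesce(const std::vector<LabelledRegion>& sorted) {
    std::vector<LabelledRegion> out;
    for (const LabelledRegion& r : sorted) {
        assert(r.begin <= r.end);
        assert(out.empty() || out.back().end <= r.begin);
        if (!out.empty() && out.back().handler == r.handler) {
            // Same handler as the previous entry: extend that entry.
            out.back().end = r.end;
        } else {
            // Different handler (or first entry): start a new table row.
            out.push_back(r);
        }
    }
    return out;
}
```

Because the merge runs on the post-layout order, a block that was moved out of line simply ends up as its own row rather than being assumed to sit inside the original try region.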

Jay_K1 · May 30, 2018, 9:41pm

> the assumption that LLVM will lay the code out

In another project, using gcc, we have:

flag_reorder_blocks = false; /* breaks our exception handling? */
flag_reorder_blocks_and_partition = false; /* breaks our exception handling? */

It might be nice to have that in LLVM.
It appears gcc actually uses these itself for similar reasons.
i.e. doesn’t allow those optimizations when supporting exceptions.

> to the end of the function

Or even further away, i.e. cold page not just cold cache line.
Windows supports that at least with “chained” unwind.

Otherwise, understood.

Thank you,

Jay
