[RFC] Add DebugLoc parameter in Instruction’s Create() functions

Many DI-related bugs are caused by missing Debug Location
in an instruction created in a transformation. Most of the
time the fix is trivial once you found where the culprit
instruction is created (https://reviews.llvm.org/D50263).
Currently, when you create a new Instruction, in order to
give it DL you have to either use an IRBuilder that is
previously set to the correct DL or “manually” create
the instruction via one of it’s Create() routines and then
call `setDebugLoc()` to it.

I propose the addition of a DebugLoc parameter
in the *::Create() instruction constructors.
This could be in the form of a pure DebugLoc
variable or possibly an `Instruction *InheritLocationFromInst` one

Some pros of this idea are:
- Easier to create instructions with debug location.
- It will make the code more readable by eliminating
the many `NewInst->setDebugLoc(OldInst->getDebugLoc)` calls.
- It will incentivize people to think about the DebugLoc of
their newly created instruction and where it should come from.

As the Create() functions are widely used, and sometimes
location data could be out of scope when the Instruction
is created, I think the best approach to this is to
introduce the new field as optional that defaults to
empty DL (as it is now) to avoid huge refactoring,
and to allow more time for testing individual changes.
Refactoring could then be done in steps and the
parameter could become mandatory in some ::Create()
constructors as we judge fit.

With that in mind here are some cons:
- Incomplete transition of APIs.
- Lack of infrastructure to check that the locations
supplied to *::Create are correct.

What do you think?

Many DI-related bugs are caused by missing Debug Location
in an instruction created in a transformation. Most of the
time the fix is trivial once you found where the culprit
instruction is created (⚙ D50263 [TRE][DebugInfo] Preserve Debug Location in new branch instruction).
Currently, when you create a new Instruction, in order to
give it DL you have to either use an IRBuilder that is
previously set to the correct DL or “manually” create
the instruction via one of it’s Create() routines and then
call `setDebugLoc()` to it.

I propose the addition of a DebugLoc parameter
in the *::Create() instruction constructors.
This could be in the form of a pure DebugLoc
variable or possibly an `Instruction *InheritLocationFromInst` one

Some pros of this idea are:
- Easier to create instructions with debug location.
- It will make the code more readable by eliminating
the many `NewInst->setDebugLoc(OldInst->getDebugLoc)` calls.
- It will incentivize people to think about the DebugLoc of
their newly created instruction and where it should come from.

As the Create() functions are widely used, and sometimes
location data could be out of scope when the Instruction
is created, I think the best approach to this is to
introduce the new field as optional that defaults to
empty DL (as it is now) to avoid huge refactoring,
and to allow more time for testing individual changes.
Refactoring could then be done in steps and the
parameter could become mandatory in some ::Create()
constructors as we judge fit.

I generally agree with the sentiment that the API should make is easier to do the right thing, but I wonder if making the DILocation an optional<> doesn't undermine this effort.

In the Swift compiler we recently had a very similar discussion about the right interface for building its SIL intermediate representation (which is conceptually very similar to LLVM IR, just at a higher level of abstraction). One idea that was put forward was that for different use-cases we need different interfaces. For example, for a frontend like Clang, the interface where you set the DILocation once and then generate sequence of instructions for that location makes a lot of sense. For a transformation, that expands one instruction into a lowered sequence of instructions, this would also be fine. But for transformations that clone instructions into multiple places an interface like the one you are proposing makes more sense. Even for transformations such as inlining that need to transform each DILocation your proposed interface would be a good fit.

In short what I want to say is that it might make sense to have two interfaces to IRBuilder that are tailored towards the specific needs of the very different use-cases (expanding, cloning) that IRBuilder is used for.

-- adrian

Might be a lot of functions to change - maybe a wrapper would be easier:

template
T *applyDebugLoc(T *t, DebugLoc L) { t->setDebugLoc(L); return t; }

Or change/improve/modify/add setDebugLoc to return the Instruction* so it could be chained? (NewInst = Foo::Create(…).applyDebugLoc(L); )

Many DI-related bugs are caused by missing Debug Location
in an instruction created in a transformation. Most of the
time the fix is trivial once you found where the culprit
instruction is created (https://reviews.llvm.org/D50263).
Currently, when you create a new Instruction, in order to
give it DL you have to either use an IRBuilder that is
previously set to the correct DL or “manually” create
the instruction via one of it’s Create() routines and then
call setDebugLoc() to it.

I propose the addition of a DebugLoc parameter
in the *::Create() instruction constructors.
This could be in the form of a pure DebugLoc
variable or possibly an Instruction *InheritLocationFromInst one

Some pros of this idea are:

  • Easier to create instructions with debug location.
  • It will make the code more readable by eliminating
    the many NewInst->setDebugLoc(OldInst->getDebugLoc) calls.
  • It will incentivize people to think about the DebugLoc of
    their newly created instruction and where it should come from.

As the Create() functions are widely used, and sometimes
location data could be out of scope when the Instruction
is created, I think the best approach to this is to
introduce the new field as optional that defaults to
empty DL (as it is now) to avoid huge refactoring,
and to allow more time for testing individual changes.
Refactoring could then be done in steps and the
parameter could become mandatory in some ::Create()
constructors as we judge fit.

I generally agree with the sentiment that the API should make is easier to do the right thing, but I wonder if making the DILocation an optional<> doesn’t undermine this effort.

How would it do that?

In the Swift compiler we recently had a very similar discussion about the right interface for building its SIL intermediate representation (which is conceptually very similar to LLVM IR, just at a higher level of abstraction). One idea that was put forward was that for different use-cases we need different interfaces. For example, for a frontend like Clang, the interface where you set the DILocation once and then generate sequence of instructions for that location makes a lot of sense. For a transformation, that expands one instruction into a lowered sequence of instructions, this would also be fine. But for transformations that clone instructions into multiple places an interface like the one you are proposing makes more sense. Even for transformations such as inlining that need to transform each DILocation your proposed interface would be a good fit.

In short what I want to say is that it might make sense to have two interfaces to IRBuilder that are tailored towards the specific needs of the very different use-cases (expanding, cloning) that IRBuilder is used for.

Is the suggestion here to add debug location parameters to every method in IRBuilder? How does this tie back to changing the way debug locations are applied wherever we use static Instruction constructors?

vedant