GSoC2019 - DebugInfo should not affect codegen

Hi all,

I'm interested in participating in GSoC this year with LLVM. I would like to contribute to the project idea "DebugInfo should not affect codegen". Over the past few days, I have gone through some bugs reported in Bugzilla in which the presence of debug info changes the code generated for the same program.

Bugs I have been looking at are:

1. [[fuzzDI] -O1 + -g cause the generated code to change.](https://bugs.llvm.org/show_bug.cgi?id=37306)

2. [Make llvm passes debug info invariant](https://bugs.llvm.org/show_bug.cgi?id=37728)

3. [LoopVectorize: different IR generation when debug info is present](https://bugs.llvm.org/show_bug.cgi?id=37727)

4. [InstCombine: test/Transforms/InstCombine/call-guard.ll looks different with debug info present](https://bugs.llvm.org/show_bug.cgi?id=37714)

I'm currently working on finding the root cause of these issues. I have some questions regarding the scope of the project.

   1) Is the scope to find and fix as many of these issues as possible, or could I instead design some kind of protocol that constrains how optimization passes interact with debug
      info metadata? That way we could generalize the idea and draw a clear distinction between opt passes that operate on LLVM IR and those that operate on debug metadata.

I would highly appreciate your thoughts on this.
Thanks

I don't want to discourage you from trying, but having spent a fair
amount of time looking at these issues and reporting bugs, I would like
to point out that:
1) There's probably enough work for a summer (or a year, or more) just
fixing the existing issues. You don't need to have anything extremely
sophisticated. After you're done with the reported bugs you can use
`-debugify`, as Vedant did in some PRs he reported. Alternatively,
you can just run `csmith` or any C/C++ code generator out there and
compare the output with and without `-g`. Mind you, the number of
reports might be overwhelming :)
2) "Designing a protocol to impose constraints on optimizations" is
definitely a laudable goal, but it's really hard to
achieve in practice. We have a verifier to make sure debug
information doesn't mess with debug information, but making sure that
DI is correct is a much harder problem to solve. Also, if we decide to
go that route, it might end up requiring substantial surgery at the IR
level, which might not be feasible in a single summer.
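
The with/without `-g` comparison from point 1 could be sketched roughly as below. This is only an illustration under a few assumptions: `clang` and `opt` are on `PATH`, the comparison is done on textual LLVM IR rather than on final assembly, and debug info is removed with `opt -strip-debug` before diffing. The helper names (`emit_ir`, `strip_debug`, `diff_ir`) are made up for this example, and some leftover noise (for instance module-level metadata) may still need filtering by hand.

```python
import difflib
import subprocess
import sys


def emit_ir(source, opt_level="-O1", debug=False, clang="clang"):
    """Compile `source` to textual LLVM IR and return it as a string."""
    cmd = [clang, opt_level, "-S", "-emit-llvm", "-o", "-", source]
    if debug:
        cmd.insert(1, "-g")  # same command line, plus debug info
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def strip_debug(ir, opt="opt"):
    """Remove debug info from IR text by piping it through `opt -strip-debug`."""
    cmd = [opt, "-strip-debug", "-S", "-o", "-", "-"]
    return subprocess.run(
        cmd, input=ir, check=True, capture_output=True, text=True
    ).stdout


def diff_ir(plain, stripped):
    """Return unified-diff lines between two IR texts (empty list if equal)."""
    return list(difflib.unified_diff(
        plain.splitlines(), stripped.splitlines(),
        fromfile="without -g", tofile="with -g, stripped", lineterm=""))


def main(source):
    # Both variants go through `opt` so they are printed the same way.
    plain = strip_debug(emit_ir(source))
    with_g = strip_debug(emit_ir(source, debug=True))
    delta = diff_ir(plain, with_g)
    print("\n".join(delta) if delta else "no difference found")
    return 1 if delta else 0


# Only runs the toolchain when invoked as: python compare_g.py file.c
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv[1]))
```

A non-empty diff is only a hint that `-g` may have changed the optimized IR for that input. Each hit would still need to be reduced and checked by hand before being reported.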

Best,

Hi David,

Thanks a lot for your key insights; they really helped me understand the complexity of this problem and the effort the community has already put into it. The protocol design was just an idea, and I hadn't thought about how difficult it would be. Of course, I agree with you that we want a clear set of tasks that can be done within 3 months.