GSoC project on finding and fixing bugs with debug info

Hi,

I would like to learn more about the GSoC project to find and fix bugs that change the generated code when it is compiled with the debug info option (http://llvm.org/OpenProjects.html#gsoc17). I have built llvm with clang and compiled the hello.c program from http://llvm.org/docs/GettingStarted.html with and without the -g flag. I disassembled both using llvm-dis and took the diff of them. The diff file does show differences (see attached) and, as I understand it, this shouldn’t be the case.

I would like to know the following about this project.

  • What is the scope of this project? Are the bugs that cause these differences, and the number of them, already known? If not, what should the deliverable of the project be?
  • Which are the test programs that can be used to discover these bugs?

Thanks in advance!

diff.txt (1.34 KB)

The llvm-dis tool will disassemble a file into LLVM IR. Clang normally compiles to target machine instructions. You want to disassemble the compiled object into machine instructions for your target. llvm-objdump can do this, or the GNU tools objdump and (if your target uses ELF) readelf. We are specifically looking for differences in the machine instructions, in the ‘.text’ sections.
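As a rough sketch of that comparison, assuming clang and llvm-objdump are on your PATH: the script below compiles one source file with and without -g, disassembles only the .text section of each object, and compares the instruction streams. The helper names (normalize, disassemble_text, compile_object), the output file names, and the choice of -O2 are my own for illustration; they are not part of any LLVM tooling.

```python
import re
import subprocess
import sys

# "       0:<tab>pushq<tab>%rbp"  -> strip the leading address
ADDR_PREFIX = re.compile(r"^\s*[0-9a-f]+:\s*")
# "0000000000000000 <main>:"      -> keep only the symbol label
LABEL = re.compile(r"^[0-9a-f]+\s+(<.+>:)$")


def normalize(disasm):
    """Reduce disassembler output to symbol labels and instructions."""
    out = []
    for line in disasm.splitlines():
        line = line.rstrip()
        # Skip blank lines and the header, which contains the file name.
        if not line or "file format" in line:
            continue
        m = LABEL.match(line)
        if m:
            out.append(m.group(1))
            continue
        out.append(ADDR_PREFIX.sub("", line))
    return out


def disassemble_text(obj_path):
    """Disassemble only the .text section of an object file."""
    result = subprocess.run(
        ["llvm-objdump", "-d", "--no-show-raw-insn", "-j", ".text", obj_path],
        check=True, capture_output=True, text=True)
    return normalize(result.stdout)


def compile_object(source, output, extra_flags):
    # -O2 is an assumption; any optimization level can be checked this way.
    subprocess.run(
        ["clang", "-O2", "-c", source, "-o", output, *extra_flags],
        check=True)


if __name__ == "__main__" and len(sys.argv) > 1:
    src = sys.argv[1]
    compile_object(src, "nodebug.o", [])
    compile_object(src, "debug.o", ["-g"])
    same = disassemble_text("nodebug.o") == disassemble_text("debug.o")
    print("identical .text" if same else "MISMATCH: -g changed the code")
```

Saved under a name of your choosing (say compare_text.py), it would be run as `python3 compare_text.py hello.c`. Only the instructions are compared, so the extra .debug_* sections that -g is expected to add do not count as a difference.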

There may be one or two existing bug reports in this area, for example bug 22344. But the main point of the project is to find new examples, and either reduce the test cases to reasonable size and report them as new bugs, or fix them.

I believe the primary value of this project is that you will be forced to learn to understand the behavior of many different parts of the compiler. Differences that are triggered by debug info might start anywhere. It could begin in an IR optimization pass that does not account for debug information properly; it could be in some machine-function pass, which might be target-independent or target-specific. The nature of how LLVM encodes some debug information, by introducing IR instructions and machine-IR instructions, means that these differences could arise nearly anywhere.
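To make that concrete, here is a toy model (not LLVM code) of one way such a difference can arise: a pass makes a decision based on instruction count and forgets that debug intrinsics such as llvm.dbg.value are also instructions. The function names, the threshold, and the instruction strings are invented for illustration.

```python
def is_debug(inst):
    # In this toy model, debug info appears as calls to llvm.dbg.* intrinsics.
    return inst.startswith("call void @llvm.dbg.")


def is_small_buggy(body, threshold=3):
    # Counts every instruction, so -g can push a function over the limit.
    return len(body) <= threshold


def is_small_fixed(body, threshold=3):
    # Ignores debug intrinsics, so the decision is the same with and without -g.
    return sum(1 for inst in body if not is_debug(inst)) <= threshold


plain = ["%a = add i32 %x, 1", "%b = mul i32 %a, 2", "ret i32 %b"]
with_g = [
    "%a = add i32 %x, 1",
    "call void @llvm.dbg.value(...)",
    "%b = mul i32 %a, 2",
    "call void @llvm.dbg.value(...)",
    "ret i32 %b",
]

print(is_small_buggy(plain), is_small_buggy(with_g))
print(is_small_fixed(plain), is_small_fixed(with_g))
```

The buggy check gives different answers for the two bodies, so whatever transformation depends on it would produce different code under -g; the fixed check gives the same answer for both.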

I think the project description notes that there is a reasonably large body of source code in the LLVM ‘test-suite’ project. These programs could be used as a starting point in the search for differences introduced by -g. But probably any reasonably sized project that you can build using clang would provide you with some examples.

Please write back if you have additional questions. I am very interested in having someone take on this project, although for personal reasons I do not think I can mentor someone this summer.

–paulr