I would like to implement macro debug info support in LLVM.
Below you will find 4 parts:
Background on what it means to debug macros.
A brief explanation of how macro debug info is represented in DWARF 4.0.
The suggested design.
A full example: Source → AST → LLVM IR → DWARF.
Feel free to skip the first two parts if you already know the background.
Please let me know if you have any comments or feedback on this approach.
There are two kinds of macro definitions:
Simple macro definition, e.g. #define M1 Value1
Function macro definition, e.g. #define M2(x, y) (x) + (y)
A macro's scope starts at its “#define” directive and ends at the matching “#undef” directive (or at the end of the translation unit if there is none).
GDB supports debugging macros: it can evaluate a macro expression for any macro whose scope covers the current breakpoint.
GDB command: print M2(3, 5)
GDB Result: 8
GDB evaluates the macro expression based on the “.debug_macinfo” section (DWARF 4.0).
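To make the scope rule concrete, here is a small sketch. The value 42 for M1 and the function names are made up for illustration. A debugger stopped inside sumInScope() is within the scope of M2, while one stopped inside productAfterUndef() is not, because the “#undef” comes first.

```cpp
#include <cassert>

#define M1 42                 // simple macro (42 is a placeholder value)
#define M2(x, y) (x) + (y)    // function macro from the text above

int simpleValue() { return M1; }

// Inside the scope of M2: M2(3, 5) expands to (3) + (5).
int sumInScope() { return M2(3, 5); }

#undef M2                     // the scope of M2 ends here

// After the #undef the name is free again, so this is an ordinary function.
int M2(int x, int y) { return x * y; }

// Outside the scope of the macro: this calls the function above.
int productAfterUndef() { return M2(3, 5); }

// Small self-check of the two cases.
void checkScopes() {
  assert(sumInScope() == 8);
  assert(productAfterUndef() == 15);
}
```

With macro debug info available, “print M2(3, 5)” at a breakpoint in sumInScope() would be expected to use the macro definition, matching the result of 8 shown above.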
[DWARF 4.0 “.debug_macinfo” section]
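In this section there are 4 kinds of entries: DW_MACINFO_define, DW_MACINFO_undef, DW_MACINFO_start_file, and DW_MACINFO_end_file.
Note: There is a 5th kind of entry for vendor-specific macro information (DW_MACINFO_vendor_ext), which we do not need to support.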
The first two entries contain the line number where the macro is defined or undefined and a null-terminated string holding the macro name. For a definition, the name is followed by the replacement value; for a function macro definition, it is followed by the parameter list and then the replacement value.
The third entry contains the line number where the file was included, followed by the file id (an index into the file table in the debug line section).
The fourth entry carries no data; it just closes the most recent open entry of the third kind (start_file).
Macro definition and file inclusion entries must appear in the same order as they appear in the source file. All macro entries between a “start_file” entry and its “end_file” entry represent macros that appear directly or indirectly in the included file.
The main source file should be the first “start_file” entry in the sequence, and should have line number “0”.
Command line/Compiler definitions must also have line number “0” but must appear before the first “start_file” entry.
Command line include files must also have line number “0” but appear right after the “start_file” entry of the main source.
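The ordering rules above can be sketched as a tiny encoder. The type codes, the ULEB128 encoding of line numbers and file indices, and the terminating zero type code follow DWARF 4. Everything else is an assumption made for illustration: the helper names, the file indices 1 and 2, the command line macro CMD, and the translation unit itself (main.c defines M1 on line 1, includes a.h on line 2, and undefines M1 on line 3; a.h defines M2 on line 1).

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Entry type codes defined by DWARF 4 for .debug_macinfo.
enum : uint8_t {
  DW_MACINFO_define = 0x01,
  DW_MACINFO_undef = 0x02,
  DW_MACINFO_start_file = 0x03,
  DW_MACINFO_end_file = 0x04,
};

using Bytes = std::vector<uint8_t>;

// Append an unsigned LEB128 value.
void emitULEB128(Bytes &out, uint64_t value) {
  do {
    uint8_t byte = value & 0x7f;
    value >>= 7;
    if (value != 0)
      byte |= 0x80;  // more bytes follow
    out.push_back(byte);
  } while (value != 0);
}

// define/undef entry: type, line number, null-terminated string.
void emitMacro(Bytes &out, uint8_t type, uint64_t line,
               const std::string &text) {
  assert(type == DW_MACINFO_define || type == DW_MACINFO_undef);
  out.push_back(type);
  emitULEB128(out, line);
  out.insert(out.end(), text.begin(), text.end());
  out.push_back(0);
}

// start_file entry: type, line of the include, file index.
void emitStartFile(Bytes &out, uint64_t line, uint64_t fileIndex) {
  out.push_back(DW_MACINFO_start_file);
  emitULEB128(out, line);
  emitULEB128(out, fileIndex);
}

// end_file entry: type only.
void emitEndFile(Bytes &out) { out.push_back(DW_MACINFO_end_file); }

// Hypothetical unit built with -DCMD=1 (file indices are made up).
Bytes buildExample() {
  Bytes out;
  emitMacro(out, DW_MACINFO_define, 0, "CMD 1");  // before first start_file
  emitStartFile(out, 0, 1);                       // main.c, line 0
  emitMacro(out, DW_MACINFO_define, 1, "M1 Value1");
  emitStartFile(out, 2, 2);                       // #include "a.h" on line 2
  emitMacro(out, DW_MACINFO_define, 1, "M2(x,y) (x) + (y)");
  emitEndFile(out);                               // closes a.h
  emitMacro(out, DW_MACINFO_undef, 3, "M1");
  emitEndFile(out);                               // closes main.c
  out.push_back(0);                               // zero type code ends the unit
  return out;
}
```

Note the string form of the function macro: DWARF 4 asks for no whitespace between the name and the parameter list or between parameters, and exactly one space before the replacement text.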
To support macros, the following components need to be modified: Clang, LLVM IR, and the DWARF debug emitter.
In clang, we need to handle these source directives: #define, #undef, and #include.
The idea is to make use of the “PPCallbacks” class, which lets the preprocessor notify the parser each time one of the above directives occurs.
These are the callbacks that should be implemented:
“MacroDefined”, “MacroUndefined”, “FileChanged”, and “InclusionDirective”.
The AST will be extended to support two new DECL types: “MacroDecl” and “FileIncludeDecl”.
A “FileIncludeDecl” may contain other “FileIncludeDecl” and “MacroDecl” nodes.
These two new AST DECLs are not part of TranslationUnitDecl and are handled separately (see AST example below).
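The nesting described above could be built from preprocessor events roughly as follows. This is only a sketch: MacroRecorder, Node, and their member functions are hypothetical stand-ins with simplified signatures, not the real clang “PPCallbacks” interface and not the proposed AST classes. It shows one way a stack of open files can turn a flat stream of define, undef, file enter, and file exit events into a tree.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for MacroDecl / FileIncludeDecl (hypothetical, simplified).
struct Node {
  enum Kind { Define, Undef, FileInclude } kind;
  unsigned line;
  std::string text;  // macro text, or file name for FileInclude
  std::vector<std::unique_ptr<Node>> children;  // used by FileInclude only
};

class MacroRecorder {
public:
  void macroDefined(unsigned line, std::string text) {
    add(Node::Define, line, std::move(text));
  }
  void macroUndefined(unsigned line, std::string name) {
    add(Node::Undef, line, std::move(name));
  }
  // Called when the preprocessor enters a file; includeLine is the line
  // of the directive that pulled it in (0 for the main source).
  void fileEntered(unsigned includeLine, std::string file) {
    openFiles.push_back(add(Node::FileInclude, includeLine, std::move(file)));
  }
  // Called when the preprocessor leaves the current file.
  void fileExited() {
    assert(!openFiles.empty());
    openFiles.pop_back();
  }
  const std::vector<std::unique_ptr<Node>> &topLevel() const { return roots; }

private:
  // Attach a new node to the innermost open file, or to the top level.
  Node *add(Node::Kind kind, unsigned line, std::string text) {
    auto node = std::make_unique<Node>();
    node->kind = kind;
    node->line = line;
    node->text = std::move(text);
    Node *raw = node.get();
    auto &siblings = openFiles.empty() ? roots : openFiles.back()->children;
    siblings.push_back(std::move(node));
    return raw;
  }

  std::vector<std::unique_ptr<Node>> roots;
  std::vector<Node *> openFiles;  // stack of files currently being read
};
```

Macros recorded before any file is entered (for example command line definitions) end up at the top level, which matches where the DWARF rules place them.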
In the LLVM IR, metadata debug info will be extended to support new DIs as well:
“DIMacro”, “DIFileInclude”, and “MacroNode”.
The last is needed because we cannot use DINode as the base class of the “DIMacro” and “DIFileInclude” nodes.
DIMacro will contain:
· type (definition/undefinition).
· line number (integer).
· name (null terminated string).
· replacement value (null terminated string - optional).
DIFileInclude will contain:
· line number (integer).
· file (DIFile).
· macro list (MacroNodeArray) - optional.
In addition, DICompileUnit will get a new optional macro list field of type MacroNodeArray.
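To show how these nodes would map onto the DWARF entry order, here is a rough model. The structs below only borrow the proposed names and fields; they are plain C++ stand-ins, not LLVM metadata classes, and the textual output format is invented for readability. Storing “M2(x,y)” as the name with the replacement text as the value is also an assumption.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Common base so one array can hold both node kinds.
struct MacroNode {
  virtual ~MacroNode() = default;
  // Append this node's entries, in emission order, as readable text.
  virtual void emit(std::vector<std::string> &out) const = 0;
};
using MacroNodeArray = std::vector<std::unique_ptr<MacroNode>>;

struct DIMacro : MacroNode {
  bool isDefinition;   // definition or undefinition
  unsigned line;
  std::string name;
  std::string value;   // optional replacement value

  DIMacro(bool isDefinition, unsigned line, std::string name,
          std::string value = "")
      : isDefinition(isDefinition), line(line), name(std::move(name)),
        value(std::move(value)) {
    assert(!this->name.empty());
  }

  void emit(std::vector<std::string> &out) const override {
    std::string entry = isDefinition ? "define " : "undef ";
    entry += std::to_string(line) + " " + name;
    if (!value.empty())
      entry += " " + value;
    out.push_back(entry);
  }
};

struct DIFileInclude : MacroNode {
  unsigned line;
  std::string file;       // stands in for a DIFile reference
  MacroNodeArray macros;  // optional nested macro list

  DIFileInclude(unsigned line, std::string file)
      : line(line), file(std::move(file)) {}

  // A file becomes a start_file/end_file pair wrapped around its children.
  void emit(std::vector<std::string> &out) const override {
    out.push_back("start_file " + std::to_string(line) + " " + file);
    for (const auto &node : macros)
      node->emit(out);
    out.push_back("end_file");
  }
};
```

A depth-first walk over the compile unit's macro list then yields the entries in source order, with each “end_file” closing the nearest open “start_file”.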
Finally, I assume that macro support should be disabled by default, and there should be a flag to enable this feature. I would say that we should introduce a new specific flag, e.g. “-gmacro”, that could be used with “-g”.
Here is an example that demonstrates macro support from Source → AST → LLVM IR → DWARF.