How to distinguish between padding and real struct fields?

Lqs66 · April 11, 2024, 7:23am

In some cases, the struct/class definitions in LLVM IR contain byte-array padding fields. I want to distinguish these padding fields from the real fields. How can I do that?

For example, consider the following two cases:
1.

struct A
{
    int a;
    char b;
    long c;
} __attribute__((packed, aligned(4)));

struct A a;

%struct.A = type <{ i32, i8, i64, [3 x i8] }>
@a = dso_local global %struct.A zeroinitializer, align 4

2.

struct A
{
    int a;
    char b;
    long c;
};

struct A a;
class B : A {
    char a;
};

B b;

%struct.A = type { i32, i8, i64 }
%class.B = type <{ %struct.A, i8, [7 x i8] }>

@a = dso_local global %struct.A zeroinitializer, align 8
@b = dso_local global %class.B zeroinitializer, align 8
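
As a side check on the first example, the trailing [3 x i8] can be reproduced from the source-level layout. This is only a minimal sketch: it assumes GCC or Clang (for the attribute syntax) and an LP64 target where long is 8 bytes, and the names PackedA and tailPaddingOfPackedA are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>

// Same fields and attributes as the first example.
// The name PackedA is hypothetical, chosen only for this sketch.
struct PackedA {
    int a;
    char b;
    long c;
} __attribute__((packed, aligned(4)));

// Number of bytes between the end of the last member and the end of the
// struct, i.e. the tail padding that shows up as a byte array in the IR.
std::size_t tailPaddingOfPackedA() {
    const std::size_t endOfLastMember = offsetof(PackedA, c) + sizeof(long);
    assert(endOfLastMember <= sizeof(PackedA));
    return sizeof(PackedA) - endOfLastMember;
}
```

Under those assumptions the members sit at byte offsets 0, 4 and 5, the last member ends at byte 13, and aligned(4) rounds the size up to 16, which leaves 3 bytes of tail padding.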

pogo59 · April 11, 2024, 1:46pm

I suspect there is no reliable way to identify padding at the IR level. You could try looking for bytes that are not accessed, but that won’t identify everything (accesses may be widened, copying a whole struct may copy the padding, etc).

I expect the only way to robustly identify padding is in the frontend, RecordLayoutBuilder.

Lqs66 · April 11, 2024, 2:14pm

Thank you for your reply.
Do you mean I need to modify the Clang frontend to get information about padding?
How would I get that information and attach it to the IR?

pogo59 · April 11, 2024, 3:18pm

Someone more familiar with the frontend would have to answer these questions.

Once you know where the padding is, what do you plan to do with that knowledge? If you explained your goals it might be easier to help.

Lqs66 · April 11, 2024, 4:00pm

Thanks for your reply. I would like to get the mapping between C++ source class members and the elements of the IR struct type.

struct A
{
    int a;
    char b;
    long c;
};

%struct.A = type <{ i32, i8, i64, [3 x i8] }>

For example, a corresponds to i32, b corresponds to i8, c corresponds to i64.

pogo59 · April 11, 2024, 5:53pm

The debug-info metadata can provide that kind of information, although maybe not as directly as you would prefer.

@a = global %struct.A zeroinitializer, align 4, !dbg !0

!0 = !DIGlobalVariableExpression(var: !1, expr: !DIExpression())
!1 = distinct !DIGlobalVariable(name: "a", scope: !2, file: !3, line: 6, type: !5, isLocal: false, isDefinition: true)
!2 = distinct !DICompileUnit(language: DW_LANG_C11, file: !3, producer: "clang version 17.0.6 (PS5 clang version 9.00.0.501 cdbd5f6a)", isOptimized: false, runtimeVersion: 0, emissionKind: FullDebug, globals: !4, splitDebugInlining: false, debugInfoForProfiling: true, nameTableKind: None)
!3 = !DIFile(filename: "padding.c", directory: "D:\\Dev\\ours\\scratch", checksumkind: CSK_MD5, checksum: "766809f898304fd25444f40ce70e17fa")
!4 = !{!0}
!5 = distinct !DICompositeType(tag: DW_TAG_structure_type, name: "A", file: !3, line: 2, size: 128, align: 32, elements: !6)
!6 = !{!7, !9, !11}
!7 = !DIDerivedType(tag: DW_TAG_member, name: "a", scope: !5, file: !3, line: 3, baseType: !8, size: 32)
!8 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!9 = !DIDerivedType(tag: DW_TAG_member, name: "b", scope: !5, file: !3, line: 4, baseType: !10, size: 8, offset: 32)
!10 = !DIBasicType(name: "char", size: 8, encoding: DW_ATE_signed_char)
!11 = !DIDerivedType(tag: DW_TAG_member, name: "c", scope: !5, file: !3, line: 5, baseType: !12, size: 64, offset: 40)
!12 = !DIBasicType(name: "long", size: 64, encoding: DW_ATE_signed)

Variable @a points to the debug-info description at !0. This points to the variable at !1, which has its type at !5, which points to the list of members, and so on.

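Sizes and offsets of members are in bits. There are no member descriptions for padding, so you can derive the size and location of padding by finding the parts of the struct that no member covers. In this example the struct is 128 bits and the members cover bits 0 through 103, so the last 24 bits (3 bytes) are padding.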
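
The "not covered by members" idea can be sketched in plain C++ without any LLVM headers. The code below is an illustration, not LLVM API: Member, Gap, findPadding and paddingOfA are hypothetical names, and the offsets and sizes are copied by hand from the DW_TAG_member entries in the metadata above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// One DW_TAG_member entry: name, offset and size, all in bits.
struct Member {
    std::string name;
    std::uint64_t offsetBits;
    std::uint64_t sizeBits;
};

// A run of bits that no member covers.
struct Gap {
    std::uint64_t offsetBits;
    std::uint64_t sizeBits;
};

// Returns the uncovered bit ranges of a struct that is structSizeBits wide.
std::vector<Gap> findPadding(std::vector<Member> members,
                             std::uint64_t structSizeBits) {
    std::sort(members.begin(), members.end(),
              [](const Member &l, const Member &r) {
                  return l.offsetBits < r.offsetBits;
              });
    std::vector<Gap> gaps;
    std::uint64_t cursor = 0; // first bit not yet known to be covered
    for (const Member &m : members) {
        assert(m.offsetBits + m.sizeBits <= structSizeBits);
        if (m.offsetBits > cursor)
            gaps.push_back({cursor, m.offsetBits - cursor});
        // max() keeps overlapping members (e.g. in a union) from moving back.
        cursor = std::max(cursor, m.offsetBits + m.sizeBits);
    }
    if (cursor < structSizeBits)
        gaps.push_back({cursor, structSizeBits - cursor});
    return gaps;
}

// Members !7, !9 and !11 from the metadata above; the struct is 128 bits.
std::vector<Gap> paddingOfA() {
    return findPadding({{"a", 0, 32}, {"b", 32, 8}, {"c", 40, 64}}, 128);
}
```

For the metadata shown, paddingOfA() yields a single gap starting at bit 104 that is 24 bits long, which lines up with a 3-byte tail array in the IR type.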

I am not deeply familiar with the APIs for navigating the debug info. If I were working on a project like this, I’d probably look first at the IR verifier to see how it walks the tree of debug-info metadata.
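
Once the member offsets have been read out of the metadata, matching them to IR struct elements might look like the sketch below. It is a simplified illustration under stated assumptions: the IR struct is packed (as in %struct.A = type <{ i32, i8, i64, [3 x i8] }> from the original question), element sizes are supplied by hand, and bit-fields are not handled. SourceMember, IRElement, mapElements and mapStructA are hypothetical names. In a real tool the element offsets would more likely come from LLVM's DataLayout/StructLayout than from manual accumulation.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Source-level member as read from a DW_TAG_member node (offset in bits).
struct SourceMember {
    std::string name;
    std::uint64_t offsetBits;
};

// IR struct element: printed type and size in bytes.
// Sizes are filled in by hand here; that is an assumption of this sketch.
struct IRElement {
    std::string type;
    std::uint64_t sizeBytes;
};

// For each element of a packed IR struct, return the name of the member that
// starts at the same byte offset, or an empty string when no member starts
// there (which suggests the element is padding).
std::vector<std::string> mapElements(const std::vector<IRElement> &elems,
                                     const std::vector<SourceMember> &members) {
    std::vector<std::string> owner;
    std::uint64_t offsetBytes = 0; // packed: elements are laid out back to back
    for (const IRElement &e : elems) {
        std::string name;
        for (const SourceMember &m : members) {
            assert(m.offsetBits % 8 == 0); // bit-fields are not handled
            if (m.offsetBits / 8 == offsetBytes)
                name = m.name;
        }
        owner.push_back(name);
        offsetBytes += e.sizeBytes;
    }
    return owner;
}

// Elements of the packed %struct.A and the member offsets from the metadata.
std::vector<std::string> mapStructA() {
    return mapElements({{"i32", 4}, {"i8", 1}, {"i64", 8}, {"[3 x i8]", 3}},
                       {{"a", 0}, {"b", 32}, {"c", 40}});
}
```

With these inputs the first three elements map to a, b and c, and the fourth element has no matching member, so it would be classified as padding.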
