Hello,
I am currently writing my Master's Thesis on a topic regarding the analysis of memory safety and termination of LLVM programs.
Cool. Have you checked out the Memory Safety Menagerie?
This includes alignments in LLVM IR, but I am not sure if I understand their semantics correctly. I have written a program (see attachment) which uses the instruction
store i32 1, i32* %7, align 4
to store an integer at an address that I forced to be odd, and compiled it with clang. The result is that the integer is stored exactly there, which I expected for alignment 1 but not for alignment 4. Changing the alignment to any other value has no effect.
This leads to my questions:
- Do alignments provide additional semantics to be obeyed by the compiler or are they just hints that can be ignored?
According to the language reference manual, it is undefined behavior if the dynamic pointer value isn't aligned to the specified alignment. The alignment is a promise to the code generator rather than something it has to enforce: the code generator can assume that the address will be at the specified alignment so that it can generate more efficient code. This is useful on processors that have different memory access instructions for different word sizes (I think ARM is an example; I am sure there are others).
- What is the semantics of alignments from the perspective of an IR analyzer as opposed to an IR emitter?
From your perspective, there's undefined behavior if the address in the load or store isn't aligned to the specified boundary. That said, you can probably cheat a little bit and look at what the code generator will do. On x86, for example, the alignment probably doesn't matter, as nearly all x86 memory access instructions can access an address of any alignment; a misaligned access may simply be slower than necessary.
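To make the analyzer's view concrete, here is a minimal sketch in C of the check an analyzer could apply to a concrete address. The names is_power_of_two and access_is_aligned are made up for illustration and are not part of any LLVM API. I am also assuming the analyzer tracks addresses as 64-bit integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: alignment values in LLVM IR are powers of two,
   so the sketch rejects anything else. */
static bool is_power_of_two(uint64_t align) {
    return align != 0 && (align & (align - 1)) == 0;
}

/* Hypothetical analyzer check: does a concrete address satisfy the
   alignment that a load or store claims? If this returns false, the
   access is undefined behavior at the IR level, whatever the target
   hardware happens to do. */
static bool access_is_aligned(uint64_t addr, uint64_t align) {
    assert(is_power_of_two(align));
    /* For a power of two, addr % align == 0 is the same as having
       the low bits of addr cleared. */
    return (addr & (align - 1)) == 0;
}
```

With this check, your store with align 4 to an odd address would be flagged, while the same store with align 1 would pass.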
- Can you give me an example where wrong alignments lead to undefined behavior?
On some processors, there are different memory access instructions for accessing memory of different sizes and alignments. Using an address that isn't aligned properly can cause a fault. I think ARM does this.
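As a rough source-level illustration (a sketch, not taken from your attachment), the C code below builds an odd address the way your test program does. The commented-out line is the kind of access that is undefined behavior in C; I would expect a compiler such as clang to lower it to a store with align 4, but that is an assumption about typical output. The memcpy helpers, whose names are made up for this example, make no alignment claim and so correspond roughly to an access with align 1.

```c
#include <stdint.h>
#include <string.h>

/* Store without any alignment assumption: memcpy copies byte by byte
   as far as the language is concerned. */
static void store_i32_any_alignment(unsigned char *dst, int32_t value) {
    memcpy(dst, &value, sizeof value);
}

/* Matching load without any alignment assumption. */
static int32_t load_i32_any_alignment(const unsigned char *src) {
    int32_t value;
    memcpy(&value, src, sizeof value);
    return value;
}

/* Returns 1 if a value survives a round trip through an odd address. */
static int demo_odd_address_roundtrip(void) {
    int32_t storage[2] = {0, 0};            /* aligned for int32_t */
    unsigned char *odd = (unsigned char *)storage + 1;

    /* *(int32_t *)odd = 1;
       Undefined behavior in C: the pointer type promises an alignment
       that the address does not have. It may appear to work on x86
       and fault on a stricter target. */

    store_i32_any_alignment(odd, 1);        /* well defined */
    return load_i32_any_alignment(odd) == 1;
}
```

The point for your analysis is that the well-defined version and the undefined one can produce the same result on a forgiving machine, so observing the "right" value after compiling with clang does not show that the align 4 store was valid.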
Regards,
John Criswell