AARCH64 code size regression between LLVM 6 and 7

I am developing in C for an extremely memory-constrained AARCH64 embedded environment. Somewhere between LLVM 6 and 7, I started seeing a code size regression in functions that make multiple accesses into a global struct. Specifically, these functions perform several reads/writes into that struct.

In older versions (5/6)

  • a single ADRP/ADD combo is issued at the beginning of a function to get my structure address into a register
  • that register is preserved throughout the function
  • subsequent accesses into this structure are done as LDR/STR with offset from the preserved register

In later versions (7/8)

  • the ADRP/ADD combo is performed every time I try to access something inside the struct.
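
To illustrate the access pattern, here is a minimal sketch of the kind of function I mean. It is not my real code and not the attached test case; `device_state`, `g_state`, and `record_event` are made-up names, and I have not checked that this exact snippet reproduces the regression.

```c
#include <stdint.h>

/* Hypothetical global state block; the fields are illustrative only. */
struct device_state {
    uint32_t status;
    uint32_t count;
    uint32_t flags;
    uint32_t last_error;
};

struct device_state g_state;

/* Several reads/writes into the same global struct within one function.
 * The behavior I would like: materialize &g_state once (ADRP + ADD), then
 * use LDR/STR with immediate offsets from that base register.
 * The behavior I am seeing in newer versions: the address is
 * re-materialized for individual accesses. */
void record_event(uint32_t error)
{
    g_state.count += 1;                      /* read-modify-write at offset 4 */
    g_state.last_error = error;              /* write at offset 12 */
    g_state.flags |= 1u;                     /* read-modify-write at offset 8 */
    g_state.status = g_state.count & 0xffu;  /* read offset 4, write offset 0 */
}
```

Since every field sits at a small fixed offset from the start of the struct, one base register is enough to reach all of them.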

The net result is slightly larger code, which could cause me problems. There are plenty of unused registers that could hold the address of my struct instead of it being constantly reloaded. My current suspicion is that later versions assume fewer registers are preserved across calls to other functions, and therefore don't rely on a register to keep holding the address of my struct. Assuming this is right, is there some way to encourage the behavior of the older versions?

Thanks,
Robert M

Hi,

I can’t provide my exact problematic code, but here is a trivial example of a .c file that produces different results in clang 6 vs 7. The compiled output from v7 generates two extra adrp instructions. I compiled just with “clang -Oz -c” for both versions.

Rob M

test.c (580 Bytes)

Whoops, forgot an important part of my compile flags:

“clang --target=aarch64-linux-elf -Oz -c”

I did a bit more poking, using llvm/clang versions 6-8. The IR in all cases appears fundamentally identical. I ran the IR generated by version 6 through llc on all three versions. llc-7/8 produced the extra ADRPs, llc-6 did not. So (to my untrained eyes), the IR is generated the same, it is in the IR->AARCH64 asm pass that the extra instructions are being generated.

Bit of a drive-by comment as I haven’t looked at the test case,
but could the tiny code model be helpful here (or is it perhaps related to that)?
Option -mcmodel=tiny was added not that long ago, possibly around that time.

Cheers,
Sjoerd.

IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

A good thought. While the tiny model (which I think originates in LLVM 8?) does compile for this trivial test case, I am unable to use it in my actual case because some of my addresses are too far away for ADR to reach. Additionally, even on this test case it still seems to generate more ADRPs than I would think necessary.

Yep, the tiny model comes with quite a few restrictions, but if your code fits, it could be helpful.
If you spot obvious missed cases for this test case, it may be worth raising a bug report on this. I think we would be interested to look at them.

Cheers,
Sjoerd.
