How to optimize stores of constant arrays

Hi all, I have this problem:

I’m using LLVM’s C++ API, mostly IRBuilder<>, to generate code. Some of the generated code just stores constant data to a location provided as a function argument, something like ir_builder.CreateStore(get_default_data(), ptrValue), where get_default_data() may return a ConstantArray of i8 and ptrValue is a function argument. The generated assembly looks inefficient: it contains a separate instruction for storing each byte relative to the pointer. Each instruction takes several bytes, and I am trying to optimize for code size.

The question: what should I do to make the generated code more size-efficient (think -Os)?

Current ideas, which may not include the best answer:

  • Store the data as-is in the IR, and use a MemCpyInst instead of the store when get_default_data() returns something big?
  • Add a custom pass (or use any available built-in one) that transforms the store instructions into memcpy when appropriate?

Thanks for any help,
Boldizsar

You could create a statically initialized global variable containing the constant data, and then insert a call to memcpy to copy it over to your destination.

-Krzysztof

Regarding the memcpy intrinsic: it will be expanded into individual loads/stores when the amount of data is small, depending on your target settings.
In any case, the two approaches you list are pretty much equivalent. Use existing intrinsics whenever you can; otherwise you'd need to add code to handle your own constructs in the codegen.

-Krzysztof