How to implement passing/returning aggregates by value in registers

Hi all,
Some time ago I asked on LLVMdev how to implement
passing/returning aggregates by value in registers. I was told it should
be implemented in the front end (FE), but to be honest, after some digging
in the source code I couldn’t find what I need.

My target’s ABI says that aggregates should be padded to a multiple of 32 bits
and placed in as many argument slots as needed.
The first 8 argument slots are always passed in registers and the rest on the stack.
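
To make the argument rule above concrete, here is a minimal sketch of the slot
assignment in plain C++. It does not use any LLVM or Clang API. The names
(kSlotBytes, kRegSlots, ArgLocation, assignSlots) are hypothetical and exist
only for illustration. It also assumes that an aggregate straddling the
8-slot boundary is split between registers and stack, which the ABI text
above does not state explicitly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical constants derived from the ABI description in this post.
constexpr std::size_t kSlotBytes = 4;  // one argument slot is 32 bits
constexpr std::size_t kRegSlots = 8;   // the first 8 slots are registers

// Where one by-value argument ends up (illustrative type, not an LLVM one).
struct ArgLocation {
  std::size_t firstSlot;     // index of the first slot it occupies
  std::size_t numSlots;      // size after padding, in 32-bit slots
  std::size_t slotsInRegs;   // how many of those slots are registers
  std::size_t slotsOnStack;  // how many of those slots are on the stack
};

// Pad a size in bytes up to a whole number of 32-bit slots.
constexpr std::size_t slotsFor(std::size_t sizeInBytes) {
  return (sizeInBytes + kSlotBytes - 1) / kSlotBytes;
}

// Assign consecutive slots to the arguments, in order.
// ASSUMPTION: an argument may be split across the register/stack boundary.
std::vector<ArgLocation> assignSlots(const std::vector<std::size_t>& argSizes) {
  std::vector<ArgLocation> out;
  std::size_t next = 0;  // next free slot index
  for (std::size_t size : argSizes) {
    const std::size_t n = slotsFor(size);
    const std::size_t inRegs =
        next < kRegSlots ? std::min(n, kRegSlots - next) : 0;
    assert(inRegs <= n);
    out.push_back({next, n, inRegs, n - inRegs});
    next += n;
  }
  return out;
}
```

For example, with argument sizes of 4, 10 and 24 bytes, the third argument
would start at slot 4 and need 6 slots, so under the splitting assumption
4 of its slots would be registers and 2 would be on the stack.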

Return values of up to 32 bytes are likewise placed in up to 8 registers.
Values larger than 32 bytes are returned in a buffer allocated by the caller.
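
The return-value rule can be sketched the same way. Again this is plain C++
with hypothetical names (ReturnKind, ReturnInfo, classifyReturn), not an LLVM
or Clang interface. It assumes 32-bit return registers, which is what
32 bytes spread over 8 registers suggests.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical constants derived from the ABI description in this post.
constexpr std::size_t kRetRegBytes = 4;   // ASSUMPTION: 32-bit registers
constexpr std::size_t kMaxRetBytes = 32;  // at most 8 registers' worth

enum class ReturnKind {
  InRegisters,    // value is returned directly in registers
  IndirectBuffer  // caller allocates a buffer and the callee fills it
};

struct ReturnInfo {
  ReturnKind kind;
  std::size_t numRegs;  // registers used; 0 for IndirectBuffer
};

// Decide how a by-value return of the given size is passed back.
ReturnInfo classifyReturn(std::size_t sizeInBytes) {
  if (sizeInBytes > kMaxRetBytes) {
    return {ReturnKind::IndirectBuffer, 0};
  }
  // Round up to whole registers, mirroring the 32-bit padding of arguments.
  ReturnInfo info{ReturnKind::InRegisters,
                  (sizeInBytes + kRetRegBytes - 1) / kRetRegBytes};
  assert(info.numRegs <= kMaxRetBytes / kRetRegBytes);
  return info;
}
```

Under these assumptions a 10-byte struct would come back in 3 registers,
a 32-byte struct in all 8, and a 33-byte struct through the caller's buffer.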

Could anyone advise me on where/how this could be done, or where
in the source code I should look?

Thanks for your help,
Artur