Machine Code for different architectures

Prakash_Premkumar · September 9, 2014, 6:09am

How does LLVM generate machine code for different architectures?
For example, the machine code for x86 and amd will vary.

How does LLVM convert its IR to machine code for different architectures? Can you please explain the approach? Is it just a matter of writing two different programs for two different architectures and passing a flag to the compiler based on which machine code you want to generate?

Thanks a lot for your explanations.

Thanks
Prakash

Bruce_Hoult1 · September 9, 2014, 6:21am

http://llvm.org/docs/WritingAnLLVMBackend.html
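As a rough illustration of the approach that guide describes: LLVM is a single compiler in which each architecture is a separate backend behind a common interface, and the backend is chosen at run time (for example with `llc -march=...` or `-mtriple=...`). The sketch below is a toy model only. `IrAdd`, `emitX86`, `emitArm`, `registry` and `compile` are made-up names, and the emitted text is invented pseudo-assembly, not real x86 or ARM syntax and not LLVM's API.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>

// Toy stand-in for one target-independent IR instruction: dst = lhs + rhs.
struct IrAdd {
    int dst, lhs, rhs;
};

// A "backend" lowers the same IR to its own (pseudo) assembly text.
using Backend = std::function<std::string(const IrAdd&)>;

// Hypothetical two-operand target: copy lhs into dst, then add rhs to it.
std::string emitX86(const IrAdd& i) {
    return "mov r" + std::to_string(i.dst) + ", r" + std::to_string(i.lhs) +
           "\nadd r" + std::to_string(i.dst) + ", r" + std::to_string(i.rhs);
}

// Hypothetical three-operand target: a single instruction is enough.
std::string emitArm(const IrAdd& i) {
    return "add r" + std::to_string(i.dst) + ", r" + std::to_string(i.lhs) +
           ", r" + std::to_string(i.rhs);
}

// All backends live in one registry inside the same compiler binary.
const std::map<std::string, Backend>& registry() {
    static const std::map<std::string, Backend> backends = {
        {"x86", emitX86}, {"arm", emitArm}};
    return backends;
}

// The "flag" from the question corresponds to the arch name looked up here.
std::string compile(const std::string& arch, const IrAdd& inst) {
    auto it = registry().find(arch);
    if (it == registry().end())
        throw std::invalid_argument("unknown target: " + arch);
    return it->second(inst);
}
```

Calling `compile("arm", inst)` and `compile("x86", inst)` on the same `IrAdd` gives different output text from one program, which is the essential point: the front end and optimizer are shared, and only the final lowering step is per-target.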

Matthew_Gardiner · September 9, 2014, 6:41am

Hi,

We have some DSP architectures (kalimba) which have 24-bits as their
"minimum addressable unit". So this means that the sizeof a char (and
an int and a short for that matter) is 24-bits.

I quickly read the posted link WritingAnLLVMBackend.html but did not
see an obvious answer to the following question:

Is it possible to write a backend that faithfully represents these
architectures, or is sizeof_byte==8 bits baked into LLVM?

Does anyone have any views on the above?

thanks
Matthew Gardiner

Johnny_Val · September 9, 2014, 10:49am

Hi Matthew,

The byte==8 bits is more of a Clang issue rather than an LLVM issue. I believe your bigger issue will be the fact that you would need to make i24’s a legal type in your backend, which as far as I know (unless something has changed recently), is a big job. I briefly looked into it at one point, and decided to leave it for another day.

I am also unsure how deeply byte == 8 bits is baked into Clang. You might want to ask cfe-dev.

I hope that helps.

Johnny

Matthew_Gardiner · September 9, 2014, 12:20pm

Hi Johnny,

Thanks for this - particularly the tip about cfe-dev. I'm currently
trying to coerce lldb to debug these types of architectures (our
current toolchain already outputs good DWARF info). However, I'm
struggling since lldb just assumes that the size of a byte is
universally 8 bits. At some stage, I *think* we'd like to
derive a compiler from the "same code-base" (i.e. llvm), and I
wondered how tricky this would be.

Matt

Patrik_Hagglund · September 9, 2014, 6:55pm

Hi Matt,

We maintain a set of patches to the LLVM codebase for 16-bit bytes, and non-power-of-2 register sizes. Support for non-power-of-2 register sizes and the addition of new machine value types, such as i24, is a "medium-sized" patch set. Support for 16-bit bytes is a quite large patch set, and it may be even larger after cleanup. (Usually, we just need to parameterize the byte size, or replace a size in bytes with a size in bits, but for a few instances, we currently have switch statements, handling 8-bit and 16-bit byte size separately, and leaving other sizes unimplemented.)
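The idea of parameterizing the byte size, or carrying sizes in bits instead of bytes, can be sketched as below. This is only an illustration under assumptions: `storeSizeInBytes` is a hypothetical helper written for this example, not a function from the patch set or from upstream LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: how many addressable units ("bytes") are needed to
// store a value of `sizeInBits` bits on a target whose byte is
// `bitsPerByte` bits wide. Keeping the size in bits until the last moment
// avoids hard-coding the assumption that a byte is 8 bits.
inline std::uint64_t storeSizeInBytes(std::uint64_t sizeInBits,
                                      unsigned bitsPerByte) {
    assert(bitsPerByte > 0 && "a byte must contain at least one bit");
    // Round up so that a partially filled byte still counts as one byte.
    return (sizeInBits + bitsPerByte - 1) / bitsPerByte;
}
```

With this helper, a 24-bit value occupies 3 bytes on an 8-bit-byte target, 2 bytes on a 16-bit-byte target, and 1 byte on a 24-bit-byte target such as the DSP discussed earlier in the thread.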

We currently don't use the code from any other LLVM project, such as clang or lldb.

Our plan is to someday clean up the LLVM patches, and submit them upstream. In the meantime, I can provide them upon request.

/Patrik Hägglund

Patrik_Hagglund · September 9, 2014, 7:31pm

Hi Matt,

Another problem, specific to DWARF, is that in some places we want to use the "target byte size" (for all target-dependent debug information) and in other places an 8-bit byte size (for the target-independent encoding of data - DWARF4 chapter 7). This means that, for us, some assembler-emitting functions need to be able to handle _both_ 8-bit bytes and 16-bit bytes. We have solved this by changing some parameters so that they can optionally pass sizes in bits, rather than bytes. However, support for debug information is far from finished for our target.
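A minimal sketch of why passing sizes in bits helps here: one emitting routine can then serve both the 8-bit encoding and the target-byte encoding, with the caller choosing the unit width. The function name `splitIntoUnits` and the least-significant-unit-first ordering are assumptions made for this example; they are not taken from the patches or from the DWARF specification.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: split `value`, which is `sizeInBits` bits wide, into
// units of `unitBits` bits each, least significant unit first (assumed
// ordering). Passing unitBits = 8 gives the target-independent octet view;
// passing the target's byte width gives the target-byte view.
std::vector<std::uint32_t> splitIntoUnits(std::uint64_t value,
                                          unsigned sizeInBits,
                                          unsigned unitBits) {
    assert(unitBits > 0 && unitBits <= 32);
    assert(sizeInBits <= 64 && sizeInBits % unitBits == 0);
    const std::uint64_t mask = (std::uint64_t{1} << unitBits) - 1;
    std::vector<std::uint32_t> units;
    for (unsigned shift = 0; shift < sizeInBits; shift += unitBits) {
        units.push_back(static_cast<std::uint32_t>((value >> shift) & mask));
    }
    return units;
}
```

The same 32-bit value is four units when viewed as 8-bit bytes and two units when viewed as 16-bit bytes, so a caller that only knows "4 bytes" or "2 bytes" would be ambiguous, while "32 bits" is not.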

/Patrik Hägglund

Matthew_Gardiner · September 10, 2014, 6:15am

Hi Patrik,

Thanks for this note. It's encouraging to read that there has been some
provision made for non-8-bit bytes. I'm not a compiler/backend expert
(although maybe I'll need to be soon!), so I won't look at the patches
right now. However, at some stage in the future a colleague or I
may request these patches from you.

Yes, our 24-bit architectures have non-power-of-2 register sizes.

When you mentioned the addition of i24 - would that allow this
architecture to claim that its bytes are 24 bits in size? (Sorry for
being vague, but I'm a debugger person currently...)

thanks again for your help,
Matt

Matthew_Gardiner · September 10, 2014, 6:17am

Hi Patrik,

I see what you mean - certain areas of code need to have 2 notions of
byte-size - for the target and for the host.

Matt

Patrik_Hagglund · September 10, 2014, 6:39am

Hi Matt,

When you mentioned the addition of i24 - would that allow this
architecture to claim that its bytes are 24 bits in size?

No, adding i24 to the set of machine value types (MVTs) just says that we may have a register of that size (which our target has).

Our solution for specifying the byte size is to extend the DataLayout class with a new field, BitsPerByte, and a corresponding entry ("B16") in the datalayout string. This sets the byte size to 8 by default, and to 16 for our target. The BitsPerByte field is then used to parameterize the byte size all over the code.
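To make the datalayout idea concrete, here is a small self-contained sketch of parsing such an entry. It is not the actual patch: `ToyDataLayout` and `parseDataLayout` are hypothetical names, the "B<n>" entry is the extension described in this post rather than something assumed to exist upstream, and every other entry in the string is simply ignored.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Toy stand-in for the extended DataLayout described above.
struct ToyDataLayout {
    unsigned bitsPerByte = 8;  // default when no "B<n>" entry is present
};

// Parse a dash-separated datalayout-style string and pick out a "B<n>"
// entry, e.g. "B16". Entries this sketch does not understand are skipped.
ToyDataLayout parseDataLayout(const std::string& spec) {
    ToyDataLayout layout;
    std::istringstream in(spec);
    std::string entry;
    while (std::getline(in, entry, '-')) {
        if (entry.size() > 1 && entry[0] == 'B') {
            // std::stoul throws if the text after 'B' is not a number.
            layout.bitsPerByte =
                static_cast<unsigned>(std::stoul(entry.substr(1)));
        }
    }
    return layout;
}
```

For example, `parseDataLayout("e-p:16:16-B16").bitsPerByte` would be 16, while a string without a "B" entry leaves the default of 8. Code elsewhere would then read `bitsPerByte` instead of using a literal 8.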

/Patrik Hägglund

Matthew_Gardiner · September 10, 2014, 7:40am

Hi Patrik

Thanks again. I'll catch up with you later (I hope!), when a colleague
or I need any of those patches, or further advice.

thanks
Matt
