Machine Code for different architectures

How does LLVM generate machine code for different architectures?
For example, the machine code for x86 and amd will vary.

How does LLVM convert its IR to machine code for different architectures? Can you please explain the approach? Is it just a matter of writing two different programs for two different architectures and passing a flag to the compiler to select which machine code to generate?

Thanks a lot for your explanations.



We have some DSP architectures (kalimba) which have 24 bits as their
"minimum addressable unit". So this means that the size of a char (and
of an int and a short, for that matter) is 24 bits.
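
In C and C++ terms, the "minimum addressable unit" is what CHAR_BIT reports, and sizeof counts in those units. The following is only a minimal sketch of how a program can ask its compiler for these sizes; the helper name storageBits is made up for illustration. On a target like the one described above, one would expect CHAR_BIT to be 24 and sizeof(char), sizeof(short) and sizeof(int) all to be 1.

```cpp
#include <climits>
#include <cstddef>

// Storage size of T in bits, as seen by the compiler for the current target.
// sizeof counts addressable units ("bytes"); CHAR_BIT is the width of one unit.
template <typename T>
constexpr std::size_t storageBits() {
    return sizeof(T) * CHAR_BIT;
}

// Both properties are guaranteed by the language on every target.
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");
static_assert(CHAR_BIT >= 8, "a byte has at least 8 bits");
```

On a typical 8-bit-byte host, storageBits<char>() evaluates to 8; nothing in the language forces that value.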

I quickly read the posted link WritingAnLLVMBackend.html but did not
see an obvious answer to the following question:

Is it possible to write a backend that faithfully represents these
architectures, or is sizeof_byte==8 bits baked into LLVM?

Does anyone have any views on the above?

Matthew Gardiner

Hi Matthew,

The byte == 8 bits assumption is more of a Clang issue than an LLVM issue. I believe your bigger issue will be that you would need to make i24 a legal type in your backend, which, as far as I know (unless something has changed recently), is a big job. I briefly looked into it at one point, and decided to leave it for another day.

I am also unsure how deeply byte == 8 bits is baked into Clang. You might want to ask cfe-dev.

I hope that helps.


Hi Johnny,

Thanks for this - particularly the tip about cfe-dev. I'm currently
trying to coerce lldb to debug these types of architectures (our
current toolchain already outputs good DWARF info). However, I'm
struggling since lldb just assumes that the size of a byte is
universally 8 bits. I *think* at some stage we'd like to
derive a compiler from the "same code-base" (i.e. llvm), and I
wondered how tricky this would be.


Hi Matt,

We maintain a set of patches to the LLVM codebase for 16-bit bytes and non-power-of-2 register sizes. Support for non-power-of-2 register sizes and the addition of new machine value types, such as i24, is a "medium-sized" patch set. Support for 16-bit bytes is quite a large patch set, and it may be even larger after cleanup. (Usually, we just need to parameterize the byte size, or replace a size in bytes with a size in bits, but in a few instances we currently have switch statements that handle 8-bit and 16-bit byte sizes separately and leave other sizes unimplemented.)
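
To make "parameterize the byte size" concrete, here is a minimal sketch. It is not taken from the patch set: the function names and signatures are hypothetical. It only illustrates replacing a hard-coded divide-by-8 with a divide by a byte-width parameter.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of addressable units ("bytes") needed to hold
// sizeInBits bits when each unit is bitsPerByte bits wide (rounds up).
inline std::uint64_t storeSizeInUnits(std::uint64_t sizeInBits,
                                      unsigned bitsPerByte) {
    assert(bitsPerByte > 0 && "a byte must hold at least one bit");
    return (sizeInBits + bitsPerByte - 1) / bitsPerByte;
}

// The hard-coded form this replaces, i.e. (sizeInBits + 7) / 8.
inline std::uint64_t storeSizeInOctets(std::uint64_t sizeInBits) {
    return storeSizeInUnits(sizeInBits, 8);
}
```

With this shape, a 32-bit value takes 4 units on an 8-bit-byte target and 2 units on a 16-bit-byte target, and no switch over specific byte sizes is needed.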

We currently don't use the code from any other LLVM project, such as clang or lldb.

Our plan is to someday clean up the LLVM patches and submit them upstream. In the meantime, I can provide them upon request.

/Patrik Hägglund

Hi Matt,

Another problem, specific to DWARF, is that in some places we want to have "target byte size" (for all target-dependent debug information) and in other places 8-bit byte size (for target-independent encoding of data - DWARF4 chapter 7). This means that, for us, some assembler-emitting functions need to be able to handle _both_ 8-bit bytes and 16-bit bytes. We have solved this by optionally changing some parameters to pass sizes in bits rather than bytes. However, support for debug information is far from finished for our target.
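
As an illustration of passing sizes in bits, the sketch below splits a value into units of a caller-chosen width, so the same routine can serve both the 8-bit and the target-byte case. The function name, its signature and the least-significant-unit-first order are assumptions made for this example, not details of the actual patches.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical emitter helper: split the low sizeInBits bits of value into
// units of unitBits bits each, least significant unit first (assumed order).
std::vector<std::uint64_t> splitIntoUnits(std::uint64_t value,
                                          unsigned sizeInBits,
                                          unsigned unitBits) {
    assert(unitBits > 0 && unitBits < 64 && "unit width must be 1..63 bits");
    assert(sizeInBits <= 64 && sizeInBits % unitBits == 0 &&
           "size must be a whole number of units");
    const std::uint64_t mask = (std::uint64_t{1} << unitBits) - 1;
    std::vector<std::uint64_t> units;
    for (unsigned shift = 0; shift < sizeInBits; shift += unitBits)
        units.push_back((value >> shift) & mask);
    return units;
}
```

Called with unitBits = 8 it yields four units for a 32-bit value; with unitBits = 16 it yields two, without the caller converting the size to bytes first.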

/Patrik Hägglund

Hi Patrik,

Thanks for this note. It's encouraging to read there has been some
provision made for non-8-bit bytes. I'm not a compiler/backend expert
(although maybe I'll need to be soon!), so I won't look at the patches
right now. However, at some stage in the future a colleague or I
may request these patches from you.

Yes, our 24-bit architectures have non-power-of-2 register sizes.

When you mentioned addition of i24 - would that facilitate this
architecture to claim that its bytes are 24 bits in size? (Sorry for
being vague, but I'm a debugger person currently...)

Thanks again for your help,

Hi Patrik,

I see what you mean - certain areas of code need to have 2 notions of
byte-size - for the target and for the host.


Hi Matt,

When you mentioned addition of i24 - would that facilitate this
architecture to claim that its bytes are 24 bits in size?

No, adding i24 to the set of machine value types (MVTs) just says that we may have a register of that size (which our target has).

Our solution for specifying the byte size is to extend the DataLayout class with a new field, BitsPerByte, and a corresponding entry ("B16") in the datalayout string. This sets the byte size to 8 by default and to 16 for our target. The BitsPerByte field is then used to parameterize the byte size all over the code.
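
The idea of a "B16" entry can be sketched in a few lines. This is not the patched DataLayout class: the struct, the parser and the example strings are hypothetical, and only the BitsPerByte name, the "B<n>" form and the default of 8 come from the description above.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical stand-in for the extended DataLayout: models the byte size only.
struct ByteSizeLayout {
    unsigned BitsPerByte = 8;  // default when the string has no "B<n>" entry
};

// Scan a '-'-separated layout string for an entry of the form "B<digits>".
ByteSizeLayout parseByteSize(const std::string &layout) {
    ByteSizeLayout result;
    std::istringstream stream(layout);
    std::string spec;
    while (std::getline(stream, spec, '-')) {
        if (spec.size() < 2 || spec[0] != 'B')
            continue;  // some other entry; ignored in this sketch
        bool allDigits = true;
        for (std::size_t i = 1; i < spec.size(); ++i)
            if (!std::isdigit(static_cast<unsigned char>(spec[i])))
                allDigits = false;
        if (!allDigits)
            continue;
        const unsigned long bits = std::stoul(spec.substr(1));
        if (bits > 0)
            result.BitsPerByte = static_cast<unsigned>(bits);
    }
    return result;
}
```

Code that previously assumed 8 would then read BitsPerByte from the layout object instead, for example when converting a size in bits to a size in bytes.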

/Patrik Hägglund

Hi Patrik

Thanks again. I'll catch up with you later (I hope!), when a colleague or I
require any of those patches, or further advice.