My goal is to create a language which can compile itself, therefore I
feel I need to understand the Assembler/Byte code format.
Starting with a C hello world program there are statements at the
beginning of the disassembled bc file that I couldn't find any
Sorry, this really should be in the LangRef.html file, but isn't for some reason.
These are "self-explanatory", but what are the possible values, and what
other items should I be prepared to output?
target endian = little
target pointersize = 32
Three possible options for each of these:
Note that if your front-end is "type safe", i.e. it's not possible to write bad pointer casts and your front-end isn't doing pointer arithmetic in a bad way, you shouldn't need these.
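As a rough illustration of how a front-end might treat these two lines as optional, here is a minimal Python sketch. The directive spelling is copied from the disassembly quoted above; the function name `target_lines` and the convention of passing `None` to leave a property out are assumptions made for this example, not part of any LLVM API.

```python
def target_lines(endian=None, pointersize=None):
    """Return the target header lines, skipping any property left unspecified.

    Hypothetical helper: a type-safe front-end could call it with no
    arguments and emit no target lines at all.
    """
    lines = []
    if endian is not None:
        lines.append(f"target endian = {endian}")
    if pointersize is not None:
        lines.append(f"target pointersize = {pointersize}")
    return lines


# Reproduce the two lines from the hello world disassembly.
print("\n".join(target_lines("little", 32)))
```

Calling `target_lines()` with no arguments returns an empty list, which corresponds to leaving both properties unspecified.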
target triple = "i686-pc-linux-gnu"
This is a standard target triple. The ones supported by the X86 backend are listed here:
deplibs = [ "c", "crtend" ]
This is information passed on to the linker to specify which libraries the code depends on. In this case, the linker will try to link in libc and libcrtend. Generally you want to add the runtime library for your language to this list. If your language has the capability to figure out which libraries are required by the user application, you can also have it add them here.
This is to support things like a hypothetical C compiler with a pragma in "math.h". Whenever the user code #includes math.h, this pragma would inform the linker to link in libm.
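To make the math.h idea concrete, here is a small Python sketch of a front-end building the deplibs line. The output format follows the sample line above; the `HEADER_LIBS` table and the `deplibs_line` function are hypothetical names invented for this example, standing in for whatever pragma mechanism a real front-end would use.

```python
# Hypothetical table: header name -> library the linker should pull in.
# In the scenario above, a pragma inside math.h would supply this entry.
HEADER_LIBS = {"math.h": "m"}


def deplibs_line(base_libs, included_headers):
    """Build a deplibs line from the base libraries plus any libraries
    implied by the headers the user code included (assumed mechanism)."""
    libs = list(base_libs)
    for header in included_headers:
        lib = HEADER_LIBS.get(header)
        if lib is not None and lib not in libs:
            libs.append(lib)
    quoted = ", ".join(f'"{lib}"' for lib in libs)
    return f"deplibs = [ {quoted} ]"


# User code that includes stdio.h and math.h picks up libm as well.
print(deplibs_line(["c", "crtend"], ["stdio.h", "math.h"]))
```

The base list is where a language's own runtime library would go, next to "c" and "crtend".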