Are there implicit rules or conventions for an llvm frontend to generate llvm IR?

Hi, this question might be a bit silly: apart from the LLVM Language
Reference Manual, are there additional rules that a regular LLVM frontend
follows when generating LLVM IR?

Here are a few things I observed with clang/llvm-gcc/dragonegg when
compiling *C* source code to LLVM IR:

1. It seems that there is ONLY ONE ReturnInst (and NO InvokeInst) in such
LLVM IR; is it legal to add other *ReturnInst*s when transforming?

2. Is it possible for a frontend to generate a function whose CFG is
something like:

        bb0
       /   \
     bb1   bb2
    /   \ /   \
  bb3   bb4   bb5
    \    |    /
     \   |   /
      \  |  /
        bb6

(In this case, if I understand correctly, bb4 is control dependent on both
bb1 and bb2.)
I think it is at least possible in theory, and there is a simple case:

int foo(int i) {
  if (i < 0) {
    if (i % 2 == 0) {
      i += 1;
    } else {
      i += 2;
    }
  } else {
    if (i % 2 == 0) {
      i += 1;
    } else {
      i += 2;
    }
  }
  return 0;
}

However, none of the frontends I used generates basic blocks like
that (there are always one or more basic blocks generated) /without any
optimizations/. So are there any implicit rules for these frontends?

And can I rely on these cases when I ONLY deal with C source code?

Thanks!

1. It seems that there is ONLY ONE ReturnInst(and NO InvokeInst) for such
llvm IR; is it legal to add other *ReturnInst*s when transforming?

An LLVM function can have multiple ReturnInsts as long as each one terminates a basic block. There is a transform (UnifyExitNodes, IIRC) that will take a function with multiple ReturnInsts and create one with a single ReturnInst. Having a single ReturnInst (exit node) simplifies other analyses.
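To make the point about multiple returns concrete, here is a minimal C sketch. The function `sign` is a hypothetical example, not something from the thread. As far as I know, an unoptimized Clang build typically stores the result in a return slot and branches to one shared return block, which would explain the single ReturnInst observed above; IR that keeps one `ret` per exit is equally valid. The pass mentioned above appears in the LLVM sources as UnifyFunctionExitNodes, if I recall the name correctly.

```c
/* Hypothetical helper (not from the thread): three source-level exits.
 * A frontend or a transformation may emit one `ret` per exit, or funnel
 * all of them into a single return block; both forms are legal IR as
 * long as each `ret` terminates its basic block. */
int sign(int x) {
    if (x < 0)
        return -1;  /* first exit */
    if (x == 0)
        return 0;   /* second exit */
    return 1;       /* third exit */
}
```

Comparing the output of `clang -S -emit-llvm` at `-O0` and at `-O1` on a function like this is one way to see whether the single-return shape survives optimization.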

2. Is it possible for a frontend to generate a function whose CFG is
something like:

        bb0
       /   \
     bb1   bb2
    /   \ /   \
  bb3   bb4   bb5
    \    |    /
     \   |   /
      \  |  /
        bb6

(In this case, if I understand correctly, bb4 is control dependent on both
bb1 and bb2.)
I think it at least possible in theory, and there is a simple case:

Yes, that looks fine to me. One of the LLVM passes might optimize that CFG or put it into some canonical form, but that CFG looks fine to me.
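For illustration, here is one way C source could lead to a block shared by two branches, as bb4 is in the diagram. This is only a sketch: the function `classify`, its labels, and the block names in the comments are hypothetical, and a real frontend may insert extra blocks (for example for the return value) when compiling without optimizations.

```c
/* Hypothetical example: `shared` plays the role of bb4 and is reachable
 * from both inner conditions (bb1 and bb2); `join` plays the role of bb6. */
int classify(int i) {
    int r;
    if (i < 0) {            /* bb0 */
        if (i % 2 == 0) {   /* bb1 */
            r = 3;          /* bb3 */
            goto join;
        }
        goto shared;
    } else {
        if (i % 2 == 0) {   /* bb2 */
            goto shared;
        }
        r = 5;              /* bb5 */
        goto join;
    }
shared:
    r = 4;                  /* bb4: two predecessors */
join:
    return r;               /* bb6 */
}
```

With structured `if`/`else` alone, as in the original `foo`, each inner branch gets its own blocks, so a shared middle block would normally appear only after a block-merging optimization or when the source uses `goto` as above.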

-- John T.

1. It seems that there is ONLY ONE ReturnInst(and NO InvokeInst) for such
llvm IR; is it legal to add other *ReturnInst*s when transforming?

An LLVM function can have multiple ReturnInsts as long as each one
terminates a basic block. There is a transform (UnifyExitNodes, IIRC) that
will take a function with multiple ReturnInsts and create one with a single
ReturnInst. Having a single ReturnInst (exit node) simplifies other
analyses.

Thanks so much, John; especially for pointing out 'UnifyExitNodes' pass!

2. Is it possible for a frontend to generate a function whose CFG is
something like:

        bb0
       /   \
     bb1   bb2
    /   \ /   \
  bb3   bb4   bb5
    \    |    /
     \   |   /
      \  |  /
        bb6

(In this case, if I understand correctly, bb4 is control dependent on both
bb1 and bb2.)
I think it at least possible in theory, and there is a simple case:

Yes, that looks fine to me. One of the LLVM passes might optimize that
CFG or put it into some canonical form, but that CFG looks fine to me.

-- John T.

Got it!

And can I say: as long as a restriction is not stated explicitly in the LLVM
Language Reference, there are generally no restrictions on the IR that
frontends/transformations may generate (provided, of course, that it passes
the verifier)?

As far as I know, all such restrictions are (or should be) documented in the LLVM Language Reference Manual. That said, you should not generate irreducible CFGs (e.g., a CFG with a branch into the middle of a loop). Those don't play well with compiler analyses. :)

-- John T.
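As a sketch of what "a branch into the middle of a loop" can look like at the C level, the hypothetical function below uses `goto` to enter the loop body without passing through the loop condition. The loop then has two entry points, which is the usual definition of an irreducible CFG. The function name and the arithmetic are made up for illustration.

```c
/* Hypothetical example: when skip_first is nonzero, control enters the
 * loop at `middle`, bypassing the loop condition. The cycle therefore
 * has two entries (the condition check and `middle`). */
int sum_with_second_entry(int n, int skip_first) {
    int i = 0;
    int total = 0;
    if (skip_first)
        goto middle;        /* second entry into the loop */
    while (i < n) {         /* normal entry: the loop condition */
        total += 10;
middle:
        total += i;
        i++;
    }
    return total;
}
```

This is legal C and the resulting IR should pass the verifier, but loop-oriented analyses generally expect a single loop header, so a frontend or transformation is better off avoiding this shape.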