issues with InlineAsm class and #APP/#NOAPP

When gcc emits compiler-generated assembly code, it does not wrap it in #APP/#NOAPP.

In my case, I'm creating inline assembly IR as part of the compilation process (not user supplied).

These are for compiler generated stubs.

So I'm seeing these #APP/#NOAPP wrappers, which are meant for user inline assembly.
Since I'm generating a lot of inline assembly and each line is enclosed by this pair, the stubs are hard to read, and the gcc-generated stubs do not have them. Yes, I could buffer the lines and generate one long string, but that long string would still have the wrapper, and that's already more work.

It's not hard for me to disable the wrappers by extending our Mips-derived version of MCAsmInfo, but it seems like this should be solved more generally for everyone. I'm generating a whole stub and not miscellaneous asm code, so I can do this easily, but in other cases it would not be possible if things were inserted in the middle of other code.

One way would be to add a field to InlineAsm which says whether the compiler or the user has requested the inline asm.

Then this information would also need to be preserved when the IR is lowered, so that the two cases can be treated differently when the assembly is emitted.
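
To illustrate the idea, here is a minimal standalone sketch. It is not LLVM code: `AsmOrigin`, `AsmBlob` and `printAsm` are hypothetical names invented for this example, and the real InlineAsm and MCAsmInfo classes have no such field today. It only models a record that remembers who requested the asm and a printer that adds the wrapper pair for user-written asm only.

```cpp
#include <cassert>
#include <string>

// Hypothetical model, not LLVM's InlineAsm: records who requested the asm.
enum class AsmOrigin { User, Compiler };

struct AsmBlob {
  std::string Text;  // assembly text, each line terminated by '\n'
  AsmOrigin Origin;  // whether the user or the compiler asked for it
};

// Print one blob. Only user-written asm gets the #APP/#NOAPP pair;
// compiler-generated asm is passed through unchanged.
std::string printAsm(const AsmBlob &B) {
  if (B.Origin == AsmOrigin::Compiler)
    return B.Text;
  return "#APP\n" + B.Text + "#NOAPP\n";
}
```

With this model, a stub line such as "mtc1 $4,$f12" tagged `AsmOrigin::Compiler` comes out bare, while the same text tagged `AsmOrigin::User` is still enclosed by the pair.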

Compiler-generated inline assembly looks odd. What is it that prevents
the llvm backend from printing the assembly you need for the stubs?

There are a lot of issues.

For one, the function I'm compiling is a mips16 function but the stubs being created are mips32 functions.

This looks similar to Thumb vs. 32-bit ARM. Wouldn't a similar solution
work for it?

Cheers,
Rafael

I'm sure there are other ways to do this. Akira and I discussed this at great length and decided
to do this with IR because it seemed much cleaner and had lots of advantages.

We also wanted the stubs to be real functions to llvm. That allows them to participate properly
in optimization of various levels (including LTO). They can even be inlined. There are other
planned optimizations that would not work if they were not legitimate functions.

We already needed to implement the ability, which gcc has, to have mips16 and mips32 functions in the same source module, so with that capability, this other work was nearly trivial.
This pass I wrote is very easy to understand if you understand the underlying issues it is
addressing. ARM does not have this ability to compile both thumb1 and ARM functions in the same source file in LLVM, so they would not have had the option to do this in the IR.

This mips16 and mips32 floating point interoperability is very complicated and has many cases, compounded by messy issues with endianness and static/PIC. Mips16 has no floating point, but the ABI has things being passed and returned in floating point registers. I don't know what requirements ARM thumb1 has in this area. The Mips32 code that is interacting in this case
is not soft float; i.e. it's passing parameters and receiving return values in floating point
registers.

I'm almost done now and not going to revisit the whole design at this point.

The only annoying part is these #APP/#NOAPP wrappers.

We also wanted the stubs to be real functions to llvm. That allows them to
participate properly
in optimization of various levels (including LTO). They can even be inlined.
There are other
planned optimizations that would not work if they were not legitimate
functions.

I am not saying that the functions should not exist in the IL, just
that they should not be inline assembly.

Arm does not have this ability to compile both thumb1 and ARM
functions in the same source file in LLVM so they would not have had the
option to do this in the IR.

But llvm has to represent it when one IL file with thumb and one with
ARM code are linked. I think this was one of the motivations for the
current work on extending attributes.

I'm almost done now and not going to revisit the whole design at this point.

Which is a reason why design issues like this should be discussed
early. Was there an email about having the frontend emit inline
assembly early on? If so, sorry I missed it.

I still don't think we should have code where the FE emits inline
assembly. Anything that clang can know about mips16/mips32, llvm can
too. If llvm knows it, all that should be necessary is an attribute in
the function saying "this is a mips32 stub".

The only annoying part are these #APP/#NOAPP wrappers.

Cheers,
Rafael

Well, it would have been an enormous amount of email to explain this whole problem.
The way this interoperability works for mips16/mips32 is from gcc and there is no documentation on it; you have to read the code of the compiler and old emails from the gcc list to understand it.

One of the things I have assigned to myself as a bug is to finally document this whole
design of the mips16/32 floating point interoperability.

I do very often discuss many of the design issues I'm dealing with and elicit help and opinions from the LLVM list (and have received much very helpful advice that has simplified my work
greatly).

Unfortunately, this issue of assembly code is a minor detail at the end of this and we never
considered that to be important or I would have brought it up.

On the other hand, I'm far from an LLVM expert and I have to do things using the tools
and knowledge I have available to me and this path using IR was very clear to me and I'm very
happy how the design and coding came out. Maybe when you see the patch you can suggest
another way to do this piece of emitting the asm code. I can't guarantee that my management will allow me to spend more time redoing things that already work.

All of this code is isolated in a single module IR pass.

So it will be very easy to replace it with some other pass in the future if someone has the energy to redo it or if other features become available that make it easier. The pass as it stands also documents well the issues involved and how they can be simply addressed.

They have been adding features and optimizations to mips16 gcc for 10 years and lots of people have worked on it and I'm the only one working on it for LLVM. I did none of this
work on gcc for mips16 and only started seriously working on this port at the end of
last May. This mips16/mips32 floating point interoperability is just one of many things I have
to implement.

So I have to just move ahead.

Anyone with energy and interest is welcome to help.

Reed

I'm happy to send you my patch as it stands today.

It's not all cleaned up yet or thoroughly tested, but you can look at what I'm doing and maybe suggest some alternate paths, and if it's not a matter of redoing everything, I would consider making some changes.

Here is a sample stub:

Consider this line of code:

extern float fpff(float);

We have no idea if this is a mips16 or mips32 function.

As I scan the IR, if I'm in a Mips16 function and see a call to this function, then I need to generate a stub.

The mips16 function code, which contains, say:
x = fpff(y)

is compiled in a form of soft float and is not going to change at all in order to interoperate with mips32.

At link time, if the linker sees a stub as shown below, it will change the call to fpff in the mips16 code to be a call to the stub.

The stub has three tasks:
1) First, copy any integer argument registers that would have been floating point registers (if soft float for the mips16 code were not in effect). In this case, the integer argument register R4 needs to be copied to floating point register F12.
2) Make the actual call to fpff.
3) When fpff returns, move any floating point registers that are used to return values to where soft float would have mapped them. In this case, F0 must be copied to integer return register R2.

So this helper stub will work whether in reality fpff is compiled as mips16 or mips32.

There are similar issues for mips32 code calling mips16 code.
Return types that can be returned in registers include float, double, _Complex float and _Complex double.

Parameter signatures of the form below need to be remapped:

float
double
float, double
float, float
double, double
double, float
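
As a rough sketch of how a pass could recognize these signatures, the function below maps a parameter list to one of the six forms above, or to an empty string when no remapping is needed. The `Ty` enum and `fpParamSignature` are hypothetical names for this example. It also assumes that only the first two parameters matter, and only while they are floating point; that rule is an assumption of the sketch and is not stated in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified view of a parameter type for this sketch.
enum class Ty { Float, Double, Other };

// Return the leading floating point signature ("float", "double, float", ...)
// or "" when the first parameter is not floating point.
// Assumption: at most the first two parameters are considered, and scanning
// stops at the first parameter that is not float or double.
std::string fpParamSignature(const std::vector<Ty> &Params) {
  std::string Sig;
  for (std::size_t I = 0; I < Params.size() && I < 2; ++I) {
    if (Params[I] == Ty::Other)
      break;
    if (!Sig.empty())
      Sig += ", ";
    Sig += (Params[I] == Ty::Float) ? "float" : "double";
  }
  return Sig;
}
```

For `extern float fpff(float);` the parameter list is a single `Ty::Float`, which yields "float", the first entry in the list above.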

  .section .mips16.call.fp.fpff,"ax",@progbits
  .align 2
  .set nomips16
  .set nomicromips
  .ent __call_stub_fp_fpff
  .type __call_stub_fp_fpff, @function
__call_stub_fp_fpff:
  mtc1 $4,$f12
  move $18,$31
  jal fpff
  mfc1 $2,$f0
  jr $18
  .size __call_stub_fp_fpff, .-__call_stub_fp_fpff
  .end __call_stub_fp_fpff
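
For illustration, the text of the stub above could be assembled from the callee name along these lines. This is only a sketch: `makeFloatCallStub` is a hypothetical helper, it covers just the one case shown (a single float argument and a float return value), and it is not the code from the patch.

```cpp
#include <cassert>
#include <string>

// Build the text of a call stub for a callee that takes one float and
// returns a float, following the layout of the sample stub for fpff.
// Hypothetical helper for illustration only.
std::string makeFloatCallStub(const std::string &Callee) {
  const std::string Stub = "__call_stub_fp_" + Callee;
  std::string S;
  S += "  .section .mips16.call.fp." + Callee + ",\"ax\",@progbits\n";
  S += "  .align 2\n";
  S += "  .set nomips16\n";     // the stub itself is mips32 code
  S += "  .set nomicromips\n";
  S += "  .ent " + Stub + "\n";
  S += "  .type " + Stub + ", @function\n";
  S += Stub + ":\n";
  S += "  mtc1 $4,$f12\n";      // task 1: integer arg register -> FP arg register
  S += "  move $18,$31\n";      // save the return address across the call
  S += "  jal " + Callee + "\n"; // task 2: the actual call
  S += "  mfc1 $2,$f0\n";       // task 3: FP return register -> integer return register
  S += "  jr $18\n";            // return through the saved address
  S += "  .size " + Stub + ", .-" + Stub + "\n";
  S += "  .end " + Stub + "\n";
  return S;
}
```

Calling `makeFloatCallStub("fpff")` reproduces the listing above line for line; the other signatures would need different register moves, which this sketch does not attempt.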

I think that it's possible to do these stubs with clever calling conventions.

I have to add one calling convention anyway to handle the returns properly, since those use helper functions with a different calling convention.

I have not added a new calling convention yet, so when I do and understand better how that works, I'll revisit some things to see if it's possible to do these stubs that way.