flag_unit_at_a_time and pass scheduling in llvm-gcc

In llvm-backend.cpp I see:

    if (optimize > 1) {
      if (flag_inline_trees > 1) // respect -fno-inline-functions
        PM->add(createFunctionInliningPass()); // Inline small functions
      if (flag_unit_at_a_time && !lang_hooks.flag_no_builtin())
        PM->add(createSimplifyLibCallsPass()); // Library Call Optimizations

      if (optimize > 2)
        PM->add(createArgumentPromotionPass()); // Scalarize uninlined fn args
    }

Shouldn't createFunctionInliningPass and createArgumentPromotionPass only be
called if flag_unit_at_a_time is true? As far as I can see flag_unit_at_a_time
is used to control whether inter-procedural/whole-module passes are scheduled.

Thanks,

Duncan.

You can do inlining even when flag_unit_at_a_time is off, and one can enable unit-at-a-time without enabling any optimizations. The unit-at-a-time flag is not meant to select optimization passes, though it may influence the selection.

Originally, this flag instructed gcc to parse the entire source file before producing code. I am told that gcc originally worked on one statement at a time (stmt->parse->optimize->codegen->next-stmt). Later it was enhanced to work on a function at a time. The next logical step was to work on a source file at a time. IIRC, the flag was required because the parsed trees produced by some language front ends caused a huge amount of memory pressure during code generation, making unit-at-a-time unsuitable for some languages.

Everything is right except for your last sentence or two :)
The real reason for flag_unit_at_a_time was that some
programs/libraries (glibc in particular) required the top level ASM
statements to be output in the same order and place they appear in the
source file (they were using it to change sections, etc). The
original unit_at_a_time mode did not do this, so we needed a flag to
turn it off in order not to break glibc.
At some point we gave up hope that they would change their ways and
implemented tracking the order of the top level asm statements and
outputting them relative to where they appear in the original source
file.

Hi Devang,

> You can do inlining even when flag_unit_at_a_time is off. And one can
> enable unit-at-a-time without enabling any optimizations. The
> unit-at-a-time is not meant to select optimization passes, though it may
> influence selection.

this flag is used quite a bit in llvm-backend.cpp, for example:

    if (flag_unit_at_a_time) {
      PM->add(createGlobalOptimizerPass()); // Optimize out global vars
      PM->add(createGlobalDCEPass()); // Remove unused fns and globs
      PM->add(createIPConstantPropagationPass()); // IP Constant Propagation
      PM->add(createDeadArgEliminationPass()); // Dead argument elimination
    }

I thought I understood why but it seems that I don't :)

Ciao,

Duncan.

IMO, we should avoid using flag_unit_at_a_time here.

Hi Devang,

>> this flag is used quite a bit in llvm-backend.cpp, for example:
>>
>>     if (flag_unit_at_a_time) {
>>       PM->add(createGlobalOptimizerPass()); // Optimize out global vars
>>       PM->add(createGlobalDCEPass()); // Remove unused fns and globs
>>       PM->add(createIPConstantPropagationPass()); // IP Constant Propagation
>>       PM->add(createDeadArgEliminationPass()); // Dead argument elimination
>>     }
>>
>> I thought I understood why but it seems that I don't :)
>
> IMO, we should avoid using flag_unit_at_a_time here.

given DannyB's explanation that this flag exists in gcc so that glibc
works properly in spite of abusing ASM, perhaps this logic in llvm-backend
also exists to ensure that glibc works?

Ciao,

Duncan.

Duncan Sands wrote:

> given DannyB's explanation that this flag exists in gcc so that glibc
> works properly in spite of abusing ASM, perhaps this logic in llvm-backend
> also exists to ensure that glibc works?

Would changing handling of flag_unit_at_a_time solve PR2143, and allow
glibc to be compiled by llvm-gcc?

Best regards,
--Edwin

> given DannyB's explanation that this flag exists in gcc so that glibc
> works properly in spite of abusing ASM, perhaps this logic in llvm-backend
> also exists to ensure that glibc works?

I have no idea whether this is the case or not.

I think it would be reasonable to turn off some of the more aggressive IPO xforms when -fno-unit-at-a-time is set. That flag basically indicates that the code is doing something fishy, and it isn't in wide use, so I don't see a big problem with that.
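To make the suggestion concrete, here is a rough, self-contained sketch of how the scheduling could be gated. It is not the llvm-backend.cpp code: the Flags struct and schedulePasses are hypothetical, the passes are represented by plain strings instead of real PassManager calls, and the choice to treat argument promotion as one of the "more aggressive" transformations is purely an assumption for illustration. The relative order of the two groups is also arbitrary here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical mirror of the GCC globals mentioned in this thread.
struct Flags {
  int optimize;             // -O level
  int flag_inline_trees;    // > 1 means -finline-functions is in effect
  bool flag_unit_at_a_time; // false when -fno-unit-at-a-time is given
};

// Returns the names of the passes that would be scheduled, in order.
// Strings stand in for the real create*Pass() calls.
std::vector<std::string> schedulePasses(const Flags &f) {
  assert(f.optimize >= 0 && "optimization level cannot be negative");
  std::vector<std::string> passes;

  // Whole-module cleanups, gated as in the snippet quoted earlier.
  if (f.flag_unit_at_a_time) {
    passes.push_back("GlobalOptimizer");
    passes.push_back("GlobalDCE");
    passes.push_back("IPConstantPropagation");
    passes.push_back("DeadArgElimination");
  }

  if (f.optimize > 1) {
    // Inlining stays independent of unit-at-a-time.
    if (f.flag_inline_trees > 1)
      passes.push_back("FunctionInlining");

    // ASSUMPTION: argument promotion counts as an aggressive IPO
    // transformation, so it is skipped under -fno-unit-at-a-time.
    if (f.optimize > 2 && f.flag_unit_at_a_time)
      passes.push_back("ArgumentPromotion");
  }
  return passes;
}
```

With this sketch, -O3 -fno-unit-at-a-time would still inline but would skip both the global cleanups and argument promotion.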

Note that currently we can't support the glibc bug. The issue is that we don't keep track of the relative positions of module-level inline asm and functions. Instead, we aggregate all module-level asm together and emit it as a single blob.

-Chris

GCC actually does this in a very simple way.
We have flag_no_toplevel_reorder, and in this mode we simply record the
order in which we saw functions, module-level asm, and module-level vars
as a single monotonically increasing number.
The numbers get attached to the associated nodes (i.e. cgraph nodes,
varpool nodes, and asmpool nodes).
We guarantee in this mode that we will output all the
vars/functions/etc. that originally appeared
(i.e. we do no IPA opts, etc.; we just expand the functions through the
backend and perform backend opts).
When we go to output, we just sort the stuff back according to the
number and output it.
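
For anyone who wants to picture the scheme, a minimal sketch follows. It is not GCC's code: TopLevelItem, TranslationUnit, record, groupByKind and emit are all hypothetical names, and the strings stand in for whatever the backend would actually emit. It only illustrates the idea of stamping each top-level entity with an increasing number and sorting on that number at output time.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for cgraph, varpool and asm nodes.
enum class Kind { Function, Variable, Asm };

struct TopLevelItem {
  Kind kind;
  std::string text; // what would be emitted for this item
  unsigned order;   // position in which the front end saw the item
};

class TranslationUnit {
public:
  // Record an item and stamp it with the next sequence number.
  void record(Kind kind, const std::string &text) {
    items_.push_back(TopLevelItem{kind, text, nextOrder_++});
  }

  // Simulate a later phase that groups items by kind and so loses
  // the source order in the container itself.
  void groupByKind() {
    std::stable_sort(items_.begin(), items_.end(),
                     [](const TopLevelItem &a, const TopLevelItem &b) {
                       return a.kind < b.kind;
                     });
  }

  // Emit everything in source order by sorting on the recorded number.
  std::vector<std::string> emit() const {
    std::vector<TopLevelItem> sorted = items_;
    std::sort(sorted.begin(), sorted.end(),
              [](const TopLevelItem &a, const TopLevelItem &b) {
                return a.order < b.order;
              });
    std::vector<std::string> out;
    for (const TopLevelItem &item : sorted)
      out.push_back(item.text);
    // In this mode nothing is dropped: every recorded item is emitted.
    assert(out.size() == items_.size());
    return out;
  }

private:
  std::vector<TopLevelItem> items_;
  unsigned nextOrder_ = 0;
};
```

The point of the sketch is that intermediate phases are free to shuffle the containers, because the recorded number is enough to restore the original interleaving of asm, functions and variables when emitting.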