LLVM and Cygwin

I got the following error while compiling llvm and clang under cygwin.

```
/cygdrive/c/Users/brianherman/Desktop/llvm/llvm-3.3.src/Release+Asserts/lib/libLLVMMCJIT.a(SectionMemoryManager.o):SectionMemoryManager.cpp:(.text+0x3b): undefined reference to `__register_frame'
/cygdrive/c/Users/brianherman/Desktop/llvm/llvm-3.3.src/Release+Asserts/lib/libLLVMMCJIT.a(SectionMemoryManager.o):SectionMemoryManager.cpp:(.text+0x3b): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `__register_frame'
/usr/lib/gcc/x86_64-pc-cygwin/4.8.1/../../../../x86_64-pc-cygwin/bin/ld: /cygdrive/c/Users/brianherman/Desktop/llvm/llvm-3.3.src/Release+Asserts/lib/libLLVMMCJIT.a(SectionMemoryManager.o): bad reloc address 0x0 in section `.pdata'
collect2: error: ld returned 1 exit status
/cygdrive/c/Users/brianherman/Desktop/llvm/llvm-3.3.src/Makefile.rules:1530: recipe for target `/cygdrive/c/Users/brianherman/Desktop/llvm/llvm-3.3.src/Release+Asserts/bin/lli.exe' failed
make[2]: *** [/cygdrive/c/Users/brianherman/Desktop/llvm/llvm-3.3.src/Release+Asserts/bin/lli.exe] Error 1
make[2]: Leaving directory `/cygdrive/c/Users/brianherman/Desktop/llvm/llvm-3.3.src/tools/lli'
/cygdrive/c/Users/brianherman/Desktop/llvm/llvm-3.3.src/Makefile.rules:925: recipe for target `lli/.makeall' failed
make[1]: *** [lli/.makeall] Error 2
make[1]: Leaving directory `/cygdrive/c/Users/brianherman/Desktop/llvm/llvm-3.3.src/tools'
/cygdrive/c/Users/brianherman/Desktop/llvm/llvm-3.3.src/Makefile.rules:876: recipe for target `all' failed
make: *** [all] Error 1
```
I have no idea what that means.

Hi,

We have a benchmark where there are 128 MAD computations in a loop. (See the attached IR.) Creating SCEVs for these expressions takes a long time, making the compile time too long. E.g., running opt with just the “indvars” pass takes 45 seconds.

It seems that the majority of the time is spent in comparing the complexity of the expression operands to sort them.

I realize that the expression grows to be really large towards the end of the loop.

I don’t know all the uses of the built SCEV, but I imagine it won’t be very useful for such complex expressions. Yet it’s making the compile time much longer.

So I’m wondering if it would make sense to abort the creation of a SCEV when the expression gets really complex and large. Or is there any way to further optimize the performance of SCEV building for such cases?

Thanks in advance for any response.

Xiaoyi

MAD_test.ll (11.1 KB)

Hi,

We have a benchmark where there are 128 MAD computations in a loop. (See the attached IR.) Creating SCEVs for these expressions takes a long time, making the compile time too long. E.g., running opt with the “indvars” pass only takes 45 seconds.

It seems that the majority of the time is spent in comparing the complexity of the expression operands to sort them.

Why not just fix this then?

I assume the issue is that they all end up with the same length/complexity, so it both falls into the N^2 loop in GroupByComplexity, and the sort itself takes a long time because it compares operand by operand.

This seems easy to fix by having GroupByComplexity calculate a very cheap hash prior to sorting when the number of operands is large, and then using that hash before recursively comparing element by element to distinguish the cases.

(You could also create/update this hash as you go, but that seems like it would be more work.)

Unless of course, they really are all the same expression, in which case, this is harder :)

Thank you very much for your reply.

Do you mean calculate the hash based on element SCEV pointers? But according to the comments before GroupByComplexity():

```
/// Note that we go take special precautions to ensure that we get deterministic
/// results from this routine. In other words, we don't want the results of
/// this to depend on where the addresses of various SCEV objects happened to
/// land in memory.
```

Xiaoyi

Hi Brian,

I got the following error while compiling llvm and clang under cygwin.

/cygdrive/c/Users/brianherman/Desktop/llvm/llvm-3.3.src/Release+Asserts/lib/libLLVMMCJIT.a(SectionMemoryManager.o):SectionMemoryManager.cpp:(.text+0x3b):
undefined reference to `__register_frame'

__register_frame is used to enable the debugger (gdb) to debug JIT'd code. It is a function provided by libgcc, more precisely by libgcc_eh. Is it in your copy?

```
$ nm libgcc_eh.a | grep register_fram
0000000000001960 T __deregister_frame
0000000000001950 T __deregister_frame_info
0000000000001830 T __deregister_frame_info_bases
0000000000001750 T __register_frame
0000000000001740 T __register_frame_info
00000000000016b0 T __register_frame_info_bases
0000000000001800 T __register_frame_info_table
0000000000001780 T __register_frame_info_table_bases
0000000000001810 T __register_frame_table
```

Ciao, Duncan.

I get this when I type:

```
brianherman@windows-8-[REDACTED] ~
$ nm libgcc_eh.a | grep register_frame
nm: 'libgcc_eh.a': No such file

brianherman@windows-8-[REDACTED] ~
$ nm libgcc_eh.a | grep register_fram
nm: 'libgcc_eh.a': No such file
```

```
$ nm libgcc_eh.a | grep register_frame
nm: 'libgcc_eh.a': No such file
```

I think he meant to find out where libgcc_eh.a lives under Cygwin and
execute the command on that file. It should be somewhere amongst the
stuff installed with gcc, but the exact location can vary quite a bit.

Cheers.

Tim.

```
brianherman@windows-8-doesn't rock /lib/gcc/i686-pc-cygwin/4.7.3
$ nm libgcc_eh.a | grep register_frame
000011b0 T ___deregister_frame
000011a0 T ___deregister_frame_info
000010d0 T ___deregister_frame_info_bases
00000fe0 T ___register_frame
00000fb0 T ___register_frame_info
00000f40 T ___register_frame_info_bases
00001070 T ___register_frame_info_table
00001010 T ___register_frame_info_table_bases
000010a0 T ___register_frame_table
```

Could it be an issue with my path?

Nice test case. I tried printing the SCEV… oops. I haven’t seen a case this bad, but I know you’re not the first to run into this problem.

There are two steps in GroupByComplexity, sorting (std::sort) and grouping (N^2).

The sort calls SCEVComplexityCompare::compare(), which can make multiple recursive calls for nodes with multiple operands. This looks like it could be a disaster for expressions that are not strictly trees: exponential in the size of the DAG.

If you just have a very large tree with many similar looking subexpressions, then I’m not sure what to do except cut it into reasonable subtrees.

AFAICT, it’s not just sorting that’s a problem but also grouping. Also, I think the sheer depth of the createSCEV recursion is itself a problem.
I don’t see any reason not to limit the size of SCEV expressions, but I also don’t have a brilliant idea for how to do it at the moment (other than the obvious depth cutoff).

-Andy

Hi Andy,

Thanks very much for looking into the problem.

In this particular test case, it seems most of the time is spent in the sorting, not the grouping.

Later, I realized that it seems in this test case most of the expressions to be compared have different length. I tried the following change in compare() when the LHS and RHS’s types are the same:

No, based on the properties of them (i.e. type, etc). It will be entirely deterministic.

You have two cases: either all these SCEVs are really the same, in which case this will do nothing; or they are subtly different, but right now it's comparing 128 operands to find out.

The hash helps with the second case, but not the first.

Yes, this is why I suggested computing some deterministic "complexity hash" on the way, and caching it in the SCEV. It would not help if they were all the same, but if they were different only at the end, you wouldn't end up comparing every operand to get there. If they are all the same, you are right that a cut-off is the only reasonable answer.

Or calculate "complexity" in a way that does not require operand-by-operand comparison (i.e. compute a "complexity" number instead of a hash, as you build the SCEV). It's just trying to get a canonical sort here. This would at least make the sort fast; grouping can't be made linear unless you are willing to trust the hash :)

Hi Andy,

Thanks very much for looking into the problem.

In this particular test case, it seems most of the time is spent in the sorting, not the grouping.

Later, I realized that it seems in this test case most of the expressions to be compared have different length. I tried the following change in compare() when the LHS and RHS’s types are the same:

```diff
===================================================================
--- lib/Analysis/ScalarEvolution.cpp	(revision 187379)
+++ lib/Analysis/ScalarEvolution.cpp	(working copy)
@@ -585,6 +585,9 @@
     case scAddExpr:
     case scMulExpr:
     case scSMaxExpr:
     case scUMaxExpr: {
       const SCEVNAryExpr *LC = cast<SCEVNAryExpr>(LHS);
       const SCEVNAryExpr *RC = cast<SCEVNAryExpr>(RHS);
 
       // Lexicographically compare n-ary expressions.
       unsigned LNumOps = LC->getNumOperands(), RNumOps = RC->getNumOperands();
+      if (LNumOps != RNumOps) {
+        return (int)LNumOps - (int)RNumOps;
+      }
       for (unsigned i = 0; i != LNumOps; ++i) {
         if (i >= RNumOps)
           return 1;
```
And the compile time is cut down from 45s to 1s.

Committed r187475. Thanks!
-Andy

That would be ideal. I wish it were obvious to me how to implement it.

Xiaoyi’s simple fix handled this scenario and was totally consistent with the current implementation. It seems we’re not yet running into the issue of visiting all paths in a DAG. I don’t want to discourage a more general fix/redesign though. We could probably recover more compile time in many cases.

-Andy