Reducing overhead of Verifier on complex data

I have an IR file that takes an unacceptably long time to optimize and generate code for.

I am using LLVM 3.5.1 for now.

If I run

opt -time-passes -std-compile-opts -o opt.bc my.ll

opt takes about 3-4 minutes, only to report that it spent about 25 seconds optimizing the code.

After running perf, it appears that much of the time opt does not account for is spent in Verifier::visitGlobalVariable(). This routine traverses complex initializer expressions looking for bitcasts between address spaces. No such bitcasts occur in my code - I use only the default address space.
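To illustrate why this traversal gets expensive on shared initializer graphs, here is a minimal sketch (not LLVM's actual source; Node and countWorklistPushes are illustrative names) of a worklist walk that pushes operands eagerly and filters duplicates only when they are popped. On a graph where many constants reference the same targets, the worklist fills up with duplicates and churns heap memory:

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Illustrative stand-in for an llvm::Constant: a node that may reference
// other constants, with heavy sharing as in the interlinked
// @xxx.Std / @yyy.Std initializers above.
struct Node {
  std::vector<const Node *> Operands;
};

// Sketch of the problematic pattern: duplicates are pushed eagerly and
// filtered only on pop, so shared subgraphs inflate the worklist.
std::size_t countWorklistPushes(const Node *Root) {
  std::vector<const Node *> Worklist{Root};
  std::unordered_set<const Node *> Visited;
  std::size_t Pushes = 1;
  while (!Worklist.empty()) {
    const Node *N = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(N).second)
      continue; // already checked; the push was wasted work
    // (The real check for address-space-changing bitcasts would go here.)
    for (const Node *Op : N->Operands) {
      Worklist.push_back(Op);
      ++Pushes;
    }
  }
  return Pushes;
}
```

Even though each node is visited only once, every edge causes a push, so a densely shared graph of N constants with E cross-references does E pushes instead of N.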

My IR has lots of complex interlinked data structures, e.g.:

@xxx.Std = constant [24 x i64] [i64 5665, i64 ptrtoint ([13 x i8]* @25342 to i64), i64 72, i64 ptrtoint ([2360 x i64]* @yyy.Std to i64), i64 ptrtoint ([4 x i64]* @25343 to i64), i64 ptrtoint ([11 x i8]* @25344 to i64), i64 24, i64 -1, i64 ptrtoint ([4 x i64]* @25345 to i64), i64 ptrtoint ([6 x i8]* @25346 to i64), i64 48, i64 -1, i64 ptrtoint ([2 x i64]* @25348 to i64), i64 ptrtoint ([10 x i8]* @25349 to i64), i64 0, i64 -1, i64 ptrtoint ([2 x i64]* @25351 to i64), i64 ptrtoint ([4 x i8]* @25352 to i64), i64 0, i64 -1, i64 0, i64 0, i64 0, i64 0]

@yyy.Std likely contains a reference to @xxx.Std.

These data structures document the layout of types and variables in the program in ways that runtime functions can easily use.

If I hand-edit my IR to remove all function definitions, replacing them with equivalent declarations, but otherwise keep all type and data definitions the same, then opt takes almost as long, yet happily reports that it was able to optimize my code in 0.2 seconds. Not bad, given that there is no code. :)

My question: what can I do about this?

  • Does it make sense to disable the verifier in production?

  • Can I recode my data declarations in some way to reduce the impact? I was planning on changing the format of the data structures; the planned changes would remove the ptrtoint expressions. Would doing so help?

  • Would it help to reduce the amount of other constant data in the IR? Although LLVM itself collapses duplicate constant data, I could easily do so myself. Would doing so help?

(Note: the interlinked data structures do not contain duplicate data, so any attempt at manually reducing the duplicate data would have zero impact on the linked structures.)

My first suggestion would be to improve the code in the verifier that’s problematic for you. I suspect you could make that piece of code far faster, and the patch would be easy to get accepted upstream. Two particular suggestions:

  • Filter common elements before adding them to the worklist. If you use a SmallSetVector rather than the two data structures currently used, this will happen for you.

  • Increase the ‘small size’ of the SmallX data structures to something reasonable like 32. This will get the algorithm running entirely in stack memory for your example rather than allocating for each visited global.

After those two, I suspect you’ll see this function disappear from your profile.
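The first suggestion amounts to testing set membership before pushing, which is what a set-backed vector such as LLVM's SmallSetVector provides. A minimal sketch of that fix (again with illustrative names, using std containers rather than LLVM ADTs):

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Illustrative constant-graph node, same shape as a shared initializer.
struct Node {
  std::vector<const Node *> Operands;
};

// Sketch of the suggested fix: check membership *before* pushing, as a
// SmallSetVector (a set paired with a vector) does for you. Each node now
// enters the worklist at most once, so the walk's memory stays
// proportional to the number of distinct constants, not the edge count.
std::size_t countWorklistPushes(const Node *Root) {
  std::vector<const Node *> Worklist;
  std::unordered_set<const Node *> Seen;
  std::size_t Pushes = 0;
  if (Seen.insert(Root).second) {
    Worklist.push_back(Root);
    ++Pushes;
  }
  while (!Worklist.empty()) {
    const Node *N = Worklist.back();
    Worklist.pop_back();
    // (The real address-space bitcast check would go here.)
    for (const Node *Op : N->Operands)
      if (Seen.insert(Op).second) { // filter duplicates before pushing
        Worklist.push_back(Op);
        ++Pushes;
      }
  }
  return Pushes;
}
```

With the second suggestion applied on top (a small-size of 32 instead of the defaults), a worklist bounded by the number of distinct constants would typically fit in the inline stack buffer and never touch the heap at all.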