Hmm, but will this actually be better than making SourceLocations 64 bits wide? Assuming we use the side table unconditionally, and assuming that most SourceLocations are stored in exactly one place (not true with templates, but often the case otherwise), this means that each time we
store a SourceLocation, we’ll use up 96 bits of storage (32 bits for the SourceLocation and 64 bits for the table entry). That would be 50% worse than the simpler approach of making SourceLocations always be 64 bits wide. If we could use regular 32 bit source locations for smaller values, and
then switch to the side table once we get (say) half way through them, maybe that would be acceptable, but it would mean that we take a constant-factor memory usage (and, due to loss of locality, probably performance) hit for all translation units larger than some threshold. Also, we’d still
need full-width source locations for the spelling and expansion locations of the macro expansion SLocEntrys, so those are going to grow significantly larger – and if we have a large number of macro expansions (such as in the boost case), that’ll be a big cost.
Hmm… yes, that is true. I would expect the SLocEntrys to just be the 64 bit larger version (and the macro expensions would use those instead), so those are equal to the 64 bit size. Spending a bit (the ½ thing above) to switch between the two I think makes this a size regression for anything >1GB (I think we currently handle a 2GB TU). We’d end up taking a size-hit vs just using 64-bit source-locs somewhere between 1GB and 2GB, but I haven’t done the math yet.
I don’t think we’re at the point where we can answer this question yet – I don’t think we sufficiently understand the problem. From what I’ve seen so far, it still seems likely to me that the problem is that we’re allocating too much of the SourceLocation address space for macro expansions >
that don’t need locations to be allocated for them, either due to an outright bug or a pathological case in the boost code.
I’m not sure how we omit those, from my look at the SourceManager it seems that all of those would be necessary (I don’t have a sufficiently good idea of how that works perhaps).
I think it would be worth looking at a SourceManager dump from around the crash and analyzing what we’re spending the source location space on – is this a death by a million cuts situation, or are there some surprisingly large allocations in there? If the former, are there patterns? Are we
generating duplicate locations (eg, repeatedly expanding the same macros with the same arguments in a way we can reuse)? Are there lots of SLocEntries that are unreferenced by file locations (eg, macro expansions that ended up producing no tokens)? There may be some smart way we can
conditionally disable updateLocForMacroArgTokens (https://github.com/llvm/llvm-project/blob/master/clang/lib/Lex/TokenLexer.cpp#L1054) or make it lazy, so we don’t assign full location information to preprocessor-tokens that are formed during macro expansion but don’t result in any
real tokens – or if there isn’t a smart way to do that, maybe we can give boost a way of disabling location tracking for some of its macros.
I should be getting a reproducer that doesn’t have a lot of other stuff happening in the next day or so, so I’ll share that dumpwhen I can.
I think it would also be worth pursuing the 64-bit source locations option (making SourceLocation larger, with no side table) as a configure-time flag, and fixing up all the places where we make the 32-bit assumption. If nothing else, that’ll let us measure the memory and compile time
overhead of doubling our source location storage. I expect that the numbers we get back from that will make that option an immediate non-starter, but perhaps not – in particular, if we could store a SourceLocation as a (FileID, offset) pair, that would save us from doing binary searches to
ocate the FileID for a given SourceLocation, which we do sometimes spend a non-trivial amount of our time doing, and it’d be worth measuring whether we can improve compile time at the cost of memory usage by doing that.
That sounds like a really good starting point. Configure-time I think is the right place to do this. FileID + Offset would be nice (as you say, that search looks really expensive), though would be likely still 64 bits (32 for fileid + 32).
I’d also be interested in seeing whether we can add a flag to turn off tracking macro expansion locations entirely (so we’d only ever track spelling locations). That would result in some QoI loss, but it might also allow cases such as this boost example to work without too much effort from us.
It’d be interesting to see how much maintenance burden we think such a flag would add.
Hmm… I will have to think on that.