getIntrinsicID() optimization

Hi all,

While profiling I discovered that the Function::getIntrinsicID() method is called a lot, and every time it uses string comparison to recompute the ID number. As far as I know the name of an intrinsic function doesn’t change, so the ID could be determined just once at Function construction time.

I’ve attached a patch that does this and it appears to work for the code I tested it with. Could anyone look at potential caveats and if there are none commit it?



FastIntrinsicID.patch (1.54 KB)

Hi Nicolas,

Sorry for the delay. I don’t think that this is the right way to go. Making Function larger is an extremely expensive thing to do. There are use cases in LLVM where apps load large modules lazily and stream in functions on demand. The way this works currently is that the function objects are created eagerly, but their bodies are lazy. Increasing hte size of function can substantially penalize these cases.

Also, as Jeffrey points out, it is not safe with setName().

However, this would be a great thing to have. Instead of adding a new field, is there space in the “SubClassData” field in Value?


Hi Chris,

Function is currently 108 bytes large. Could 4 more bytes really be an
issue? Actually 2 should suffice. While I understand that some applications
value storage more than anything, many applications value compilation time
very highly. getIntrinsicID is called all over the place (isIntrinsic uses
it as well), and every single time it checks the function name. To me that
sounds a lot more dramatic than 2 bytes.

Anyway, using SubclassData could work. It's already being used for the
calling convention, which has values ranging from 0 to 68 (fitting in 7
bits), while intrinsic ID's would take 9 bits. So that would only just work.
However, the calling convention enums could be reordered to only take 3

I was wondering whether it would make sense to have an IntrinsicFunction
class, derived from Function. This way only the IntrinsicFunction would have
an intrinsicID field. I noticed that Function already has virtual methods so
getIntrinsicID could be another virtual method without adding any real
overhead. For Function it would simply return 0 while for IntrinsicFunction
it returns the intrinsicID field that was initialized in the constructor.



Yes. :slight_smile:

I think that fitting it into SubclassData would be best.

It would be fine to have a stateless IntrinsicFunction class (Similar to IntrinsicInst) but it wouldn’t work to add state there. We do need setName etc to work. The bytecode and .ll parsers have to play games when autoupgrading old .ll files, and we want the IR manipulation methods to be simple and consistent where possible. I completely agree that making getIntrinsicID() fast is a worthwhile goal, lets just not lose anything else in the process :slight_smile: