[RFC] Queries for LLVM version

This has actually come up in the context of a JIT, but I think that
having the functionality in general could be useful.

Currently, there does not appear to be an API in LLVM to query for
LLVM version information. In the particular context where this came
up, LLVM is used as a shared library and various functionality (and
bug fixes) used by the JIT is available in various LLVM versions. So
it would be quite convenient to be able to dynamically determine the
version that happens to be loaded.

Honestly, I am not completely clear on what the best place for
something like this would be, but the following seems like a natural
choice:

llvm::VersionPrinter in lib/Support/CommandLine.cpp already queries
this data, so it might make sense for it to expose the following APIs
(as part of VersionPrinter, accessed through the instance):
llvm::cl::getVersionMajor()
llvm::cl::getVersionMinor()
llvm::cl::getVersionPatch()
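
A minimal sketch of what such accessors might look like, assuming they are compiled into the LLVM shared library itself. The `sketch` namespace and the fallback values are hypothetical; the macros are the ones LLVM already generates in llvm/Config/llvm-config.h.

```cpp
// Hypothetical sketch only: these functions are not an existing LLVM API.

// Fallback values so the sketch compiles without LLVM headers. In a real
// build they would come from llvm/Config/llvm-config.h.
#ifndef LLVM_VERSION_MAJOR
#define LLVM_VERSION_MAJOR 4
#define LLVM_VERSION_MINOR 0
#define LLVM_VERSION_PATCH 0
#endif

namespace sketch {
// These must be defined out of line inside the library. If they were inline
// in a header, a client would see the version of the headers it was compiled
// against rather than the version of the library loaded at run time.
unsigned getVersionMajor() { return LLVM_VERSION_MAJOR; }
unsigned getVersionMinor() { return LLVM_VERSION_MINOR; }
unsigned getVersionPatch() { return LLVM_VERSION_PATCH; }
} // namespace sketch
```

A JIT that loads LLVM dynamically could then call sketch::getVersionMajor() and friends after loading, instead of relying on the macros it was built with.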

Hi,

I'm also interested in querying this at runtime. Have there been any
patches submitted for this yet?

Thanks,
Tim

I am certainly willing to write up a patch to do this. I was hoping to get a few more responses/buy-in from the community.

And honestly, I’d be interested to know if this can be back-ported to at least 4.0 if it gets accepted.

Now that I see that there’s more interest in this than just my own, I’ll put up a patch on Phabricator.

My *possible* use case is for identifying the version of llvm used to
compile shaders for AMD gpus. This information would be used with an
on-disk cache of compiled shaders allowing old cache items to be
ignored if llvm is upgraded.

One downside of this is that there is no way to know if distros backport
fixes to llvm. If the version is not bumped, then a buggy cached
shader would continue to be used even after the distro makes the fix.

Cool, thanks!

This is exactly why we don’t use this kind of versioning: it is fragile/unreliable.

So Mehdi,

you’re not in favour of providing this functionality then? Or just not in favour of that use case?

I’m not against the functionality, but I’m not sure the API is the right one for the use-case.

The use-case (object cache) makes perfect sense, but I wouldn’t use this API.
I’d like to use a const char *getLLVMVersion() API that would be documented as returning an opaque string representing the LLVM version.
The reason to return an opaque string is to discourage (rather than encourage…) users from parsing it. It could return something like “LLVM 4.0.0svn(r12345)”.
Using this as part of your cache key makes the cache more robust to swapping one build of a given compiler version for another.
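
To illustrate the idea, here is a rough sketch of a cache check that treats the version string as opaque. The `CacheEntry` type and `isUsable` function are hypothetical, and the version strings are just values in the format suggested above; getLLVMVersion() itself is only a proposal at this point.

```cpp
#include <string>

namespace sketch {
// Hypothetical on-disk cache record: the compiled shader plus the opaque
// version string of the LLVM build that produced it.
struct CacheEntry {
  std::string LLVMVersion;
  std::string CompiledShader;
};

// The version string is never parsed, only compared for exact equality.
// Any difference (release number, revision, local patch tag) invalidates
// the entry.
bool isUsable(const CacheEntry &Entry, const std::string &LoadedVersion) {
  return Entry.LLVMVersion == LoadedVersion;
}
} // namespace sketch
```

With this scheme an entry stored under “LLVM 4.0.0svn(r12345)” would be ignored by a build reporting “LLVM 4.0.0svn(r12346)”, even though major, minor and patch are identical.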

But ultimately, if I had to build such a caching system*, I’d likely do it upstream in LLVM and design the API such that it would “just work” for my shader driver and could be maintained/improved by the community and reused by other users.

Would it make sense to:

  1. Provide the integer Major/Minor/Patch APIs for use cases where the user wants to check that the loaded version is at least some version that contains a fix they’re after
  2. Provide a string API for the caching use case you outlined

The major, minor and patch levels only work for "released" versions. If,
for some reason, you then update to a version of LLVM that isn't a true
release, what are those values? The "next" one, the "previous" one,
something else? And what happens if you then pick up the next 583 commits
that came in overnight, which just so happen to "break" compatibility,
or make some local changes to implement an intrinsic function, fix a linker
bug, or whatever it is that you need to do to improve/fix your product?

Nobody knows when and how these numbers reflect the details of your current
release.

Providing a string, which is not just some integer values and can change to
"anything", based on for example a sha1 in git or a revision number in svn,
is a much more reliable way to know EXACTLY whether it's the same source code
(and thus a compatible or incompatible "version"). It either matches, or it
doesn't. No two versions have the same identification,
regardless of whether some distribution updated LLVM with their own fixes,
whether a vendor that has their own backend uses a new version of their
backend code [assuming it's part of the LLVM repo, at least - if it's not,
perhaps they should reconsider that], etc, etc.

So the general consensus then is that it only makes sense to provide the string API? This would allow the user to determine that two versions of LLVM are identical/different. However no API should be provided that would allow the user to query whether the version is “at least/at most” some version or “between” two versions?

I hope I am not out of line by asking how providing these major/minor/patch versions in an API is fundamentally different to providing them as macros. I imagine users of these macros guard their code with these macros so they don’t use something known to be missing or buggy. So I don’t see how doing something similar at runtime in a JIT context is so fundamentally flawed. As a concrete example that motivated the RFC:

  • there was a bug in our back end that we fixed for the 3.9 release

  • a package that uses LLVM as a JIT has a workaround for that bug that hurts performance

  • the package decides to use the workaround or not based on the version macros

  • the package maintainers would prefer to be able to do this check at run-time so if a more recent version is loaded, the workaround is not used
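
The run-time check described in the bullets above could be sketched roughly as follows. The `Version` struct and both helper functions are hypothetical; in practice the loaded version would come from whatever query API ends up being provided.

```cpp
#include <tuple>

namespace sketch {
// Hypothetical holder for the numbers a run-time version query would return.
struct Version {
  unsigned Major, Minor, Patch;
};

// Lexicographic comparison on (Major, Minor, Patch).
bool isAtLeast(const Version &Loaded, const Version &Wanted) {
  return std::tie(Loaded.Major, Loaded.Minor, Loaded.Patch) >=
         std::tie(Wanted.Major, Wanted.Minor, Wanted.Patch);
}

// The back-end bug was fixed for the 3.9 release, so the performance-hurting
// workaround is only needed when an older library is loaded.
bool needsWorkaround(const Version &Loaded) {
  return !isAtLeast(Loaded, Version{3, 9, 0});
}
} // namespace sketch
```

This only answers the question correctly for upstream releases; a vendor branch that reports 3.8 but carries the fix, or reports 3.9 without it, would be misclassified.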

That was only my answer for the caching use-case; you can treat it as orthogonal to the question of whether or not we should add the proposed API here.

I agree with Mats that everything you’re describing above would be broken by linking against a non-vanilla release.
Many people branch LLVM at different points, and the upstream release numbers are not meaningful there.

That said, I sympathize with the use-case, and in the absence of a better solution I wouldn’t be against adding this.