Opinions on conditional dependencies on proprietary tools

During review on https://reviews.llvm.org/D121727, which added conditional dependency on NVIDIA’s proprietary assembler ptxas to NVPTX back-end tests, @thakis raised a concern about relying on proprietary tools and that this should be discussed in a wider forum.

If anyone has thoughts/ideas on the subject, please chime in.

Here’s a partial recap of the key points of the conversation on the review tracker.

@thakis (https://reviews.llvm.org/D121727#3479832):

It looks very weird to me to have LLVM’s test suite support calling some proprietary tool. I can see how this is useful, but imho it doesn’t belong into LLVM itself.

This is both for practical and other reasons.

From a practical point of view, if we rely on this tool for testing in LLVM, it means we can’t run LLVM’s tests on platforms that that tool doesn’t run on.

Independent of practical considerations, LLVM tries to be an open-source, standalone toolchain development suite. Making it depend on proprietary programs, even just optionally and at the test level, works against both these goals.

How do others feel about this?

@pogo59 summarized the rationale for using this approach:

This feels like (maybe is?) supporting a target whose assembler happens not to be generally distributed, but which LLVM still supports as a target. For those environments/bots that actually have the assembler installed, having that extra level of validation seems worthwhile.

@thakis (https://reviews.llvm.org/D121727#3491089)

If we had used the strategy here in the clang-cl project, I feel we would’ve ended up in a much less useful place. Now we have (cross-platform!) PDB writers and dumpers and whatnot in llvm. If we had (optionally) relied on proprietary tools like this here does, that wouldn’t have happened.

Specifically in case of the dependency on ptxas, it’s highly unlikely that we’ll ever have an open-source replacement for NVIDIA’s proprietary assembled for their GPUs. NVIDIA so far has not opened ISA for any of their GPUs and so far there’s no indication that they will consider doing it in the future.

IMO, being able to test that LLVM generates valid assembly (and ptxas is the source of the ground truth on whether that assembly is valid or not) beats not having those tests and discovering issues only after the broken code has been committed to the source tree and picked up by the bots. Considering that this dependency is conditional and does not affect the developers who do not enable it explicitly, I do not see any practical downsides in this specific case.

As things stand, I believe that @thakis’ concern that it would discourage development of open-source replacement of ptxas is not applicable here. We do not have a viable path forward towards such goal. Even assuming that we would be able to implement a replacement, we’d still need to test LLVM’s compatibility with ptxas because even in the best case LLVM will always lag behind NVIDIA’s own tools in terms of support for new GPU variants and we will need to rely on ptxas to target those new GPUs.

That said, It would still be good to reach a consensus on the dependency on proprietary tools in general and document it.

FWIW, our testing for GPUs is problematic for many reasons. We should have buildbots that execute on GPUs. They will have to rely on vendor tools. I don’t see that changing any time soon. Even if we would implement ptxas, we still need vendor tools (and drivers) for the next step, so it is unclear to me if we really want to wait for a fully open toolchain until we stop testing GPU code (generation) outside the project only.

Agreed, and we do have the bots (e.g clang-cuda-t4) for running CUDA tests on three major GPU architectures. However, it’s only a subset of external dependencies.

For syntax checking we don’t need a GPU, just some tools from the SDK.

I agree with @thakis that development of an open-source replacement tools should be prioritized over using proprietary tools, but I don’t think we should disallow proprietary tools either. Developing a compiler (ptxas is an optimizing assembler for multiple GPU architectures) for a GPU without available documentation is not feasible, IMO, especially considering that we’d have to repeat it for every new GPU generation which tends to come every couple of years.

I think for regression tests specifically, we want to ensure that the experience developers see locally is the same as what the buildbots are testing, to the largest extent possible. Having the semantics of the test change based on what tools are installed locally is confusing. (The issue isn’t really that the tool is proprietary; the issue is that it’s not in the LLVM tree, and it’s not in the list of tools we expect to be installed in the “Getting Started” guide.)

That isn’t to say we shouldn’t have buildbots that run GPU tests, but I think it belongs elsewhere.

I agree about not having magically changing behavior and did argue against such behavior in the early revision of the patch.

This is not an issue here, AFAICT. Behavior changes depending on what the user specified explicitly during the build configuration. It’s not any different from enabling/disabling certain back-ends and having some tests running or not depending on that.

For the users who do not configure their build with ptxas enabled (and installed), there are no visible changes to the build and test. Users who do configure their builds to enable ptxas use in tests, do get extra testing done. This will eventually be used on CUDA buildbot where I will have CUDA SDK installed and will rerun LLVM tests with the additional coverage provided by these tests. However, having these tests conditionally available in the LLVM itself has the benefit that we will be able to catch errors before landing the changes.

Hmmmm disagree. The whole point of buildbots is to test things you don’t have locally. IMO, ptxas is just one more of these, just like most people don’t have Windows (with its entire set of proprietary tools).

Took another look. So this extra testing only runs if the user explicitly specifies -DPTXAS_EXECUTABLE to CMake, or sets the environment variable LLVM_PTXAS_EXECUTABLE? And there isn’t any automatic detection?

So the point of this would be to detect changes to the output that accidentally break ptxas. Given that the point of the NVPTX backend is specifically to interface with ptxas, I guess that makes sense. Not sure how much benefit you get out of this vs. just running some generic test-suite on the buildbot, though.

Maybe leave a short note in each test so it’s obvious to developers why their test is failing on the buildbot where it doesn’t fail locally. (And the ptxas run should not be treated as a substitute for writing reasonable FileCheck checks.)

The point of the regression tests, specifically, is to ensure that that compiler generates a known output for particular set of inputs. We try hard to make that mapping independent of the host for almost all tests.

What I really don’t want to see is a situation where some aspect of LLVM IR changes, and we have a bunch of regression tests that are impossible to update because they only run on specific hosts.

The way the current ptxas-enabled tests work, the assembler text goes through the usual FileCheck verification, and then if ptxas is enabled, it also runs. I agree ptxas should never be the only verification, and my understanding is that the folks working in that area also agree.

1 Like

I am less concerned about depending on a proprietary tool for some bots (as someone mentioned we have CUDA bots that require the proprietary CUDA drivers installed) but rather what we do when we hit issues with such tool.

For example what if a perfectly valid change to instCombine triggers a crash in ptxas in the test suite reported by a bot? Who is responsible for investigating and root causing this: the test suite owner / GPU developers or the author of the instCombine change?
If we’re just exposing a bug in ptxas that crashes a benchmark in the test-suite: reverting the change can be a temporary solution of course, but what is the path forward for the LLVM contributor?
It is in general a problem with any third-party dependencies, even though if for the non-proprietary ones we can try to patch them ourselves (which may be not practical anyway).

1 Like

Bots are supposed to have owners who should be willing to help out with issues that occur only on their bots. Is that not how it works?

There is a similar situation with regards to MLIR building for a FPGA target like XILINX. The final step from MLIR to FPGA netlist is proprietary.
So, I would probably tend to allow an option to use proprietary tools, but favor open source tools if they are available.

Yes, and yes. We already do a fairly decent job checking that LLVM generates what we, developers, expect to be a valid PTX and that is not going to go away. If anything, once ptxas test finds a problem, that specific problem would be a primary candidate for converting it into a FileCheck test and make the standard test set more robust.

Sure: but while a bot owner for PowerPC can help debugging a miscompile because they have access to the hardware, there is more to it here when you’re actually dependent of a buggy proprietary tool, the bot owner won’t tell you much more than “ptxas is crashing, it does not like this (otherwise valid) PTX”.

This is not different from the current state of affairs – we’re already depending on ptxas and other proprietary tools on the CUDA bots. E.g. we can get exactly the same problem now if a CUDA test-suite test would tickle something wrong in LLVM which would make ptxas fail.

Specifically in case of NVPTX and ptxas tests, it would make me, the owner of CUDA bots, extremely curious about what triggered the failure. If there’s a bug, it is very likely that it will bite my users. For what it’s worth, historically, we (Google) found most NVPTX and ptxas bugs on our own code base.

If it indeed turns out NVIDIA’s tool fault, I’ll file a bug with NVIDIA and would work on figuring out how to avoid triggering the bug, or implement a workaround in LLVM. The sad reality is that we do need to be able to work with ptxas versions that are already out there and will not get fixed.

This scenario actually makes it even more important to catch such issues early, before they hit the end users, who would likely encounter it on a much larger code, compared to LLVM tests, and that would make it much harder to reduce and debug.

1 Like

I agree that we should strive for making tests independent of the host (installed proprietary tools). Every introduction of a new proprietary tool must be prudent. The issue is that they are prone to lure new tests checking the wrong layer (action at a distance) and causing contribution barrier for others.

In this particular usage, for the NVPTX target, I agree that there are cases (test coverage gap) with no good replacement for additionally running the proprietary tool ptxas. The use seems fine if the reviewers pays extra attention on new usage (e.g. whether the ptxas invocation is necessary? does it check too much? how likely an updated ptxas will cause the test to adjust? could it cause routine trouble to people contributing general codegen components?). The bot maintainers need to be willing to help deal with issues only on their bots.

1 Like

In my humble opinion, the tests and the code that depends on a proprietary tool should live outside of the repo. If the absence of such a tool prevents the user from running all the test case available on the repo then it will cause issues for many users. For example a linux distribution might rely on the test cases to verify their pre-built package for the LLVM and such tools can easily break their workflows.

Given that I’m wearing most of the hats @MaskRay mentioned, I can address some of these points.

  • whether the ptxas invocation is necessary?
    For ptxas most of the tests are essentially pass/fail syntax checks.
    I would say that they would be essential for the tests where we specifically check generation of particularly hairy instructions (e.g. CodeGen/NVPTX/wmma.py). That said, any PTX produced by successful LLVM compilation (perhaps, modulo inline assembly) must be syntactically valid, so it would be beneficial to run it on all llc tests in NVPTX codegen, so we can cover wider variety of instructions.

  • does it check too much?
    ptxas is primarily used as pass/fail syntax checker. We do not need the code to make sense or to work, just to pass syntax checks.

  • how likely an updated ptxas will cause the test to adjust?
    Rarely, if ever. ptxas is backward-compatible and accepts older PTX syntax.
    It’s possible that at some point NVIDIA will deprecate old PTX versions, but they have not done it so far and do currently support really old versions that predate NVPTX back-end in LLVM.
    Most of the relevant changes will be the new tests for the new PTX versions introduced by new CUDA SDK versions, once we implement their support in clang and LLVM.

  • could it cause routine trouble to people contributing general codegen components?
    Unlikely. We’ve dealt with the old issues uncovered by these tests. Most likely scenario where the issues would pop up is addition of new NTPTX instructions or variants of already existing instructions and the authors would be expected to run these tests before landing the changes.

  • The bot maintainers need to be willing to help deal with issues only on their bots.
    I already do that for CUDA bots, so yes, I am willing to help.

1 Like

Please see above. These tests are conditional on the user explicitly enabling the tool during build configuration. Users who did not opt-in are not affected.

1 Like

Thanks for the reply. I just randomly picked some questions but appreciate that you took time to even answer them individually:)

I just realized the patch author asavonic is also the one who spent significant time on improving lit for the conditional statement support. I know very little about NVPTX but taking a look at D121727 ([NVPTX] Integrate ptxas to LIT tests) the additional RUN lines seem useful. I support the current form of D121727.