icecream - what should /proc/cpuinfo contain on remote machines?

Greetings, I’m the maintainer of icecream, and I’m trying to solve a bug some of my users are seeing: when using clang 4.0, their build fails because the file /proc/cpuinfo does not exist. I’m trying to figure out what the right fix is. See https://github.com/icecc/icecream/issues/176
From what we can tell, this behavior was introduced by https://reviews.llvm.org/D25564.

A summary for those who don’t want to read the above: icecream is a distributed build tool that allows you to use other networked computers to speed up your builds. The local build machine where you run “make” ships the compiler to remote machines which then run the compiler in a chroot (this ensures that if the remote has a different version of clang installed we do not use that).

We see three options:

1. Place an empty /proc/cpuinfo in the chroot. This seems to work and it is easy. From what we can tell this file is only used at link time, and icecream does not link on remote machines. However, if clang starts to use /proc/cpuinfo for something else (for example, asserting that the contents are valid), we are broken again.

2. When building the compiler package, add /proc/cpuinfo from whatever machine is building the environment. The contents may be bogus for the remote machine, but at least the file exists. This is easy to do, but it is a coincidence if the file is actually correct.

3. Copy /proc/cpuinfo into the chroot on each build machine. This is the most difficult to implement (and test), and I’m not sure whether it gains anything. Note in particular that icecream tries hard to use as many cores on the remote machines as possible without the user of the remote machine noticing a performance problem, so clang using /proc/cpuinfo to spawn additional threads is probably something we do not want.
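
For concreteness, the first and third options could look roughly like the sketch below. It is only an illustration: the helper name ensure_cpuinfo and its "empty"/"copy" modes are hypothetical and are not part of icecream, and the environment directory is whatever path the caller passes in.

```shell
#!/bin/sh
# Hypothetical helper (not icecream code): make sure <env>/proc/cpuinfo exists.
#   mode "empty": create a zero-length placeholder (first option)
#   mode "copy":  copy the current host's /proc/cpuinfo (third option)
ensure_cpuinfo() {
    env_dir=$1
    mode=${2:-empty}
    mkdir -p "$env_dir/proc" || return 1
    case $mode in
        empty)
            : > "$env_dir/proc/cpuinfo"
            ;;
        copy)
            if [ -r /proc/cpuinfo ]; then
                cat /proc/cpuinfo > "$env_dir/proc/cpuinfo"
            else
                # Host has no readable /proc/cpuinfo; fall back to a placeholder.
                : > "$env_dir/proc/cpuinfo"
            fi
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 2
            ;;
    esac
}
```

Calling ensure_cpuinfo on the unpacked environment directory before the chroot step would cover either option. The second option would instead add the file once, when the tarball is built.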

I’m leaning toward the first because it is easy, but others have different opinions. After looking at all the responses, I’ve concluded that so far nobody knows enough about what clang is doing (much less planning to do) to make a useful decision. Thus I’m reaching out to the clang community: what considerations are we not aware of, and is there a good reason to prefer any one of these plans?

This seems like an LLVM bug. I’d rather LLVM didn’t look in /proc, but I can’t find an alternative. In any case, we shouldn’t print to stderr for a successful build. Silently falling back to std::thread::hardware_concurrency() seems fine, which is what the code does.

How exactly are you calling clang?

Joerg

We call clang something like:

On the machine creating the environment:
mkdir -p tmp/usr/bin
cd tmp
cp "$(which clang)" usr/bin
# a lot of magic to figure out which shared libraries clang needs and copy them, along with a bunch of other files clang needs
tar -zcf ../MyEnvironment.tgz .

Then transfer this to the remote machine where we do:
mkdir /var/icecc/.../environment # (actually we make a temp directory of some sort)
cd /var/icecc/.../environment
tar -xf /path/to/MyEnvironment.tgz
chroot .
/usr/bin/clang -x c - -o file.o -fpreprocessed -pipe -Xclang -main-file-name -Xclang [filename] -no-canonical-prefixes -fdebug-prefix-map=%s/=/ [something - only if dwarfFissionEnabled] [whatever other arguments the user supplied except for a few special ones we strip]

I'm probably missing something from the above; it is tricky to follow the entire chain of arguments.

The basic idea is to create a compiler package somewhere (this might be the local machine, but if some of the remotes have a different CPU it gets complex). Then we transfer this environment to some remote machines. Then we run the preprocessor on the local build machine, transfer the preprocessed file to a remote machine to compile (fed to clang's stdin), and transfer the result back to the local machine. This allows running make with a large -j value for much faster builds (I often run -j80 on a 6-core machine).
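
The preprocess-locally, compile-from-stdin split described above can be tried on a single machine with something like the following. This is a simplified sketch, not icecream's actual command line: the function name compile_via_stdin and the CC variable are made up for illustration, the network transfer and chroot are left out, and only the -x c - and -fpreprocessed flags are taken from the invocation shown earlier.

```shell
#!/bin/sh
# Hypothetical sketch of the local-preprocess / remote-compile split.
# Both steps run on one machine here; in icecream the second step would run
# inside the chroot on a remote host. CC is an assumption (defaults to clang).
CC=${CC:-clang}

compile_via_stdin() {
    src=$1
    obj=$2
    # Step 1 (local): run only the preprocessor.
    # Step 2 ("remote"): compile the preprocessed text read from stdin,
    # telling the compiler the input has already been preprocessed.
    "$CC" -E "$src" | "$CC" -x c - -fpreprocessed -c -o "$obj"
}
```

For example, compile_via_stdin foo.c foo.o should leave an object file equivalent to an ordinary compile of foo.c, as long as the same compiler is used for both steps.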

This seems like an LLVM bug. I'd rather LLVM didn't look in /proc, but I
can't find an alternative. In any case, we shouldn't print to stderr for a
successful build. Silently falling back to std::thread::hardware_concurrency()
seems fine, which is what the code does.

Agreed, we should at least return -1 if we can't find /proc/cpuinfo (which
will result in a fallback to std::thread::hardware_concurrency()). The
question I have is whether it should be silent or not - my understanding
was that this file should exist on x86_64 Linux.

But I am also wondering how you hit this, since reading through the github
thread you linked indicated that this is not LTO or ThinLTO. Oh, I see now
where it is being used to initialize the value of an option...if this is a
file that will not always exist for Linux x86_64 then it needs to fall back
silently.

Teresa

Looks like the consensus is that this is a bug, not intended behavior, so I wrote up bug 33008 on it. Thanks for the feedback.

I just realized, as Reid pointed out, that the code already does return -1 when we hit this situation (which should cause a fallback to std::thread::hardware_concurrency). This was fixed in https://reviews.llvm.org/D32032 (r300267).

So you would be getting an error message, but it should still fall back to the right behavior. Is it just the existence of the printed message that is causing you grief? It looks like you are actually getting a non-zero exit code. How is that happening?

Teresa
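
As a rough shell analogue of the fallback discussed above (return -1 quietly when /proc/cpuinfo is unavailable, then let the caller ask the OS instead), consider the sketch below. It is not LLVM's code, which is C++ and differs in what it computes. The function names are hypothetical, counting "processor" lines is a simplification chosen for illustration, and getconf _NPROCESSORS_ONLN stands in for std::thread::hardware_concurrency().

```shell
#!/bin/sh
# Illustrative analogue only; LLVM's real logic is C++ and differs in detail.
# Prints the number of "processor" entries in the given cpuinfo file, or -1
# (writing nothing to stderr) when the file is unreadable or has no entries.
cpuinfo_count() {
    file=${1:-/proc/cpuinfo}
    if [ ! -r "$file" ]; then
        echo -1
        return 0
    fi
    # grep -c exits non-zero when there are no matches but still prints 0.
    n=$(grep -c '^processor' "$file" || true)
    if [ "${n:-0}" -gt 0 ]; then
        echo "$n"
    else
        echo -1
    fi
}

# Caller: use the parsed count when it is valid, otherwise ask the OS.
thread_count() {
    n=$(cpuinfo_count "$1")
    if [ "$n" -lt 1 ]; then
        n=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    fi
    echo "$n"
}
```

The point of the split is that a missing file is an expected condition inside a chroot, so the helper reports it through its return value rather than through stderr or the exit status.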

Looks like that is from April, while clang 4.0 was released in March(?). My users tend to stick to released versions.

Ah, so I should just merge that fix into 4.0.1 then? Let me look into that.

Teresa