Dealing with a corrupted /proc/self/exe link

Hi all,

I am in charge of the controlled introduction of clang into
our builds at my workplace. Since all our tools must run from
a ClearCase view for automatic dependency tracking, we have been
bitten by a Linux bug, and readlink("/proc/self/exe", ...) gives
nonsensical results. So we need to introduce a configure option
for disallowing this method of executable discovery (the other
one works well).

Here is the patch:

Index: autoconf/configure.ac

> Hi all,
>
> I am in charge of the controlled introduction of clang into
> our builds at my workplace. Since all our tools must run from
> a ClearCase view for automatic dependency tracking, we have been
> bitten by a Linux bug, and readlink("/proc/self/exe", ...) gives
> nonsensical results. So we need to introduce a configure option
> for disallowing this method of executable discovery (the other
> one works well).

Interesting, can you describe the Linux bug? Are the kernel devs aware of it?

We often had reports about /proc/self/exe not working (and thus clang crashing)
in chrooted environments. It is possible to mount /proc into the chroot but this
seems to be missing from many setups. The code in LLVM that uses /proc/self/exe
returns an empty string on error which confuses clang.

I don't really like having an autoconf switch for this as long as you can determine
whether the result from /proc/self/exe is valid. When you're adding a fallback to
Path.inc anyways, why not just try reading /proc/self/exe first, and if it fails, use
your fallback? That would also fix the chroot problem.

- Ben
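
A minimal sketch of the try-first-then-fall-back idea suggested above, assuming a Linux system. The function names (`readProcSelfExe`, `resolveArgv0`, `findMainExecutable`) are hypothetical, and this is not the code in lib/Support/Unix/Path.inc. It only covers the case where readlink fails outright (for example, /proc is not mounted inside a chroot); it cannot tell whether a successfully returned path is actually useful.

```cpp
#include <climits>
#include <cstdlib>
#include <string>
#include <unistd.h>

// Hypothetical helper: read the /proc/self/exe symlink.
// Returns an empty string if the link cannot be read or is truncated.
static std::string readProcSelfExe() {
  char buf[PATH_MAX];
  // readlink does not NUL-terminate; it returns the number of bytes written.
  ssize_t len = readlink("/proc/self/exe", buf, sizeof(buf));
  if (len <= 0 || static_cast<size_t>(len) >= sizeof(buf))
    return std::string();
  return std::string(buf, static_cast<size_t>(len));
}

// Hypothetical fallback: canonicalize argv[0] with realpath.
// This does not search PATH, so a bare command name only resolves
// if it happens to exist relative to the current directory.
static std::string resolveArgv0(const char *argv0) {
  char resolved[PATH_MAX];
  if (argv0 && realpath(argv0, resolved))
    return std::string(resolved);
  return std::string();
}

// Try /proc/self/exe first; use the fallback only if that fails.
std::string findMainExecutable(const char *argv0) {
  std::string path = readProcSelfExe();
  if (!path.empty())
    return path;
  return resolveArgv0(argv0);
}
```

A caller would pass `argv[0]` from `main` and treat an empty result as "location unknown" rather than using it as a path.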

Benjamin Kramer wrote:

>> Hi all,
>>
>> I am in charge of the controlled introduction of clang into
>> our builds at my workplace. Since all our tools must run from
>> a ClearCase view for automatic dependency tracking, we have been
>> bitten by a Linux bug, and readlink("/proc/self/exe", ...) gives
>> nonsensical results. So we need to introduce a configure option
>> for disallowing this method of executable discovery (the other
>> one works well).

> Interesting, can you describe the Linux bug? Are the kernel devs aware of it?

It is fixed in newer RHEL kernels (>=6). As far as I know, it is a
ClearCase VFS-related bug: the reverse mapping from the real pathname
(inside the ClearCase backing store) to the logical pathname is not
performed.

Here is a bug report:
<http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6189256>

> We often had reports about /proc/self/exe not working (and thus clang crashing)
> in chrooted environments. It is possible to mount /proc into the chroot but this
> seems to be missing from many setups. The code in LLVM that uses /proc/self/exe
> returns an empty string on error which confuses clang.

I do not get an empty string; the returned path names a real file
(bytewise identical to the real thing):

$ cd <into a dynamic view>
$ cp /bin/ls .
$ ls -l /proc/self/exe
lrwxrwxrwx 1 ggreif ocs 0 Jul 13 21:27 /proc/self/exe -> /bin/ls
$ ./ls -l /proc/self/exe
lrwxrwxrwx 1 ggreif ocs 0 Jul 13 21:27 /proc/self/exe -> /vol/ocs_userviews25_13/ggreif-hc_stm-OCSnb28718.vws/.s/00056/800006ba4fdf647els

$ diff ./ls /vol/ocs_userviews25_13/ggreif-hc_stm-OCSnb28718.vws/.s/00056/800006ba4fdf647els
<no diffs>

Unfortunately, starting from that path to the clang executable, there is no useful
directory structure to be discovered :-(

> I don't really like having an autoconf switch for this as long as you can determine
> whether the result from /proc/self/exe is valid. When you're adding a fallback to
> Path.inc anyways, why not just try reading /proc/self/exe first, and if it fails, use
> your fallback? That would also fix the chroot problem.

This is not a chroot problem. As shown above, I do not get a valid clang path
to manipulate and discover include directories, etc.

The other method in lib/Support/Unix/Path.inc (i.e. dladdr, realpath) works.

I still maintain that I need the configure option.

Cheers,

  Gabor

PS: removed llvm-commits from the CC: list.

Sorry for being mean, but this is a workaround for a bug in the Linux kernel that was
fixed years ago and is only visible when using an obscure revision control system.

Also, it requires rebuilding LLVM, so the fix isn't even helpful to someone else who
hits the issue unless they research it first.

With this in mind I really don't see why this has to be in the public tree, requiring
additions to two build systems. Can't you just apply the one-line patch to Path.inc
locally?

- Ben

I agree, this patch as is doesn't belong in the tree...

However, I suspect that Clang already has the capability to solve this
problem for you.

For context, we run Clang in a distributed build environment not dissimilar
to the one you are describing, and for us as well /proc/self/exe does not
really help us locate the Clang binary. There is a switch available
(-no-canonical-prefixes) which in conjunction with some other things should
use the value of argv[0] in main to locate the clang binary, not
/proc/self/exe or anything else.

Can you describe why it is that Clang is reading /proc/self/exe? We might
be able to change that in a principled way to support numerous different
filesystem layouts and scenarios where its results are correct but not
helpful for locating executable-relative directory structures.

Chandler Carruth wrote:


> I agree, this patch as is doesn't belong in the tree...

Hi Chandler,

yes, the audience is rather narrow (i.e. 'us' :-)

> However, I suspect that Clang already has the capability to solve this
> problem for you.

Ok, good to hear.

> For context, we run Clang in a distributed build environment not
> dissimilar to the one you are describing, and for us as well
> /proc/self/exe does not really help us locate the Clang binary. There is
> a switch available (-no-canonical-prefixes) which in conjunction with
> some other things should use the value of argv[0] in main to locate the
> clang binary, not /proc/self/exe or anything else.

I shall read more on this in the code and experiment around a bit.
Is this a configure-time option, or a switch to clang? Clearly the former
would be better.

> Can you describe why it is that Clang is reading /proc/self/exe? We
> might be able to change that in a principled way to support numerous
> different filesystem layouts and scenarios where its results are correct
> but not helpful for locating executable-relative directory structures.

$ echo "int main(){return 0;}" > ttt.c
$ gdb Release+Asserts/bin/clang

Reading symbols from /home/ggreif/llvm/Release+Asserts/bin/clang...(no debugging symbols found)...done.

(gdb) b dladdr
Breakpoint 1 at 0x5d0d58

(gdb) run -c ttt.c
Starting program: /home/ggreif/llvm/Release+Asserts/bin/clang -c ttt.c
warning: no loadable sections found in added symbol-file system-supplied DSO at 0x2aaaaaaab000
[Thread debugging using libthread_db enabled]

Breakpoint 1, 0x0000003d61e01710 in dladdr () from /lib64/libdl.so.2
(gdb) bt
#0 0x0000003d61e01710 in dladdr () from /lib64/libdl.so.2
#1 0x00000000019d554d in llvm::sys::Path::GetMainExecutable(char const*, void*) ()
#2 0x00000000005d8882 in main ()

On your Linux system you would set the breakpoint at 'readlink', but you
would see the same backtrace.

Thanks for your ideas,

cheers,

  Gabor

Chandler Carruth wrote:
>> I agree, this patch as is doesn't belong in the tree...
>
> Hi Chandler,
>
> yes, the audience is rather narrow (i.e. 'us' :-)
>
>> However, I suspect that Clang already has the capability to solve this
>> problem for you.
>
> Ok, good to hear.
>
>> For context, we run Clang in a distributed build environment not
>> dissimilar to the one you are describing, and for us as well
>> /proc/self/exe does not really help us locate the Clang binary. There is
>> a switch available (-no-canonical-prefixes) which in conjunction with
>> some other things should use the value of argv[0] in main to locate the
>> clang binary, not /proc/self/exe or anything else.
>
> I shall read more on this in the code and experiment around a bit.
> Is this a configure-time option, or a switch to clang? Clearly the former
> would be better.

It's a flag to Clang. I really dislike configure switches, and generally
push for Clang to avoid them when at all possible. It makes both testing
and supporting users much easier.

In particular, as the only groups to truly need this behavior are build
systems which manage the file content trees specially, it seems reasonable
for those build systems to pass the appropriate flags to Clang.

I gave you the flag name above, so please give it a spin.

>> Can you describe why it is that Clang is reading /proc/self/exe? We
>> might be able to change that in a principled way to support numerous
>> different filesystem layouts and scenarios where its results are correct
>> but not helpful for locating executable-relative directory structures.
>
> $ echo "int main(){return 0;}" > ttt.c
> $ gdb Release+Asserts/bin/clang
>
> Reading symbols from /home/ggreif/llvm/Release+Asserts/bin/clang...(no debugging symbols found)...done.

Err, could you use a debug build please? =[ The information below doesn't
help much because...

> (gdb) b dladdr
> Breakpoint 1 at 0x5d0d58
>
> (gdb) run -c ttt.c
> Starting program: /home/ggreif/llvm/Release+Asserts/bin/clang -c ttt.c
> warning: no loadable sections found in added symbol-file system-supplied DSO at 0x2aaaaaaab000
> [Thread debugging using libthread_db enabled]
>
> Breakpoint 1, 0x0000003d61e01710 in dladdr () from /lib64/libdl.so.2
> (gdb) bt
> #0 0x0000003d61e01710 in dladdr () from /lib64/libdl.so.2
> #1 0x00000000019d554d in llvm::sys::Path::GetMainExecutable(char const*, void*) ()
> #2 0x00000000005d8882 in main ()

... main doesn't call GetMainExecutable. Inlining and a bunch of other
stuff has happened here.

Anyways, I know this code. You could probably find it yourself. If you add
line numbers to your build, you'll get a stack trace pointing you to
tools/driver/driver.cpp:56 here, where we call GetMainExecutable. If you
read lines 50 and 51, you'll see the logic I described where if
-no-canonical-prefixes is used, we instead trust argv[0] (spelled by a
different name, look at the caller to see the gory details).
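
A rough sketch of that decision, for readers who do not have the source at hand. This is not the contents of tools/driver/driver.cpp: the names `canonicalPath` and `driverPath` are hypothetical, and realpath on argv[0] merely stands in for the real GetMainExecutable lookup.

```cpp
#include <climits>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical stand-in for the canonicalizing lookup: resolve symlinks,
// "." and ".." in the given path, or return it unchanged if that fails.
static std::string canonicalPath(const char *argv0) {
  char resolved[PATH_MAX];
  return realpath(argv0, resolved) ? std::string(resolved)
                                   : std::string(argv0);
}

// Returns the path the driver should treat as its own location.
// With -no-canonical-prefixes on the command line, argv[0] is trusted as is.
std::string driverPath(int argc, const char *const *argv) {
  bool canonical = true;
  for (int i = 1; i < argc; ++i) {
    if (std::strcmp(argv[i], "-no-canonical-prefixes") == 0) {
      canonical = false;
      break;
    }
  }
  return canonical ? canonicalPath(argv[0]) : std::string(argv[0]);
}
```

With the flag present, a synthetic or symlinked path in argv[0] is kept, so directories found relative to it stay inside the tree the build system set up.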

But there are a *lot* of ways that Clang will misbehave when run in a
heavily symlinked (or equivalent synthetic VFS) tree unless you pass this
flag. That's why it exists in both Clang and GCC. Let me know if you still
see trouble when using it.