clangd/libclang: how to emulate other compilers?

Hey all,

how do clangd and other users of libclang handle situations where you
want to parse code that depends on a certain other compiler or compiler
environment? The most common scenario is embedded projects that rely on the
compiler's builtin defines and include paths to find the sysroot include paths
and such.

For KDevelop, which uses libclang, we have tried to build a sort of
emulation layer that initially yielded good results. The approach is as
follows:

1) We use the actual compiler that is used to compile a given project, e.g.
gcc, arm-none-eabi-gcc, ...

2) We take this compiler and query it for its builtin defines:
/usr/bin/gcc -xc++ -std=c++11 -dM -E - < /dev/null

3) And also query the include paths:
/usr/bin/gcc -xc++ -std=c++11 -v -E - < /dev/null

4) Then, for the libclang calls to clang_parseTranslationUnit2, we pass
`-nostdinc -nostdinc++` followed by the defines and includes we got from 2) and
3).
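
For illustration, steps 2) to 4) could be sketched roughly as below. This is Python rather than KDevelop's actual C++ code, and all function names are made up for this example. It assumes GCC's usual verbose output, where the search list ends with an `End of search list.` line, and it only covers the text parsing; actually running the compiler is left out.

```python
import re

# Matches one line of `gcc -dM -E` output: "#define NAME VALUE".
_DEFINE_RE = re.compile(r"#define\s+(\S+)\s*(.*)")


def parse_include_paths(verbose_output):
    """Collect the directories listed after '#include <...> search starts here:'."""
    paths = []
    collecting = False
    for line in verbose_output.splitlines():
        if line.startswith("#include <...> search starts here:"):
            collecting = True
        elif line.startswith("End of search list."):
            break
        elif collecting and line.strip():
            paths.append(line.strip())
    return paths


def parse_defines(dm_output):
    """Turn '#define NAME VALUE' lines into (name, value) pairs."""
    defines = []
    for line in dm_output.splitlines():
        match = _DEFINE_RE.match(line)
        if match:
            defines.append((match.group(1), match.group(2)))
    return defines


def build_clang_args(include_paths, defines):
    """Assemble the argument list described in step 4)."""
    args = ["-nostdinc", "-nostdinc++"]
    for name, value in defines:
        args.append(f"-D{name}={value}")
    for path in include_paths:
        args += ["-isystem", path]
    return args
```

The two inputs would come from running the commands in 2) and 3) (the `-v` output arrives on stderr), and the resulting list is what gets passed as command line arguments to clang_parseTranslationUnit2.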

Now, for simple things this actually worked quite well. But once you include a
file that heavily relies on the compiler, such as all the SIMD intrinsic
headers, you are easily drowning in parse errors. And once you have too many
parse errors, clang will just give up. We have tried to work around this via
compatibility headers such as [1], but it keeps breaking.

More recently, we also got bug reports where the user's system has clang 3,
which they use to compile the code, but they then download a KDevelop
AppImage built against libclang v5. Once again this easily
yields tons of parse errors when encountering system headers that are using
intrinsics specific to clang v3.

I am now thinking about removing the emulation layer described above. But then
it will be essentially impossible to work on a lot of embedded projects which
rely on the cross compiler defines and include paths...

So, once again - how do other users of libclang handle this scenario? What is
the plan for clangd in this regard?

Thanks

Eclipse CDT does something similar. For each unique collection of compiler command line arguments (i.e. for files that build with the same options) we add -E -P -v -dD and call the real compiler to get the list of all include paths and symbols used for a given parse, including both user and built-in ones. Works with both clang and gcc.
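
As a rough illustration of the "one query per unique set of options" idea (this is not CDT's code; the class name and the `probe` callback are hypothetical), the compiler output can be cached by compiler and argument list:

```python
class ScannerInfoCache:
    """Caches compiler probe output per (compiler, arguments) combination."""

    # Flags appended to the real command line, as described above.
    PROBE_FLAGS = ("-E", "-P", "-v", "-dD")

    def __init__(self, probe):
        # `probe(compiler, args)` is a caller-supplied function that runs the
        # real compiler and returns its output as text.
        self._probe = probe
        self._cache = {}

    def lookup(self, compiler, args):
        key = (compiler, tuple(args))
        if key not in self._cache:
            # Only files with a new option set trigger a compiler call.
            self._cache[key] = self._probe(compiler, key[1] + self.PROBE_FLAGS)
        return self._cache[key]
```

Files that share the same options then share a single compiler invocation.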

For compiler specifics, we cheat and add macros that work us through the parse errors. You likely lose info but at least it rounds up to a pretty good parse.

I'm at the very beginning of trying to figure out how we'd do something similar with clangd. This feature will be a must for users with embedded systems toolchains which are mainly based on gcc.

Doug.

Ah, so you also have something like this [1] which I forgot to reference in my
initial email:

https://github.com/KDE/kdevelop/blob/master/plugins/clang/duchain/gcc_compat.h

If so, can you point me to your version of that file? If you are also using
this approach, and it works for you, then I guess we should share resources
and both work on a common file.

Out of interest: Have you ever tried to parse old clang headers with a CDT
built against a newer clang?

I’m not sure I understand what you mean - do you mean the compiler has builtins that clang doesn’t provide and relies on their existence?

Generally, you’ll want to use the builtin defines and includes from clang (at the point at which you compiled libclang), but the standard library and so forth that the system is using. Clang should be able to find that given the right flags.

> I'm not sure I understand what you mean - do you mean the compiler has builtins that clang doesn't provide and relies on their existence?

Yes.

> Generally, you'll want to use the builtin defines and includes from clang (at the point at which you compiled libclang), but the standard library and so forth that the system is using. Clang should be able to find that given the right flags.

Well, no. The plan for a lot of us is to use clangd with projects that use gcc as the compiler. We need to be able to reach out to gcc to ask it what the built-ins are. We’ll then need to convince clang to parse in the same manner. Given all the variants of compilers that we need to be able to support in our various IDEs, it’s already something we’ve gotten quite used to.

From: Milian Wolff [mailto:mail@milianw.de]
Sent: Tuesday, April 17, 2018 6:20 PM
To: cfe-dev@lists.llvm.org; Doug Schaefer <dschaefer@blackberry.com>
Subject: Re: [cfe-dev] clangd/libclang: how to emulate other compilers?

> Ah, so you also have something like this [1] which I forgot to reference in my
> initial email:
>
> https://github.com/KDE/kdevelop/blob/master/plugins/clang/duchain/gcc_compat.h
>
> If so, can you point me to your version of that file? If you are also using this
> approach, and it works for you, then I guess we should share resources and both
> work on a common file.

Now putting them in an include file would have been smart. Unfortunately, we have them hardcoded in the CDT source and add them directly to scanner info.

If you have the CDT source checked out it's all handled in classes that implement IScannerExtensionConfiguration.

> Out of interest: Have you ever tried to parse old clang headers with a CDT built
> against a newer clang?

CDT has its own parsers. That's why we're looking at clangd to replace all that and leverage clang's parser. And it lets us future proof thanks to the LSP.

Coverity also does something similar for compatibility with different
Clang releases and distributions.

The most significant problems we run into are differences in builtin
functions (either signature changes or added/removed builtins) and
differences in recognized compiler options (rarely option syntax
changes, generally unrecognized options).

For builtins, we audit differences in the Builtins*.def files across
releases and distributions (assuming we can get access to the source for
the distribution) and define macros that, based on compiler
identification macros (__clang_major__, __clang_minor__,
__apple_build_version__, etc...) provide support for missing builtins or
incompatible builtin signatures.

For compiler options, we do similar audits of the (generated)
Options.inc header and have support for mapping unrecognized options to
recognized ones or just dropping them in hopes that the option doesn't
affect the recognized language dialect.
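
To make the option handling concrete, here is a minimal sketch of "map or drop". It is not Coverity's implementation; the function name is invented, and the option spellings in the usage note are deliberately fake placeholders rather than real clang flags.

```python
def translate_options(args, is_recognized, replacements):
    """Keep recognized options, map known-unrecognized ones, drop the rest.

    `is_recognized` is a predicate for options the parsing clang accepts.
    `replacements` maps an unrecognized option to a recognized equivalent.
    Returns (translated, dropped) so dropped options can be logged.
    """
    translated, dropped = [], []
    for arg in args:
        if is_recognized(arg):
            translated.append(arg)
        elif arg in replacements:
            translated.append(replacements[arg])
        else:
            # Dropped in the hope that it does not affect the language dialect.
            dropped.append(arg)
    return translated, dropped
```

For example, with `-fhypothetical-old` mapped to `-fhypothetical-new` and `-fhypothetical-unknown` absent from both tables, the first is rewritten and the second ends up in the dropped list.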

These audits are fairly straightforward to script for Clang.
Unfortunately, I'm not at liberty to share our scripts.

We've often wished for the ability to do similar audits for gcc, but gcc
doesn't define builtins in a consistent manner.

Tom.

> I'm not sure I understand what you mean - do you mean the compiler has
> builtins that clang doesn't provide and relies on their existence?

Take this example code:

#ifndef __arm__
#error unsupported platform
#endif

#include <foobar.h>

static_assert(sizeof(void*) == 4);

How can I parse this with libclang, such that it emulates my
arm-none-eabi-gcc?

- __arm__ should be defined, but not __x86_64__
- foobar.h should be found in the default include paths for the
  arm-none-eabi-gcc compiler, not in the default include paths of libclang
- it should be 32bit by default

Now, we can get there to some degree via -nostdinc and -nostdinc++. But once
you do that for all compilers, you get into nasty issues when you replace the
libclang builtin headers with headers from a different clang version or even a
different compiler like GCC. They will be compiler specific and not portable,
thus not parsable by libclang.

> Generally, you'll want to use the builtin defines and includes from clang
> (at the point at which you compiled libclang), but the standard library and
> so forth that the system is using. Clang should be able to find that given
> the right flags.

Can you tell us what the right flags would be? I just looked at the man page
again and found -nostdlibinc, which may resolve this partially - I'll check.

Cheers

Here is an example of the header problem described above:

```
$ cat test.cpp
#include <x86intrin.h>

int main() { return 0; }

$ gcc -v -E - < /dev/null
..
#include "..." search starts here:
#include <...> search starts here:
/usr/lib/gcc/x86_64-pc-linux-gnu/7.3.1/include
/usr/local/include
/usr/lib/gcc/x86_64-pc-linux-gnu/7.3.1/include-fixed
/usr/include

$ clang -nostdinc -xc++ -isystem /usr/lib/gcc/x86_64-pc-linux-gnu/7.3.1/include -isystem/usr/local/include -isystem/usr/lib/gcc/x86_64-pc-linux-gnu/7.3.1/include-fixed -isystem/usr/include test.cpp
In file included from test.cpp:1:
In file included from /usr/lib/gcc/x86_64-pc-linux-gnu/7.3.1/include/x86intrin.h:27:
/usr/lib/gcc/x86_64-pc-linux-gnu/7.3.1/include/ia32intrin.h:41:10: error: use of undeclared identifier '__builtin_ia32_bsrsi'
  return __builtin_ia32_bsrsi (__X);
..
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
```

I tried -nostdlibinc with the example above, but it doesn't seem to make any
difference... The reason seems to be that we explicitly add the path for the
GCC builtin headers to the system include path, and that then takes precedence
over the libclang provided ones. Now we could try to blacklist the paths for
the GCC builtin headers, but we'll need heuristics to find them. If I remove
"/usr/lib/gcc/x86_64-pc-linux-gnu/7.3.1/include" above, -nostdlibinc seems to
help.
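
One possible heuristic, sketched in Python purely for illustration: treat any search directory that normalizes to `.../lib/gcc/<target>/<version>/include` as GCC's builtin header directory and leave it out. The function name and the regular expression are assumptions based only on the paths shown in this thread; toolchains with a different layout would need other rules. It keeps `include-fixed`, since only removing the plain `include` directory was tried above.

```python
import posixpath
import re

# Assumed layout: <prefix>/lib/gcc/<target>/<version>/include
_GCC_BUILTIN_DIR = re.compile(r"/lib(?:32|64)?/gcc/[^/]+/[^/]+/include$")


def drop_gcc_builtin_dirs(include_paths):
    """Return the search paths without GCC's own builtin header directory."""
    kept = []
    for path in include_paths:
        # Normalize first so 'bin/../lib/gcc/...' style paths also match.
        if _GCC_BUILTIN_DIR.search(posixpath.normpath(path)):
            continue
        kept.append(path)
    return kept
```

The remaining paths would then be passed via -isystem together with -nostdlibinc, so that clang's own builtin headers stay in front.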

Bye

This does seem to work, and sounds better than what we did before. Can someone
chime in to say whether this is an acceptable approach?

I would still be interested in learning how other tools that use libclang or
clangd are handling this...

Thanks

Using clang -target arm-eabi seems to do the trick?

Again, there are multiple levels of builtin includes. If you want the right ones for a target platform, you’ll need to select that target platform - if that doesn’t work with clangd, we need to fix it :)

Clang ships with x86intrin.h - what’s the problem using clang’s version of it?

Do you actually want language extensions of GCC that clang does not also ship with support for?

From: cfe-dev [mailto:cfe-dev-bounces@lists.llvm.org] On Behalf Of Manuel Klimek via cfe-dev
Sent: Wednesday, April 18, 2018 6:26 AM
To: Milian Wolff <mail@milianw.de>
Cc: Clang Dev <cfe-dev@lists.llvm.org>
Subject: Re: [cfe-dev] clangd/libclang: how to emulate other compilers?

If you want to use intrinsics where they differ between compilers, you’ll still need to use clang’s builtin headers, otherwise clang will incorrectly parse, well, basically anything.

The problem is that there is no layering; builtin headers depend on each other and compiler internals somewhat randomly.

We’d like to better understand examples of differences in language extensions, as clang tries really hard to emulate all the different dialects out there :)

> Using clang -target arm-eabi seems to do the trick?

It doesn't for me:

$ arm-none-eabi-gcc -E -v - < /dev/null
..
Target: arm-none-eabi
..
#include <...> search starts here:
 /usr/lib/gcc/arm-none-eabi/7.3.0/include
 /usr/lib/gcc/arm-none-eabi/7.3.0/include-fixed
 /usr/lib/gcc/arm-none-eabi/7.3.0/../../../../arm-none-eabi/include
$ clang -target arm-none-eabi -E -v - < /dev/null
..
Target: arm-none--eabi
..
#include <...> search starts here:
 /usr/lib/clang/6.0.0/include

So we still have to specify the include paths. But if we pass
/usr/lib/gcc/arm-none-eabi/7.3.0/include as an include path, then clang will
use the x86intrin.h from there, which it won't grok - it's highly GCC specific.

<snip>

> Again, there are multiple levels of builtin includes. If you want the right
> ones for a target platform, you'll need to select that target platform - if
> that doesn't work with clangd, we need to fix it :)

I don't think that setting the target on clang itself is sufficient. We also
don't want the IDE user to configure clang manually to emulate a compiler. We
want him to select a compiler (or find that automatically from the build
system) and then configure clang for him.

One more example:

$ wget https://releases.linaro.org/components/toolchain/binaries/latest/arm-linux-gnueabihf/gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf.tar.xz

$ unp gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf.tar.xz

$ ./gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf/bin/arm-linux-gnueabihf-gcc -E -v - < /dev/null
..
Target: arm-linux-gnueabihf
..
 /home/milian/Downloads/gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf/bin/../lib/gcc/arm-linux-gnueabihf/7.2.1/include
 /home/milian/Downloads/gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf/bin/../lib/gcc/arm-linux-gnueabihf/7.2.1/include-fixed
 /home/milian/Downloads/gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf/bin/../lib/gcc/arm-linux-gnueabihf/7.2.1/../../../../arm-linux-gnueabihf/include
 /home/milian/Downloads/gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf/bin/../arm-linux-gnueabihf/libc/usr/include
..

$ touch ./gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf/bin/../arm-linux-gnueabihf/libc/usr/include/please_find_this.h

$ cat arm_test.cpp
#ifndef __arm__
#error unsupported platform
#endif

#include <arm_neon.h>
#include <please_find_this.h>

static_assert(sizeof(void*) == 4);

int main() { return 0; }

$ ./gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf/bin/arm-linux-gnueabihf-g++ arm_test.cpp

$ clang++ -target arm-linux-gnueabihf arm_test.cpp
#error "NEON support not enabled"
..
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.

So clearly, just specifying the target isn't enough. It's also a question of
the other defaults for the compiler in question...

I’m not sure what’s not working - can you show a code example that does not work with clang -target arm-eabi?

Unrelated but I think important for inclusiveness of the community: English has singular they (https://en.wikipedia.org/wiki/Singular_they), which is a nicely inclusive way to talk about indeterminate users :)

I do get that you don’t want folks to need to configure clang to emulate a compiler. Generally, clang tries to do that itself, and I’m told it’s very close to GCC and MSVC. For compilers that clang can emulate, the right way is usually to translate flags to the other compiler into flags that clang understands, like selecting the right target.

That said, Clang will not be able to support all language extensions random compilers have, which is why we’re interested in real world example code to better understand what wide spread differences there are.

> I'm not sure what's not working - can you show a code example that does not
> work with clang -target arm-eabi?

Please see the example based on the Linaro toolchain (arm_test.cpp) from my
previous mail.

TIL, thanks for the suggestion.

Right, the Linaro toolchain example from my previous mail (arm_test.cpp, which includes arm_neon.h) is such a case.

Hey all,

how does clangd or other users of the libclang handle situations

where

you

want to parse code that is dependent on a certain other compiler
or
compiler
environment? The most common scenario being embedded projects that

rely on

the
compiler-builtin defines and include paths to find the sysroot

include

paths
and such.

I’m not sure I understand what you mean - do you mean the compiler

has

builtins that clang doesn’t provide and relies on their existence?

Take this example code:

#ifndef __arm__
#error unsupported platform
#endif

#include <foobar.h>

static_assert(sizeof(void*) == 4);

How can I parse this with libclang, such that it emulates my
arm-none-eabi-
gcc?

  • arm should be defined, but not x86_64
  • foobar.h should be found in the default include paths for the
    arm-none-eabi-
    gcc compiler, not in the default include paths of libclang
  • it should be 32bit by default

Using clang -target arm-eabi seems to do the trick?

It doesn’t for me:

$ arm-none-eabi-gcc -E -v - < /dev/null
..
Target: arm-none-eabi
..

#include <...> search starts here:
/usr/lib/gcc/arm-none-eabi/7.3.0/include
/usr/lib/gcc/arm-none-eabi/7.3.0/include-fixed
/usr/lib/gcc/arm-none-eabi/7.3.0/../../../../arm-none-eabi/include

$ clang -target arm-none-eabi -E -v - < /dev/null
..
Target: arm-none--eabi
..

#include <...> search starts here:
/usr/lib/clang/6.0.0/include

So we still have to specify the include paths. But if we pass
/usr/lib/gcc/
arm-none-eabi/7.3.0/include as an include path, then clang will use the
x86intrin.h from there, which it won’t grog - it’s highly GCC specific.

I’m not sure what’s not working - can you show a code example that does not
work with clang -target arm-eabi?

Please see the example based on the linaro toolchain I have given futher
below.

Again, there are multiple levels of builtin includes. If you want the

right

ones for a target platform, you’ll need to select that target platform -

if

that doesn’t work with clangd, we need to fix it :slight_smile:

I don’t think that setting the target on clang itself is sufficient. We
also
don’t want the IDE user to configure clang manually to emulate a compiler.
We
want him to select a compiler (or find that automatically from the build
system) and then configure clang for him.

Unrelated but I think important for inclusiveness of the community: English
has singular they (https://en.wikipedia.org/wiki/Singular_they), which is a
nicely inclusive way to talk about indeterminate users :slight_smile:

TIL, thanks for the suggestion.

I do get that you don’t want folks to need to configure clang to emulate a
compiler. Generally, clang tries to do that itself, and I’m told it’s very
close to GCC and MSVC. For compilers that clang can emulate, the right way
is usually to translate the flags for the other compiler into flags that clang
understands, like selecting the right target.

That said, Clang will not be able to support all language extensions random
compilers have, which is why we’re interested in real world example code to
better understand what widespread differences there are.

Right, I have given such an example below:

Ah, sorry, I missed the neon part. So this works if we add -mcpu=cortex-a15. The problem is that the gcc cross compiler apparently has that configured at build time, and I’m not sure what the best way is to figure out the full settings for cpu / fpu and float-abi, but I think we’ll need to find some way to gather them from the underlying cross compiler.

OK, thanks. This brings us one more step closer towards emulation and should
be sufficient for the example I have given. So, we now end up with these
steps:

#1: query the target compiler
$ ./gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf/bin/arm-linux-gnueabihf-gcc \
    -xc++ -E -v -dM - < /dev/null

#2: parse the output for the target:
Target: arm-linux-gnueabihf

#3: parse the default march GCC argument from:
COLLECT_GCC_OPTIONS='-E' '-v' '-dM' '-march=armv7-a' '-mtune=cortex-a9' '-mfloat-abi=hard' '-mfpu=vfpv3-d16' '-mthumb' '-mtls-dialect=gnu'

#4: parse the include paths:
#include <...> search starts here:
/home/milian/Downloads/gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf/bin/../lib/gcc/arm-linux-gnueabihf/7.2.1/include
/home/milian/Downloads/gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf/bin/../lib/gcc/arm-linux-gnueabihf/7.2.1/include-fixed
/home/milian/Downloads/gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf/bin/../lib/gcc/arm-linux-gnueabihf/7.2.1/../../../../arm-linux-gnueabihf/include
/home/milian/Downloads/gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf/bin/../arm-linux-gnueabihf/libc/usr/include

#5: exclude the GCC builtin include path, i.e.:
/home/milian/Downloads/gcc-linaro-7.2.1-2017.11-x86_64_arm-linux-gnueabihf/bin/../lib/gcc/arm-linux-gnueabihf/7.2.1/include

For now, this can be done by some heuristics, like checking whether the folder
contains a "varargs.h" file.

#6: parse the defines and write them all into a file that starts with
`#pragma clang system_header`

#7: parse the file with clang:

clang -Xclang -ast-dump -fsyntax-only -xc++ \
  -target arm-linux-gnueabihf \ (from #2)
  -march=armv7-a \ (from #3)
  -isystem... \ (from #4, 5)
  -imacros... \ (from #6)
  file.cpp

This seems to work! Thanks for the help Manuel!
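
For illustration, steps #2 to #5 and #7 above could be sketched in Python roughly as follows. This is only a sketch under assumptions, not what KDevelop or clangd actually do: the function names are made up for this example, the set of `-m...` options that gets forwarded (`-march=`, `-mcpu=`, `-mfpu=`, `-mfloat-abi=`) is a guess based on the cpu / fpu / float-abi settings mentioned earlier, and the "varargs.h" check is the heuristic from step #5.

```python
import os
import shlex

# Assumption: these are the machine options worth forwarding to clang.
FORWARDED_PREFIXES = ("-march=", "-mcpu=", "-mfpu=", "-mfloat-abi=")


def parse_gcc_verbose(output):
    """Pull target, default machine flags and <...> include dirs out of `gcc -E -v` output."""
    target = None
    machine_flags = []
    include_dirs = []
    in_search_list = False
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("Target:"):
            target = stripped.split(":", 1)[1].strip()
        elif stripped.startswith("COLLECT_GCC_OPTIONS="):
            # The options are shell-quoted, so shlex can split them.
            options = shlex.split(stripped.split("=", 1)[1])
            machine_flags = [o for o in options if o.startswith(FORWARDED_PREFIXES)]
        elif stripped == "#include <...> search starts here:":
            in_search_list = True
        elif stripped == "End of search list.":
            in_search_list = False
        elif in_search_list and stripped:
            include_dirs.append(stripped)
    return target, machine_flags, include_dirs


def drop_gcc_builtin_dirs(include_dirs, exists=os.path.exists):
    """Heuristic from step #5: a dir containing varargs.h is GCC's builtin include dir."""
    return [d for d in include_dirs if not exists(os.path.join(d, "varargs.h"))]


def clang_args(target, machine_flags, include_dirs, macros_file):
    """Assemble the extra arguments used in step #7."""
    args = ["-target", target] + list(machine_flags)
    for d in include_dirs:
        args += ["-isystem", d]
    args += ["-imacros", macros_file]
    return args
```

The `exists` parameter is only there so the heuristic can be tried without a real toolchain on disk; in normal use the default `os.path.exists` would be left in place.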

This might still break if you define things that the compiler wants to define on its own, so you might need a filter, but overall this looks reasonable now :)
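
A filter like that could be combined with writing the macros file from step #6, roughly as sketched below in Python. The function name and the `skip_prefixes` parameter are hypothetical, and the thread does not establish which macros (if any) actually need to be filtered, so the prefix used in the usage line is purely illustrative.

```python
def write_macros_file(defines_output, path, skip_prefixes=()):
    """Write `gcc -dM -E` output to `path`, led by the system_header pragma.

    Macros whose name starts with one of `skip_prefixes` are left out.
    Returns the number of #define lines written.
    """
    lines = ["#pragma clang system_header"]
    for line in defines_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) < 2 or parts[0] != "#define":
            continue  # ignore anything that is not a #define line
        # For function-like macros, strip the parameter list from the name.
        name = parts[1].split("(", 1)[0]
        if name.startswith(tuple(skip_prefixes)):
            continue
        lines.append(line.rstrip())
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return len(lines) - 1
```

For example, `write_macros_file(output, "defs.h", skip_prefixes=("__cplusplus",))` would keep every define except those starting with `__cplusplus`; with the default empty tuple nothing is filtered.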

Afaik this shouldn't be an issue: if you don't add the
`#pragma clang system_header` line at the start of the imacros file, then you
are swamped by redefinition errors/warnings. The pragma just silences these,
but are the macros actually redefined or not? From what I've seen so far, they
are redefined, and that works fine...

Can you come up with a scenario where it would break?

Thanks