Disable #error?

Is it possible to disable #error directives? Either by a command line flag or by some kind of option. This is for a tool built on libclang.

Hi Jacob,

This is an interesting idea. It's always a hard error right now.

We've seen similar requests to 'downgrade' a few other errors like the MS inline assembly missing-backend one and I suspect there's a pattern developing here.

Could you expand on your specific use-case a little?

We need to get a clear idea of the kinds of tools and interfaces (libclang, tooling, refactoring?) that would benefit from such a soft-errors mode, as well as an idea of which other errors might qualify, to develop a plan of action.

Alp.

I have a tool that translates C header files to D modules[1]. The tool is designed to translate header files one at a time. The problem is that some C libraries use umbrella headers, which only serve to include other sub-header files. Some of these libraries enforce this by having the sub-header files check for a preprocessor macro defined by the umbrella header. If this macro is not defined, they halt the compilation with the #error directive.

I'm using libclang since the tool itself is written in D. D is ABI compatible with C.

[1] https://github.com/jacob-carlborg/dstep

I have the exact same need for IWYU, but I want to use the #error
directive as a heuristic to figure out header mappings.

So if we see a private header with an #error directive, we could run a
regex over its message and find the umbrella header we should
recommend instead.
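
A minimal sketch of that regex idea, in Python. It works on the header text alone, so it does not depend on the #error ever triggering. The function name `umbrella_hints` and both patterns are hypothetical, and the sketch assumes the message names the umbrella header with a `.h` or `.hpp` suffix; it ignores comments and line continuations.

```python
import re

# Hypothetical patterns, not part of Clang or IWYU.
# Matches a whole "#error <message>" line and captures the message.
ERROR_DIRECTIVE = re.compile(r'^[ \t]*#[ \t]*error\b[ \t]*(.*)$', re.MULTILINE)
# Matches something that looks like a header name inside the message.
HEADER_IN_MESSAGE = re.compile(r'["<]?([\w./+-]+\.h(?:pp)?)\b[">]?')


def umbrella_hints(header_text):
    """Return header names mentioned in the #error messages of header_text."""
    hints = []
    for match in ERROR_DIRECTIVE.finditer(header_text):
        message = match.group(1)
        hints.extend(HEADER_IN_MESSAGE.findall(message))
    return hints
```

Calling `umbrella_hints` on the contents of a private header would give candidate umbrella headers to recommend; messages that do not mention a header simply produce no hints.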

I'm almost sure it's not a reasonable way to do this, but my life
would be easier if there was a PPCallback method for #error where the
error message was passed as an argument, and I could signal back that
I wanted to continue processing without failing (e.g. through a bool).

I've also considered that maybe a DiagnosticConsumer could be hooked
up to see the #error messages, but I haven't really checked if it
would be possible. I don't think I could prevent #error from failing
anyway.

I don't know if either of these are available in libclang, though.

- Kim

Just do:

$ sed -i -e 's/#error.*//' **/*.h

Really, ignoring the semantics of the program being analyzed (i.e. that it requests compilation to be aborted) is just as much of a hack as using sed to modify the headers. In both cases, you are forcefully trampling the source code’s request.

– Sean Silva

That's one way of looking at it.

But we're not really compiling (maybe Jacob is, in some sense); we're
doing static analysis and the #error directives could give us useful
data for analysis instead of just aborting.

Regex was actually my next plan of attack, we'll see if I ever get
around to it :)

- Kim

That's one way of looking at it.

But we're not really compiling (maybe Jacob is, in some sense); we're
doing static analysis and the #error directives could give us useful
data for analysis instead of just aborting.

I don't think anyone would be against adding a callback to PPCallbacks to
indicate what the error message is so you can get the data. However, being
able to affect the outcome of compilation from a PPCallbacks callback seems
unwise; it would be kind of like if the #if callback could decide which
branch to take, which means explicitly violating the source code's meaning,
which a compiler shouldn't be doing!

For the purpose of simply collecting #error directives from TU's, it seems
like the simplest thing to do would be to use pp-trace (once there is a
#error callback in PPCallbacks) driven from a short script. You could even
do better than trying to extract the header name from the #error message:
just look for the dominating #ifdef and see where it is defined.

-- Sean Silva
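
Until such a callback exists, the "dominating #ifdef" idea can be approximated on the raw text. The following Python sketch is a purely textual stand-in, not pp-trace output: the function name `guarding_conditions` is hypothetical, and it assumes one directive per line with no line continuations or directives hidden in comments.

```python
import re

# Splits a preprocessor line into directive name and the rest of the line.
DIRECTIVE = re.compile(r'^[ \t]*#[ \t]*(\w+)[ \t]*(.*?)[ \t]*$')


def guarding_conditions(header_text):
    """Return (line_number, message, enclosing_conditions) for each #error."""
    stack = []    # currently open #if/#ifdef/#ifndef blocks, outermost first
    results = []
    for number, line in enumerate(header_text.splitlines(), start=1):
        m = DIRECTIVE.match(line)
        if not m:
            continue
        name, rest = m.groups()
        if name in ('if', 'ifdef', 'ifndef'):
            stack.append(f'#{name} {rest}')
        elif name in ('elif', 'else'):
            if stack:
                # Record that we moved to another branch of the same block.
                stack[-1] = f'{stack[-1]} -> #{name} {rest}'.rstrip()
        elif name == 'endif':
            if stack:
                stack.pop()
        elif name == 'error':
            results.append((number, rest, list(stack)))
    return results
```

For an #error guarded by `#ifndef SOME_MACRO`, the last entry of the returned condition list names the macro, which can then be searched for in the other headers to find the one that defines it.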

I don't think anyone would be against adding a callback to PPCallbacks to
indicate what the error message is so you can get the data. However, being
able to affect the outcome of compilation from a PPCallbacks callback seems
unwise;

Yes, I agree that decision does not belong in PPCallbacks. But it's
tempting! :)

Actually, now that I think about it, Jacob's scenario is the exact
opposite of mine: he seems to be parsing headers in isolation and I
will always see the private header via its umbrella header.

For me the #error will never trigger, but that also means I'll never
get a PPCallback for it. I just want to scan for it and use it to
connect the private header name to its umbrella.

For Jacob it triggers all the time, and he doesn't care about it.
Stripping out the #error before attempting to parse could be a
solution.

But it would sure be nice to be able to lean on Clang's parser.

For the purpose of simply collecting #error directives from TU's, it seems
like the simplest thing to do would be to use pp-trace (once there is a
#error callback in PPCallbacks) driven from a short script. You could even
do better than trying to extract the header name from the #error message:
just look for the dominating #ifdef and see where it is defined.

Good idea, I'll save that for later!

- Kim

That's one way of looking at it.

But we're not really compiling (maybe Jacob is, in some sense); we're
doing static analysis and the #error directives could give us useful
data for analysis instead of just aborting.

I'm doing source to source translation. I don't know if you consider that compiling. It doesn't generate any object or executable code.

Regex was actually my next plan of attack, we'll see if I ever get
around to it :)

I would prefer to avoid that.

I think I actually need that as well :). I would like to be able to translate code like this:

#if Windows
     typedef int foo;
#else
     typedef long foo;
#endif

To:

version (Windows)
     alias foo = int;
else
     alias foo = long;

Preferably I would like to do that without having to run the tool on different platforms and somehow merge the results.

I know it technically violates the meaning of the source code. I'm just trying to automate what a developer would do manually anyway. Doing it manually just increases the risk of making mistakes.

Yes, I agree that decision does not belong in PPCallbacks. But it's
tempting! :)

Actually, now that I think about it, Jacob's scenario is the exact
opposite of mine: he seems to be parsing headers in isolation and I
will always see the private header via its umbrella header.

For me the #error will never trigger, but that also means I'll never
get a PPCallback for it. I just want to scan for it and use it to
connect the private header name to its umbrella.

For Jacob it triggers all the time, and he doesn't care about it.
Stripping out the #error before attempting to parse could be a
solution.

Exactly. The big problem is that C uses textual include and D uses symbolic include. I can only translate what's defined in the header file and not included by other header files. I mean, I don't want to translate half of the standard C library for each header file. That means I can't really translate the umbrella header file.

But it would sure be nice to be able to lean on Clang's parser.

Yeah, I agree.

Just do:

$ sed -i -e 's/#error.*//' **/*.h

Really, ignoring the semantics of the program being analyzed (i.e. that it requests compilation to be aborted) is just as much of a hack as using sed to modify the headers. In both cases, you are forcefully trampling the source code's request.

Hi Sean,

The use case for 'softer' errors is source transformation and tooling, where the directives aren't meant to be evaluated in the first place. Without evaluation, there is no request to trample on.

There's certainly a class of syntactic errors that we could recover from more gracefully to enable safe refactoring of incomplete translation units.

What's clear is that sed isn't a viable part of that tooling. The whole aim is to offer a clean semantic interface to analyse and manipulate user code -- and in that context the original source tree is always immutable.

So let's hear out the user requests -- there's something to these but I'm not sure quite what it is yet.

Alp.

> I don't think anyone would be against adding a callback to PPCallbacks to
> indicate what the error message is so you can get the data. However, being
> able to affect the outcome of compilation from a PPCallbacks callback seems
> unwise;

Yes, I agree that decision does not belong in PPCallbacks. But it's
tempting! :)

Actually, now that I think about it, Jacob's scenario is the exact
opposite of mine: he seems to be parsing headers in isolation and I
will always see the private header via its umbrella header.

For me the #error will never trigger, but that also means I'll never
get a PPCallback for it. I just want to scan for it and use it to
connect the private header name to its umbrella.

Just use pp-trace on the lone header and pipe it into a 10-line Python
script. (once a #error callback is introduced in PPCallbacks, and it is
wired into pp-trace).

-- Sean Silva

> I don't think anyone would be against adding a callback to PPCallbacks to
> indicate what the error message is so you can get the data. However, being
> able to affect the outcome of compilation from a PPCallbacks callback seems
> unwise;

Yes, I agree that decision does not belong in PPCallbacks. But it's
tempting! :)

Actually, now that I think about it, Jacob's scenario is the exact
opposite of mine: he seems to be parsing headers in isolation and I
will always see the private header via its umbrella header.

For me the #error will never trigger, but that also means I'll never
get a PPCallback for it. I just want to scan for it and use it to
connect the private header name to its umbrella.

Just use pp-trace on the lone header and pipe it into a 10-line Python
script. (once a #error callback is introduced in PPCallbacks, and it is
wired into pp-trace).

D'oh, I already said that upthread....

-- Sean Silva

That's one way of looking at it.

But we're not really compiling (maybe Jacob is, in some sense); we're
doing static analysis and the #error directives could give us useful
data for analysis instead of just aborting.

I'm doing source to source translation. I don't know if you consider that
compiling. It doesn't generate any object or executable code.

You are trying to understand the meaning of the program, which means that
you are under the same correctness constraints as when compiling. Consider:

#if defined(FOO)
#error "bar"
void
#else
int
#endif
some_function(int arg);

If you ignore the #error, you will misunderstand the program's semantics.

-- Sean Silva

Just do:

$ sed -i -e 's/#error.*//' **/*.h

Really, ignoring the semantics of the program being analyzed (i.e. that
it requests compilation to be aborted) is just as much of a hack as using
sed to modify the headers. In both cases, you are forcefully trampling the
source code's request.

Hi Sean,

The use case for 'softer' errors is source transformation and tooling,
where the directives aren't meant to be evaluated in the first place.
Without evaluation, there is no request to trample on.

There's certainly a class of syntactic errors that we could recover from
more gracefully to enable safe refactoring of incomplete translation units.

I assume that "safe refactoring" means "respects the source code's
meaning". Code that has a syntactic error has no defined meaning and
therefore by definition cannot be refactored in a "safe" manner. Any such
approach will be purely heuristic.

-- Sean Silva

You are trying to understand the meaning of the program, which means
that you are under the same correctness constraints as when compiling.

I want the tool to understand enough to do the translation.

Consider:

#if defined(FOO)
#error "bar"
void
#else
int
#endif
some_function(int arg);

If you ignore the #error, you will misunderstand the program's semantics.

I do understand what you're saying but I don't see any other way. The feature would probably be off by default with a flag to enable it. The user is free to use it if he/she wants to.

You are trying to understand the meaning of the program, which means
that you are under the same correctness constraints as when compiling.

I want the tool to understand enough to do the translation.

Consider:

#if defined(FOO)
#error "bar"
void
#else
int
#endif
some_function(int arg);

If you ignore the #error, you will misunderstand the program's semantics.

I do understand what you're saying but I don't see any other way.

The cleanest thing in your particular use case is to just ignore that header
if you hit that sort of #error, since you will end up including it through
its "proper" user-facing header when processing a different header. Or is
there some impediment to doing that?

Otherwise, it's really a discussion about "where do we put in a gross hack
to trample the source code's intent and forcibly ignore the check and
#error". That gross hack could be in clang, it could be done by sed, it
could be done in memory on the file buffers, it could be done by detecting
such errors and reparsing with -D__INCLUDED_UMBRELLA_H (or whatever the
guard macro is), or any of a number of techniques. Without a really
compelling use case, my gut is to not want that hack to live in clang.

-- Sean Silva
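
Of the options listed above, the "in memory on the file buffers" variant can be sketched briefly in Python. The helper name `blank_error_directives` is hypothetical. The sketch assumes the tool reads the header itself and hands the edited text to the parser as an in-memory buffer (libclang accepts unsaved file contents when parsing a translation unit), so the source tree on disk is never modified. It does not handle #error lines continued with a trailing backslash.

```python
import re

# Matches a full "#error ..." line, excluding its trailing newline.
ERROR_LINE = re.compile(r'^[ \t]*#[ \t]*error\b.*$', re.MULTILINE)


def blank_error_directives(header_text):
    """Return a copy of header_text with every #error line emptied.

    Lines are blanked rather than deleted so that line numbers in later
    diagnostics and source locations still match the file on disk.
    """
    return ERROR_LINE.sub('', header_text)
```

Like the sed one-liner and the -D workaround, this still overrides what the header asks for, so the same caveat about misreading code guarded by an #error applies.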

The cleanest thing in your particular use case is to just ignore that
header if you hit that sort of #error, since you will end up including
it through its "proper" user-facing header when processing a different
header. Or is there some impediment to doing that?

This doesn't really work. The problem is the tool is designed to translate header files one by one. The reason for that is that the C #include directive is so different from how modules work in D. I don't want to end up with one single enormous D module which contains declarations for a complete C library and half of the C standard library. Therefore I'm only translating what's actually declared in the header file being translated.

Otherwise, it's really a discussion about "where do we put in a gross
hack to trample the source code's intent and forcibly ignore the check
and #error". That gross hack could be in clang, it could be done by sed,
it could be done in memory on the file buffers, it could be done by
detecting such errors and reparsing with -D__INCLUDED_UMBRELLA_H (or
whatever the guard macro is), or any of a number of techniques. Without
a really compelling use case, my gut is to not want that hack to live in
clang.

Hmm, using the -D flag is a pretty good idea, as a workaround. That wouldn't require me to change any code.

The cleanest thing in your particular use case is to just ignore that
header if you hit that sort of #error, since you will end up including
it through its "proper" user-facing header when processing a different
header. Or is there some impediment to doing that?

This doesn't really work. The problem is the tool is designed to translate
header files one by one. The reason for that is that the C #include
directive is so different from how modules work in D. I don't want to end
up with one single enormous D module which contains declarations for a
complete C library and half of the C standard library. Therefore I'm only
translating what's actually declared in the header file being translated.

I don't see the problem. Just parse the public umbrella header. Clang keeps
accurate source information and can tell you which file each declaration is
in, and can even tell you if a file is a system header or not.

Also, using the private, internal organization of the library's headers to
determine the actual user-facing module structure of the D API seems
*really* unwise....

Otherwise, it's really a discussion about "where do we put in a gross
hack to trample the source code's intent and forcibly ignore the check
and #error". That gross hack could be in clang, it could be done by sed,
it could be done in memory on the file buffers, it could be done by
detecting such errors and reparsing with -D__INCLUDED_UMBRELLA_H (or
whatever the guard macro is), or any of a number of techniques. Without
a really compelling use case, my gut is to not want that hack to live in
clang.

Hmm, using the -D flag is a pretty good idea, as a workaround. That
wouldn't require me to change any code.

Glad the idea was useful to you.

-- Sean Silva