[PATCH] Allow per-thread re-direction of outs()/errs()

The attached patch adds the ability to programmatically re-direct outs()/errs() to an arbitrary raw_ostream instance, maintaining the raw_ostream instances in a stack structure so clients can push/pop streams at will. The stack is also maintained in thread-local storage, so different threads can re-direct individually. This allows for two use cases:

1. Compilers can attach custom streams to outs()/errs() to intercept output from LLVM without needing to play with STDOUT/STDERR.
2. Compilers can receive LLVM output from different threads individually, instead of having all diagnostics dumped to a single stream.

Unit tests are included in the patch.

0001-Add-the-ability-to-assign-custom-streams-to-outs-err.patch (7.29 KB)
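
For readers without the attachment, the push/pop idea described above could be sketched roughly as follows. This is not the code from the patch: it uses std::ostream as a stand-in for raw_ostream so it compiles without LLVM, and the names errStack, ScopedErrRedirect, emitWarning and captureWarning are hypothetical.

```cpp
#include <cassert>
#include <iostream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the idea in the patch, using std::ostream
// instead of llvm::raw_ostream so the sketch builds without LLVM.
namespace sketch {

// One redirect stack per thread; an empty stack means "use the default".
inline std::vector<std::ostream *> &errStack() {
  thread_local std::vector<std::ostream *> Stack;
  return Stack;
}

// Returns the innermost redirected stream, or std::cerr if none is active
// on the calling thread.
inline std::ostream &errs() {
  std::vector<std::ostream *> &Stack = errStack();
  return Stack.empty() ? std::cerr : *Stack.back();
}

// RAII helper: pushes a stream on construction and pops it on destruction.
class ScopedErrRedirect {
public:
  explicit ScopedErrRedirect(std::ostream &OS) { errStack().push_back(&OS); }
  ~ScopedErrRedirect() {
    assert(!errStack().empty() && "unbalanced redirect stack");
    errStack().pop_back();
  }
  ScopedErrRedirect(const ScopedErrRedirect &) = delete;
  ScopedErrRedirect &operator=(const ScopedErrRedirect &) = delete;
};

// Library-style code that writes to errs() without knowing the destination.
inline void emitWarning(const std::string &Msg) {
  errs() << "WARNING: " << Msg << '\n';
}

// Client-style code: captures the warning into a string on this thread.
inline std::string captureWarning(const std::string &Msg) {
  std::ostringstream Captured;
  {
    ScopedErrRedirect Redirect(Captured);
    emitWarning(Msg);
  } // Redirect is popped here; errs() is std::cerr again on this thread.
  return Captured.str();
}

} // namespace sketch
```

Because the stack is thread_local, a redirect installed on one thread does not affect what errs() returns on any other thread.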

This isn't the right approach. Nothing in the library part of the compiler should be hard coding a stream to write to. What are you trying to accomplish?

-Chris

There are a lot of places where warning/debug information is passed directly to errs(). For example, take the Linker class. You can tell it to omit errors/warnings, but it would be preferable to let it emit the errors/warnings to some compiler-controlled stream for message triaging.

For compilers that are (a) multi-threaded, or (b) invoked as part of an embedded library, writing any information from the core libraries directly to stderr is bad. I want to be able to capture this output without messing with the process’ stdout/stderr streams.

This patch essentially prevents LLVM library code from having to hard-code streams for error/warning/diagnostic output. Library code can just use outs()/errs() as they do now, but the compiler process/thread now has the ability to re-direct this output as it sees fit. If no such functionality is needed, the default behavior is to write to stdout/stderr as it does now.

Perhaps I’m just getting the wrong impression from the current LLVM code base. Take this snippet from LinkModules.cpp:

if (!SrcM->getDataLayout().empty() && !DstM->getDataLayout().empty() &&
    SrcM->getDataLayout() != DstM->getDataLayout())
  errs() << "WARNING: Linking two modules of different data layouts!\n";
if (!SrcM->getTargetTriple().empty() &&
    DstM->getTargetTriple() != SrcM->getTargetTriple()) {
  errs() << "WARNING: Linking two modules of different target triples: ";
  if (!SrcM->getModuleIdentifier().empty())
    errs() << SrcM->getModuleIdentifier() << ": ";
  errs() << "'" << SrcM->getTargetTriple() << "' and '"
         << DstM->getTargetTriple() << "'\n";
}

Would I be correct in assuming that this is actually “bad” code since it’s a library function that writes directly to stderr?

Is the recommended approach to pass a stream to the constructor of any pass or function that wants to emit output outside of debug mode? My concern with this approach is that the compiler would then need to know which passes expect an output stream, and which do not. The supplied patch allows passes to output information without having to worry about where the output is going; it is up to the compiler to specify where this output goes.

This isn’t the right approach. Nothing in the library part of the compiler should be hard coding a stream to write to. What are you trying to accomplish?

There are a lot of places where warning/debug information is passed directly to errs(). For example, take the Linker class. You can tell it to omit errors/warnings, but it would be preferable to let it emit the errors/warnings to some compiler-controlled stream for message triaging.

Yes, this is broken. The fix should be in the linker though, not in errs().

Would I be correct in assuming that this is actually “bad” code since it’s a library function that writes directly to stderr?

Yes, this is extremely unfriendly to using the linker as a library. It would be much better for the linker to return its error in a proper way (i.e. extending llvm/Support/system_error.h like llvm/Object/Error.h does).

Is the recommended approach to pass a stream to the constructor of any pass or function that wants to emit output outside of debug mode? My concern with this approach is that the compiler would then need to know which passes expect an output stream, and which do not. The supplied patch allows passes to output information without having to worry about where the output is going, it is up to the compiler to specify where this output goes.

The right fix for this is to fix the code to not report errors textually.

-Chris
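
As an illustration of the "return its error in a proper way" suggestion, here is a minimal sketch built on the standard std::error_code machinery, used here as a stand-in for llvm/Support/system_error.h. The LinkError enum, the category name, the toy Module struct and checkCompatible are all hypothetical, and the sketch treats the two mismatches as error codes even though the linker snippet above only warns about them.

```cpp
#include <cassert>
#include <string>
#include <system_error>

namespace sketch {

// Hypothetical error codes for a module linker; the names are illustrative.
enum class LinkError {
  Success = 0,
  DataLayoutMismatch,
  TargetTripleMismatch,
};

// Custom category that turns the codes above into readable messages.
class LinkErrorCategory : public std::error_category {
public:
  const char *name() const noexcept override { return "sketch.linker"; }
  std::string message(int Code) const override {
    switch (static_cast<LinkError>(Code)) {
    case LinkError::Success:
      return "success";
    case LinkError::DataLayoutMismatch:
      return "linking two modules of different data layouts";
    case LinkError::TargetTripleMismatch:
      return "linking two modules of different target triples";
    }
    return "unknown linker error";
  }
};

inline const std::error_category &linkCategory() {
  static LinkErrorCategory Instance;
  return Instance;
}

inline std::error_code make_error_code(LinkError E) {
  return std::error_code(static_cast<int>(E), linkCategory());
}

// Toy stand-in for a module: only the fields the check needs.
struct Module {
  std::string DataLayout;
  std::string TargetTriple;
};

// Same conditions as the LinkModules.cpp snippet, but the result is returned
// to the caller instead of being printed to a stream.
inline std::error_code checkCompatible(const Module &Dst, const Module &Src) {
  if (!Src.DataLayout.empty() && !Dst.DataLayout.empty() &&
      Src.DataLayout != Dst.DataLayout)
    return make_error_code(LinkError::DataLayoutMismatch);
  if (!Src.TargetTriple.empty() && Dst.TargetTriple != Src.TargetTriple)
    return make_error_code(LinkError::TargetTripleMismatch);
  return make_error_code(LinkError::Success);
}

} // namespace sketch
```

The caller then decides whether to print EC.message(), show it in a GUI, or ignore it; no library code touches stderr.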

From: llvmdev-bounces@cs.uiuc.edu [mailto:llvmdev-bounces@cs.uiuc.edu] On Behalf Of Chris Lattner
Subject: Re: [LLVMdev] [llvm-commits] [PATCH] Allow per-thread re-direction of outs()/errs()

It would be much better for the linker to return its error in a proper way
(i.e. extending llvm/Support/system_error.h like llvm/Object/Error.h does).

The right fix for this is to fix the code to not report errors textually.

Unfortunately, the use of outs() and (especially) errs() is rampant - a simple grep of the 3.1 source tree shows about 1,500 instances. One of the first things we had to implement in order to make LLVM usable is something very similar to what Justin has proposed. Centralizing control of the output in outs()/errs() would seem to be an ideal way to let users of LLVM as a library customize it as they see fit without requiring massive source changes.

- Chuck

THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers.

Virtually all of those are in DEBUG statements or in llvm/tools. Neither are relevant to the discussion.

-Chris

From: llvmdev-bounces@cs.uiuc.edu [mailto:llvmdev-bounces@cs.uiuc.edu] On Behalf Of Chris Lattner
Subject: Re: [LLVMdev] [llvm-commits] [PATCH] Allow per-thread re-direction of outs()/errs()

It would be much better for the linker to return its error in a proper way
(i.e. extending llvm/Support/system_error.h like llvm/Object/Error.h does).

I like the way this works, but it seems like it’s still relatively unused in LLVM except for a few low-level classes. Is there a plan to thread this into the rest of LLVM, like the pass framework? My concern is not so much the passes that are currently using errs() to output the equivalent of “not implemented” messages, but passes that may have legitimate reasons to output warnings/errors, or signify an error condition. If a compiler gets broken IR, it’s reasonable to expect some pass to fail. In such a case, it would be nice to get an error message and error result without having to abort the entire OS process (llvm_unreachable is an example of something I would like to be able to intercept and handle gracefully). Ideally, I would like a build of LLVM that would never result in a call to abort(), only a result status I act on in a higher-level library.

Virtually all of those are in DEBUG statements or in llvm/tools. Neither are relevant to the discussion.

In a way, they are. What if I want a debug trace in a multi-threaded context? Right now, all threads would just spew to the same stream and the result would be unreadable. If you allow threads to set up their own outs() and errs() streams (transparently to the rest of LLVM), you can intercept these as you see fit, perhaps dumping each to a separate file. I acknowledge that you could also enforce single-threaded execution for debug runs, but what if you are in the context of a library with no valid stdout/stderr buffers?

From: Justin Holewinski [mailto:justin.holewinski@gmail.com]
Subject: Re: [LLVMdev] [llvm-commits] [PATCH] Allow per-thread re-direction of outs()/errs()

> Virtually all of those are in DEBUG statements or in llvm/tools.
> Neither are relevant to the discussion.

In a way, they are. What if I want a debug trace in a multi-threaded
context? Right now, all threads would just spew to the same stream
and the result would be unreadable. If you allow threads to setup their
own outs() and errs() streams (transparently to the rest of LLVM), you
can intercept these as you see fit, perhaps dumping each to a separate
file. I acknowledge that you could also enforce single-threaded execution
for debug runs, but what if you are in the context of a library with no
valid stdout/stderr buffers?

I agree wholeheartedly. We're running LLVM in a real-time system with no stdout/stderr, and being able to redirect the debug messages is critical to our testing.

- Chuck

This doesn’t make any sense. There is very little sense in trying to use DEBUG (which gets compiled out of optimized builds) in a multithreaded context. Polluting the purity of outs() and errs() because of this sort of use case doesn’t make any sense to me.

outs() and errs() are useful independent of how LLVM happens to use them. The extensions that you are proposing don’t make any sense given the design of raw_ostream.

-Chris

Is the problem more that outs() and errs() should always point to valid stdout/stderr streams, regardless of how the hosting compiler is implemented?

If so, how do you feel about introducing a new global stream, logs()? This could replace all uses of outs()/errs() in the LLVM libraries, and default to errs(). Existing command-line tools could just use this default, but compilers that are hosted inside of libraries or other non-command line scenarios can attach a custom stream to logs() that captures all LLVM output, including DEBUG and any legitimate pass output. Command-line tools can continue to use outs()/errs() just as they do now. Nothing would really change in the way LLVM operates by default.

Is the problem more that outs() and errs() should always point to valid stdout/stderr streams, regardless of how the hosting compiler is implemented?

No. outs() and errs() are equivalent to stdout and stderr. Your request to change how they work to be thread local is like saying that we should change how printf works because LLVM is misusing printf somewhere.

If so, how do you feel about introducing a new global stream, logs()? This could replace all uses of outs()/errs() in the LLVM libraries, and default to errs(). Existing command-line tools could just use this default, but compilers that are hosted inside of libraries or other non-command line scenarios can attach a custom stream to logs() that captures all LLVM output, including DEBUG and any legitimate pass output. Command-line tools can continue to use outs()/errs() just as they do now. Nothing would really change in the way LLVM operates by default

In fact, before we had raw_ostream, we had a dbgs() sort of stream, which was intended to be used in debug statements. I’m not strongly opposed to this (it would be much better than violating the sanctity of outs() :), but I’m still unclear why you care so much about DEBUG output. If we had a dbgs() again, it should only be used from within DEBUG macros (for example, the linker shouldn’t use it). Would that be enough to achieve your goal?

-Chris

Fair enough. I’ve always been under the impression that outs() and errs() were simply convenience wrappers around some output mechanism, not strictly just stdout/stderr.

If so, how do you feel about introducing a new global stream, logs()? This could replace all uses of outs()/errs() in the LLVM libraries, and default to errs(). Existing command-line tools could just use this default, but compilers that are hosted inside of libraries or other non-command line scenarios can attach a custom stream to logs() that captures all LLVM output, including DEBUG and any legitimate pass output. Command-line tools can continue to use outs()/errs() just as they do now. Nothing would really change in the way LLVM operates by default

In fact, before we had raw_ostream, we had a dbgs() sort of stream, which was intended to be used in debug statements. I’m not strongly opposed to this (it would be much better than violating the sanctity of outs() :), but I’m still unclear why you care so much about DEBUG output. If we had a dbgs() again, it should only be used from within DEBUG macros (for example, the linker shouldn’t use it). Would that be enough to achieve your goal?

It’s not so much DEBUG output that I care specifically about, that’s just an example of a place where the LLVM libraries assume an stderr stream exists and is freely writable. What I really want is a way to capture all LLVM library output that would normally go to stderr, similar to how Clang has its Diagnostic functionality that allows a driver to collect all output messages.

There are legitimate cases where passes may output warnings or even errors that the compiler should be able to direct to the user. errs() is fine for this if you know you’re working with a command-line tool, but that’s not always true.

It sounds like dbgs() is not really the solution here, since I want to capture more than just DEBUG output. If the module linker wants to output a warning, I want to capture that. If an optimization pass wants to emit a performance warning, I want to capture that too. Neither of these cases are DEBUG-only.

I see two possible solutions:

  1. For every pass (or class) that may emit warnings/errors, provide it a raw_ostream instance. This seems to be how some of the utility classes work now, but this puts a burden on the compiler to know which passes actually need a raw_ostream parameter. If you want to add a warning output to any pass, you need to change its interface (create***Pass and constructor) to add the raw_ostream parameter.
  2. Provide a generic mechanism through which passes and any other utility classes can write warning/error information. This gives more control to the compiler without introducing any interface changes.

Option (1) is do-able, but option (2) seems more general. The specifics of the implementation that I envision are:

  • logs() is added as another stream to raw_ostream.h
  • By default, logs() forwards to errs() [no change in the way LLVM works in the default case]
  • Any pass or utility class that wants to emit warnings/errors should write to logs() instead of errs()
  • DEBUG code should use logs() instead of errs() [again, this will default to errs()]
  • An API is provided that allows a client to re-direct logs() to any arbitrary raw_ostream, on a per-thread level

You could also envision something along the lines of Clang’s Diagnostics functionality. You could thread a diagnostics interface through the pass manager, and passes could output any messages as diagnostics. These diagnostics would be collected and the compiler could query for them after the pass manager completes.
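
A rough sketch of the logs() idea, under the assumption that a single per-thread override is enough. It uses std::ostream and std::cerr as stand-ins for raw_ostream and errs(); setLogsStream, runToyPass and runToyPassCaptured are hypothetical names, not an existing LLVM API.

```cpp
#include <cassert>
#include <iostream>
#include <ostream>
#include <sstream>
#include <string>

namespace sketch {

// Per-thread override; nullptr means "forward to the default stream".
inline std::ostream *&logsOverride() {
  thread_local std::ostream *Override = nullptr;
  return Override;
}

// By default this forwards to std::cerr, so tools that never install an
// override see no change in behavior.
inline std::ostream &logs() {
  std::ostream *Override = logsOverride();
  return Override ? *Override : std::cerr;
}

// Redirects logs() for the calling thread only. Returns the previous
// override so the caller can restore it afterwards.
inline std::ostream *setLogsStream(std::ostream *OS) {
  std::ostream *Previous = logsOverride();
  logsOverride() = OS;
  return Previous;
}

// Toy "pass" that reports a warning without knowing where it ends up.
inline void runToyPass(const std::string &FunctionName) {
  logs() << "warning: loop in '" << FunctionName << "' was not vectorized\n";
}

// Host-compiler side: capture everything the pass writes to logs().
inline std::string runToyPassCaptured(const std::string &FunctionName) {
  std::ostringstream Buffer;
  std::ostream *Previous = setLogsStream(&Buffer);
  runToyPass(FunctionName);
  setLogsStream(Previous);
  return Buffer.str();
}

} // namespace sketch
```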

If I may add my two cents:

I am planning to use LLVM as the backend for a compiler I am working on. And I wholeheartedly agree with Justin that it is a problem, if LLVM is allowed to freely write to stdout and stderr as it is a component which can be used in all sorts of code, be it a GUI IDE, a CLI driver, or whatever.

Also, I have a number of times wondered about the somewhat unusual use of error strings in LLVM (that you pass in a string, which can be assigned a descriptive error message). Better would be, IMHO, to provide an interface along the lines of this:

class ErrorHandler
{
public:
        virtual void Report(ErrorKind kind, uint code, const Position &position, const unichar *text, ...) = 0;
};

And then let the client, i.e. the frontend, derive from it and handle all the output in whichever way it wants to. The above example is probably too complex for LLVM, but that’s what I use in my compiler. ErrorKind is ErrorKind_Fatal, ErrorKind_Error, etc. Position is simply a filename, a line number, and a character position. unichar is either char or wchar_t, depending on the build mode and target environment.

I hook it up so that I can buffer all errors encountered for later use.
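
A simplified sketch of what such a buffering client might look like. To keep it self-contained it replaces the varargs, uint and unichar parts of the interface above with unsigned and std::string, and BufferedError, BufferingErrorHandler and checkDivision are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

namespace sketch {

enum class ErrorKind { Fatal, Error, Warning };

// A filename, a line number, and a character position.
struct Position {
  std::string File;
  unsigned Line;
  unsigned Column;
};

// Simplified form of the interface: no varargs, plain std::string text.
class ErrorHandler {
public:
  virtual ~ErrorHandler() = default;
  virtual void report(ErrorKind Kind, unsigned Code, const Position &Pos,
                      const std::string &Text) = 0;
};

// One buffered report, kept as structured data rather than formatted text.
struct BufferedError {
  ErrorKind Kind;
  unsigned Code;
  Position Pos;
  std::string Text;
};

// Client-side handler that stores every report for later inspection.
class BufferingErrorHandler : public ErrorHandler {
public:
  void report(ErrorKind Kind, unsigned Code, const Position &Pos,
              const std::string &Text) override {
    Errors.push_back({Kind, Code, Pos, Text});
  }
  const std::vector<BufferedError> &errors() const { return Errors; }

private:
  std::vector<BufferedError> Errors;
};

// Toy library routine: reports through the handler and never prints.
inline void checkDivision(int Divisor, const Position &Pos,
                          ErrorHandler &Handler) {
  if (Divisor == 0)
    Handler.report(ErrorKind::Error, 42, Pos, "division by zero");
}

} // namespace sketch
```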

Cheers,
Mikael
– Love Thy Frog!

2012/6/2 Justin Holewinski <justin.holewinski@gmail.com>

If I may add my two cents:

I am planning to use LLVM as the backend for a compiler I am working
on. And I wholeheartedly agree with Justin that it is a problem, if
LLVM is allowed to freely write to stdout and stderr as it is a
component which can be used in all sorts of code, be it a GUI IDE, a
CLI driver, or whatever.

Also, I have a number of times wondered about the somewhat unusual
use of error strings in LLVM (that you pass in a string, which can be
assigned a descriptive error message). Better would be, IMHO, to
provide an interface along the lines of this:

class ErrorHandler
{
public:
        virtual void Report(ErrorKind kind, uint code, const Position
&position, const unichar *text, ...) = 0;
};

And then let the client, i.e. the frontend, derive from it and handle
all the output in whichever way it wants to.

I agree with this.

The above example is
probably too complex for LLVM, but that's what I use in my compiler.
ErrorKind is ErrorKind_Fatal, ErrorKind_Error, etc. Position is
simply a filename, a line number, and a character position.

I don't agree with this. The "position" in the LLVM input will be
fairly meaningless to a user. The frontend should be given as
much information as possible to match the position of the error back to
the originating source position and relevant source variables (if
possible). LLVM has this information available if debugging metadata
was provided. Even if such information was not provided, we should at
least provide the function name (and module ID). The name of the
generating pass should be included, and some mechanism provided to pass
auxiliary metadata-like information (copies of alias sets, loop
dependency info, etc.) that might be useful for the frontend to guide
the user in understanding the problem.

-Hal

It’s not so much DEBUG output that I care specifically about, that’s just an example of a place where the LLVM libraries assume an stderr stream exists and is freely writable.

Ok, then let’s ignore DEBUG. It is completely irrelevant to any software that ships with disabled assertions. The reason that this came up was because of your assertion that LLVM writes to outs() and errs() all over the place. This simply isn’t true if you ignore DEBUG.

It sounds like dbgs() is not really the solution here, since I want to capture more than just DEBUG output.

Right.

What I really want is a way to capture all LLVM library output that would normally go to stderr, similar to how Clang has its Diagnostic functionality that allows a driver to collect all output messages.

Fine, this should be done by fixing the few places (like the linker) that are misbehaving.

There are legitimate cases where passes may output warnings or even errors that the compiler should be able to direct to the user. errs() is fine for this if you know you’re working with a command-line tool, but that’s not always true.

If the module linker wants to output a warning, I want to capture that. If an optimization pass wants to emit a performance warning, I want to capture that too. Neither of these cases are DEBUG-only.

Neither of these use cases are a good use for raw_ostream. We have a diagnostics API that the backend uses. A random optimization pass should not use errs() ever, it should use the diagnostics API (see llvm/Support/ErrorHandling.h).

I see two possible solutions:

  1. For every pass (or class) that may emit warnings/errors, provide it a raw_ostream instance. This seems to be how some of the utility classes work now, but this puts a burden on the compiler to know which passes actually need a raw_ostream parameter. If you want to add a warning output to any pass, you need to change its interface (create***Pass and constructor) to add the raw_ostream parameter.
  2. Provide a generic mechanism through which passes and any other utility classes can write warning/error information. This gives more control to the compiler without introducing any interface changes.

We already have #2. It supports (for example) source location information, and is already used by clang and the backend in some places.

Option (1) is do-able, but option (2) seems more general. The specifics of the implementation that I envision are:

  • logs() is added as another stream to raw_ostream.h
  • By default, logs() forwards to errs() [no change in the way LLVM works in the default case]
  • Any pass or utility class that wants to emit warnings/errors should write to logs() instead of errs()
  • DEBUG code should use logs() instead of errs() [again, this will default to errs()]
  • An API is provided that allows a client to re-direct logs() to any arbitrary raw_ostream, on a per-thread level

Please stop focusing on streams. If you actually want to capture useful diagnostic output, starting with text is already a lost cause.

-Chris

If I may add my two cents:

I am planning to use LLVM as the backend for a compiler I am working on. And I wholeheartedly agree with Justin that it is a problem, if LLVM is allowed to freely write to stdout and stderr as it is a component which can be used in all sorts of code, be it a GUI IDE, a CLI driver, or whatever.

LLVM should not be doing this.

Also, I have a number of times wondered about the somewhat unusual use of error strings in LLVM (that you pass in a string, which can be assigned a descriptive error message). Better would be, IMHO, to provide an interface along the lines of this:

class ErrorHandler
{
public:
        virtual void Report(ErrorKind kind, uint code, const Position &position, const unichar *text, ...) = 0;
};

And then let the client, i.e. the frontend, derive from it and handle all the output in whichever way it wants to. The above example is probably too complex for LLVM, but that's what I use in my compiler. ErrorKind is ErrorKind_Fatal, ErrorKind_Error, etc. Position is simply a filename, a line number, and a character position. unichar is either char or wchar_t, depending on the build mode and target environment.

You're right, this would be better. We even already have it :)

  llvm/Support/ErrorHandling.h

-Chris

> If I may add my two cents:
>
> I am planning to use LLVM as the backend for a compiler I am
> working on. And I wholeheartedly agree with Justin that it is a
> problem, if LLVM is allowed to freely write to stdout and stderr as
> it is a component which can be used in all sorts of code, be it a
> GUI IDE, a CLI driver, or whatever.

LLVM should not be doing this.

> Also, I have a number of times wondered about the somewhat unusual
> use of error strings in LLVM (that you pass in a string, which can
> be assigned a descriptive error message). Better would be, IMHO,
> to provide an interface along the lines of this:
>
> class ErrorHandler
> {
> public:
> virtual void Report(ErrorKind kind, uint code, const
> Position &position, const unichar *text, ...) = 0; };
>
> And then let the client, i.e. the frontend, derive from it and
> handle all the output in whichever way it wants to. The above
> example is probably too complex for LLVM, but that's what I use in
> my compiler. ErrorKind is ErrorKind_Fatal, ErrorKind_Error, etc.
> Position is simply a filename, a line number, and a character
> position. unichar is either char or wchar_t, depending on the
> build mode and target environment.

You're right, this would be better. We even already have it :)

  llvm/Support/ErrorHandling.h

This seems to only handle fatal errors. If that's correct, it will
probably need to be extended to handle non-fatal errors, warnings,
suggestions, notes, etc.

-Hal

Okay, it looks like the combination of llvm/Support/ErrorHandling.h and LLVMContext::emitError solves a part of my use case. I found the CrashRecoveryContext class, which seems to allow me to intercept the report_fatal_error() calls by installing a custom handler that writes to a custom stream and then calls abort() instead of the default exit(). Alternatively, it looks like I can just spawn a thread for the LLVM work and call pthread_exit() instead of abort() in the error handler, which does not require messing with the process signal handlers. Is there any reason to prefer one over the other (thread vs. signal)?

This leaves the following issues:

  1. Why is the error handling stuff in LLVMContext tied to inline asm? The error handler setter is even called setInlineAsmDiagnosticHandler. Can we make this more generic? I glossed right over the emitError methods the last time I read through this stuff because I saw “InlineAsm” everywhere and figured it wasn’t the right tool for the job! :)
  2. As Hal points out, this only covers errors at the moment. Is there any resistance to changing emitError to emitDiagnostic and having a SourceMgr::DiagKind argument? This should cover the case where a pass wants to emit a warning or info message that is not part of DEBUG.

With LLVMContext::emitDiagnostic, this should cover my use case.
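
To make the proposal concrete, here is a small sketch of the shape such an interface could take. It is not LLVM code: Context, DiagKind, setDiagnosticHandler and emitDiagnostic below are hypothetical stand-ins for LLVMContext, SourceMgr::DiagKind and the proposed method.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

namespace sketch {

// Local stand-in for the idea of SourceMgr::DiagKind.
enum class DiagKind { Error, Warning, Note };

// Hypothetical stand-in for LLVMContext with a generic diagnostic handler.
class Context {
public:
  using DiagHandler = std::function<void(DiagKind, const std::string &)>;

  void setDiagnosticHandler(DiagHandler H) { Handler = std::move(H); }

  // Forwards the diagnostic to the client's handler if one is installed.
  // Returns false when nobody is listening, so a caller could fall back to
  // some default behavior.
  bool emitDiagnostic(DiagKind Kind, const std::string &Msg) const {
    if (!Handler)
      return false;
    Handler(Kind, Msg);
    return true;
  }

private:
  DiagHandler Handler;
};

// Host-compiler side: collect only the warnings a toy pass emits.
inline std::vector<std::string> collectWarnings() {
  std::vector<std::string> Warnings;
  Context Ctx;
  Ctx.setDiagnosticHandler(
      [&Warnings](DiagKind Kind, const std::string &Msg) {
        if (Kind == DiagKind::Warning)
          Warnings.push_back(Msg);
      });
  Ctx.emitDiagnostic(DiagKind::Warning, "linking modules of different target triples");
  Ctx.emitDiagnostic(DiagKind::Note, "first module was compiled without a triple");
  return Warnings;
}

} // namespace sketch
```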