Need help with code generation

But you’re surely not suggesting that lld will segfault as an error handling mode in production?

You say this was decided in a thread recently- could you please point me at that? I find this really hard to believe.

Cheers,

James

> But you're surely not suggesting that lld will segfault as an error
> handling mode in production?

The document clearly states that (a) it is the user's responsibility to
supply sane object files, and (b) a corrupted file may cause a fatal error
or SEGV.

> You say this was decided in a thread recently- could you please point me
> at that? I find this really hard to believe.

Please find it yourself, that was a long thread. The current decision was
not made lightly, so please respect that and take your time to understand
the situation.

Cheers,

>> You say this was decided in a thread recently- could you please point me
>> at that? I find this really hard to believe.
>
> Please find it yourself, that was a long thread.

Not helpful. The thread appears to be available here:
http://comments.gmane.org/gmane.comp.compilers.llvm.devel/92955.

> The current decision was
> not made lightly, so please respect that and take your time to understand
> the situation.

It was, however, made extremely controversially.

Tim.

Thanks Tim,

the reason I asked was I was unable to find the right search runes.

That is, I guess we needed a decision. And the decision was made so that we
would not spark the same discussion again so soon.

Hi Rui,

Having read that thread, I’m confused. The thread is entirely about using LLD as a library; nowhere does it mention that segfaulting on invalid input is allowed, required or expected. It’s all about ‘is exit (1) sufficient?’

Can you confirm that you’re happy for LLD to crash when invoked on the command line?

James

On corrupted input? Yes, I am.

Cheers,
Rafael

Of course I’m not happy. And I hope you understand that this is unusual. Having said that, I’d say it is nevertheless reasonable.

Rafael,

How can a high quality product crash by design? I understand the lack of structured error handling, and I understand asserting (which in release mode would be silent) on internal errors. But on an input? How can an application be taken seriously when crashes are design features?

And I certainly didn’t see consensus or in fact the suggestion of this in the other thread, unless I glazed over an important part.

James

It can crash because .o files are not user input. They are generated.
To get a corrupted one you need a broken assembler or broken hardware.

Sorry if lld is not the linker you want, but that is the one we are writing.

As for how it will be taken seriously, well, we seem to be on track
to be able to link FreeBSD and to do so faster than gold.

Cheers,
Rafael

My understanding is that clang and llvm themselves are designed this way (crash when the unexpected happens). For example the fact that clang forks itself to be able to report diagnostics is a good indication that this is assumed, and llvm is full of report_fatal_error() (or worse, assertions that can fire on unexpected user input).
I complained on the list at some point that “by design” LLVM as a library requires you to fork and run in a separate process, but the tradeoff in the ease of implementation seems to be the current consensus.

>> Rafael,
>>
>> How can a high quality product crash by design? I understand the lack of
>> structured error handling, and I understand asserting (which in release
>> mode would be silent) on internal errors. But on an input? How can an
>> application be taken seriously when crashes are design features?
>>
>> And I certainly didn't see consensus or in fact the suggestion of this in
>> the other thread, unless I glazed over an important part.
>
> My understanding is that clang and llvm themselves are designed this way
> (crash when the unexpected happens). For example the fact that clang forks
> itself to be able to report diagnostics is a good indication that this is
> assumed, and llvm is full of report_fatal_error() (or worse, assertions
> that can fire on unexpected user input).

So far as I know, any place where LLVM asserts on user input is a bug -
maybe not a high priority bug in some cases, maybe a difficult bug in some
cases, but a bug.

report_fatal_error is a bit of a wart, to be sure (but that's more along
the lines of the previous LLD design thread: API-level error handling vs.
exit(1) from deep in the library*).

* I didn't understand the previous LLD thread to include the possibility of
crash/assert by design. I thought that thread was about exit(1) by design
vs. returning error codes up the stack.
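
To make the distinction between those two styles concrete, here is a minimal sketch. It is not taken from LLD or LLVM: `Field`, `readU32` and `readU32OrExit` are hypothetical names invented for illustration. The first function reports a truncated buffer to its caller as an error value; the second wraps it in the "exit(1) from deep in the library" style. Neither one reads out of bounds.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical result type: either a value or an error message.
struct Field {
    bool ok;
    std::uint32_t value;
    std::string error;
};

// "Return error codes up the stack": the bounds check fails softly and the
// caller decides what to do with the error.
Field readU32(const std::vector<std::uint8_t> &buf, std::size_t offset) {
    if (offset > buf.size() || buf.size() - offset < 4)
        return Field{false, 0,
                     "truncated input at offset " + std::to_string(offset)};
    std::uint32_t v = 0;
    for (int i = 3; i >= 0; --i)          // assemble little-endian bytes
        v = (v << 8) | buf[offset + i];
    return Field{true, v, ""};
}

// "exit(1) from deep in the library": same check, but the process ends here,
// so a program embedding this code cannot recover.
std::uint32_t readU32OrExit(const std::vector<std::uint8_t> &buf,
                            std::size_t offset) {
    Field f = readU32(buf, offset);
    if (!f.ok) {
        std::fprintf(stderr, "error: %s\n", f.error.c_str());
        std::exit(1);
    }
    return f.value;
}
```

Either variant gives the user a diagnostic; the difference only matters to code that wants to keep running after a bad input, which is the library use case the earlier thread was about.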

> I complained on the list at some point that "by design" LLVM as a library
> requires you to fork and run in a separate process, but the tradeoff in the
> ease of implementation seems to be the current consensus.

The forking in a separate process is, so far as I understand, simply a
necessary defensive measure for certain products - we accept that LLVM has
bugs that may lead it to crash, so we want crash reports when that happens.
The fact that we report the crashes and fix them seems to indicate that
they're not by design. If the fork was simply to swallow (rather than
report) the crashes, then it would seem to indicate crash-by-design, but it
isn't.

- David
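
As a rough illustration of the "separate process so that crashes get reported" idea discussed above, here is a minimal POSIX sketch. It is an assumption-laden toy, not how clang implements it: `runIsolated` and the three job functions are hypothetical names. The parent runs a job in a child process and turns the child's fate into a report instead of dying with it.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <csignal>
#include <cstdlib>
#include <string>

// Runs `job` in a forked child and describes how the child ended.
std::string runIsolated(void (*job)()) {
    pid_t pid = fork();
    if (pid < 0)
        return "fork failed";
    if (pid == 0) {
        job();
        _exit(0);  // child: leave without running the parent's cleanup
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return "waitpid failed";
    if (WIFSIGNALED(status))  // killed by a signal, e.g. SIGSEGV or SIGABRT
        return "crashed with signal " + std::to_string(WTERMSIG(status));
    if (WIFEXITED(status))
        return "exited with status " + std::to_string(WEXITSTATUS(status));
    return "ended in an unknown state";
}

// Hypothetical jobs standing in for "compile this input".
void wellBehavedJob() {}
void failingJob() { _exit(1); }       // clean failure, exit(1) style
void crashingJob() { std::abort(); }  // stands in for an assert or segfault
```

The parent can tell a clean `exit(1)` apart from a crash, which is the distinction at issue here: the first is a diagnosed failure, the second is something to report and fix.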

2016/03/21 22:23 “Rafael Espíndola” <rafael.espindola@gmail.com>:

>> Rafael,
>>
>> How can a high quality product crash by design? I understand the lack of
>> structured error handling, and I understand asserting (which in release mode
>> would be silent) on internal errors. But on an input? How can an application
>> be taken seriously when crashes are design features?
>>
>> And I certainly didn’t see consensus or in fact the suggestion of this in
>> the other thread, unless I glazed over an important part.
>
> It can crash because .o files are not user input. They are generated.
> To get one you need a broken assembler or a broken hardware.
>
> Sorry if lld is not the linker you want, but that is the one we are writing.
>
> As for how it will be taken seriously, well, we seem to be on good
> track to be able to link freebsd and to do so faster than gold.

I think this point is worth emphasizing. LLD is actually much faster than gold.

> My understanding is that clang and llvm themselves are designed this way
> (crash when the unexpected happens).

I don't think so. I'd view any Clang crash as a bug (probably to be
prioritised below silent CodeGen and many others, but not "working as
designed").

> For example the fact that clang forks itself to be able to report diagnostics

That seems like just trying to make our own job easier to me. I think
the entire point of the fork is to get a backtrace we can fix, and
point out where the user should send it.

> llvm is full of report_fatal_error() (or worse, assertions that can fire on unexpected user input).

A bit of a grey area since LLVM isn't itself a user-facing tool, but I
think I'd still say that a report_fatal_error that's not actionable by
the user is actually an LLVM bug. And a segfault definitely so.

Cheers.

Tim.

> I think this point is worth emphasizing. LLD is actually much faster than
> gold.

So's "exit(1)".

Tim.

I was writing a response but David and Tim got there first more eloquently. +1 to both of them.

I also find your tone worryingly totalitarian, Rafael.

It is completely trivial to crash llvm. A case I wrote today in
another thread while waiting for tests to run:

target triple = "x86_64-unknown-linux-gnu"
@".data" = global i32 42

That will crash "llc -filetype=obj". The fact that it is considered a
bug doesn't mean much if there is no coordinated effort to fix such bugs.

Right now lld is already harder to crash than llvm. We are just being
honest about the fact that it is possible to craft a .o file that will
crash it.

Cheers,
Rafael

It seems that we are repeating the same discussion again, unfortunately. I believe that everybody can at least accept that either is a reasonable choice. Also, I’d like to mention that the LLD developers who are actually hacking on the thing every day think that it is a reasonable choice, as far as I can tell. So why don’t we go with the decision I wrote in the doc?

>> My understanding is that clang and llvm themselves are designed this way
>> (crash when the unexpected happens).
>
> I don't think so. I'd view any Clang crash as a bug (probably to be
> prioritised below silent CodeGen and many others, but not "working as
> designed").
>
>> For example the fact that clang forks itself to be able to report
>> diagnostics
>
> That seems like just trying to make our own job easier to me. I think
> the entire point of the fork is to get a backtrace we can fix, and
> point out where the user should send it.
>
>> llvm is full of report_fatal_error() (or worse, assertions that can
>> fire on unexpected user input).
>
> A bit of a grey area since LLVM isn't itself a user-facing tool, but I
> think I'd still say that a report_fatal_error that's not actionable by
> the user is actually an LLVM bug. And a segfault definitely so.

> It is completely trivial to crash llvm. A case I wrote today in
> another thread while waiting for tests to run:
>
> target triple = "x86_64-unknown-linux-gnu"
> @".data" = global i32 42
>
> That will crash "llc -filetype=obj". The fact that it is considered a
> bug doesn't mean much if there is no coordinated effort to fix them.

I think it does, actually: it means that patches will be accepted to fix
pretty much any crash in LLVM. llc isn't a user-facing tool, so that's a
particularly low priority, but for LLVM as a general library it's pretty
well accepted that crashes are bugs, I think. (I assume your example also
crashes Clang, which would be where this would surface in a more
important way.)

> Right now lld is already harder to crash than llvm. We are just being
> honest about the fact that it is possible to craft a .o file that will
> crash it.

But the difference seems to be you know about these cases and don't
consider them to be bugs/anything to fix. In LLVM if they're known, they're
at least considered bugs and often/usually considered by someone to be
worth fixing at some point.

- Dave

> It seems that it is repeating the same discussion again, unfortunately.

For myself, I find this thread to contain a new discussion (is crashing by
design acceptable?), distinct from the previous one (is
exit(1)/non-API-usable behavior for obscure/erroneous inputs acceptable by
design?). Perhaps I misunderstood or didn't catch the part where
crashing/assert/UB was included in the previous discussion.

> I believe that everybody can at least accept either is reasonable choice.

I'm less inclined to accept that UB/crash/assert failure is a reasonable
choice for a release product (for a dev tool like bugpoint, llc, etc, I
sort of accept it) even for obscure inputs.