reading untrusted bitcode

Suppose I had a program that would receive bitcode and do something with it other than run it, and for performance I this program is built Release-Asserts.

Currently, illegal types are caught by the AsmParser and again by some assertions in Type.cpp. However, a .bc file could contain an illegal type like [5 x void] and without assertions enabled, this actually gets created in the IR. Also, the verifier doesn't check for these, assuming that illegal types must have come in through the C++ API.

Whose responsibility is it supposed to be to check types for legality? The BCReader? Or perhaps the verifier?

Nick

It's pretty easy to resolve using the rule "assertions should never
trigger": if the bitcode reader triggers an assertion, it's a bug in
the bitcode reader.

-Eli

Eli Friedman wrote:

Whose responsibility is it supposed to be to check types for legality?
The BCReader? Or perhaps the verifier?

It's pretty easy to resolve using the rule "assertions should never
trigger": if the bitcode reader triggers an assertion, it's a bug in
the bitcode reader.

And now, arguments for the other side:

   1. Reading untrusted bytecode is a theoretical use case (I don't know of anyone who does it and I'm not planning to either) and bytecode reader performance is critical for your LTO times (which is a real use case and one I do care about).
   2. If you put it in the verifier instead you can detect errors from direct C++ API users as well.

Do you want to keep your position or switch? :wink:

Nick

Eli Friedman wrote:

Whose responsibility is it supposed to be to check types for legality?
The BCReader? Or perhaps the verifier?

It's pretty easy to resolve using the rule "assertions should never
trigger": if the bitcode reader triggers an assertion, it's a bug in
the bitcode reader.

I asked Chris about this on IRC and he states that he doesn't want this in the bitcode reader. If someone wants to verify their types then it belongs in the Verifier.

The assertions should probably stay as they are, I don't see any benefit to removing them. It turns out that you can construct such illegal types even with the assertions we have. Consider:

-- a.ll --
%ty = type opaque
%foo = type <4 x %ty>

-- b.ll --
%ty = type label

Upon llvm-link'ing those, you end up with <4 x label> without triggering any asserts. Disassembling the resulting .bc file will trigger an assert though.

Nick