[Patches][RFC] What to do about bitcode streaming.

+llvmdev

Currently we support reading bitcode in 3 ways:

* Read everything upfront.
* Be lazy about reading the function bodies.
* Read the bitcode over a streaming interface.

The first two modes are commonly used and well tested. In fact the
"read everything" mode is implemented by starting lazy and calling
materializeAllPermanently.

The thirds mode is used in tree only by llvm-dis, which means that
some bugs can hide for quite some time (222505 for example). To the
best of my knowledge it is only real user is PNaCl.

It also has a disproportional code complexity. There are two virtual
interfaces (DataStreamer and MemoryObject) and they are leaky: Not all
bitcode features are supported when streaming and there are a few "if
(LazyStreamer)" int the reader.

I have attached two patches that solve the problem with different trade-offs.

One just deletes the feature. This would make PNaCl responsible for
maintaining their own reader. I assume they will have to do it at some
point since they are looked to a fixed format, but this would make it
an immediate issue.

The other one gets most of the simplifications by adding just one
assumption: the size is known. This should be true for http with
Content-Length.

We go from 2 interfaces to 1 and that interface has a single virtual
method. There are no ifs on the data being streamed or missing
features.

It might be even possible to drop the requirement for the size to be
known: Replace the call to AtEndOfStream by just trying to read more
and checking if it failed, but that is a bit more than I wanted to do
for this.

Cheers,
Rafael

I CC'ed llvmdev to put a few more eyes on the "what's the right
direction?" question.

IMO these both look like huge improvements. The streaming interface was
the strangest part of the bitcode reader when I was trying to figure out
how it all fit together for the use-list-order work.

Personally I favour the "just delete it" path, but maybe there's
something I'm missing, and the other patch looks great too.

I didn't read carefully yet (although I noticed two quirks in the first
patch that I've pointed out below) -- I'll have a closer look once
you've decided on a direction.

diff --git a/lib/Bitcode/Reader/BitcodeReader.cpp b/lib/Bitcode/Reader/BitcodeReader.cpp
index e228b1d..1322edd 100644
--- a/lib/Bitcode/Reader/BitcodeReader.cpp
+++ b/lib/Bitcode/Reader/BitcodeReader.cpp
@@ -33,6 +32,26 @@ enum {
   SWITCH_INST_MAGIC = 0x4B5 // May 2012 => 1205 => Hex
};

+BitcodeReader::BitcodeReader(MemoryBuffer * Buffer, LLVMContext & C)

clang-format?

+ : Context(C), TheModule(nullptr), Buffer(Buffer),
+ SeenValueSymbolTable(false), ValueList(C), MDValueList(C),
+ SeenFirstFunctionBody(false), UseRelativeIDs(false),
+ WillMaterializeAllForwardRefs(false) {
diff --git a/tools/llvm-dis/llvm-dis.cpp b/tools/llvm-dis/llvm-dis.cpp
index 072f636..3e21164 100644
--- a/tools/llvm-dis/llvm-dis.cpp
+++ b/tools/llvm-dis/llvm-dis.cpp
@@ -135,12 +205,13 @@ int main(int argc, char **argv) {
     else
       DisplayFilename = InputFilename;
     ErrorOr<std::unique_ptr<Module>> MOrErr =
- getStreamedBitcodeModule(DisplayFilename, Streamer, Context);
+ getStreamedBitcodeModule(DisplayFilename, std::move(Streamer), Context);
     if (std::error_code EC = MOrErr.getError())
       ErrorMessage = EC.message();
     else
       M = std::move(*MOrErr);
     if(M.get()) {
+ errs() << "foobar " << Foo.getMaxPos() << '\n';

Debug output?

I CC'ed llvmdev to put a few more eyes on the "what's the right
direction?" question.

IMO these both look like huge improvements. The streaming interface was
the strangest part of the bitcode reader when I was trying to figure out
how it all fit together for the use-list-order work.

Personally I favour the "just delete it" path, but maybe there's
something I'm missing, and the other patch looks great too.

I didn't read carefully yet (although I noticed two quirks in the first
patch that I've pointed out below) -- I'll have a closer look once
you've decided on a direction.

Thanks. Looks like the clang-format I had installed was a bit too old.
I have also rebased and removed the debug output.

Cheers,
Rafael

remove-data-streamer.patch (38.3 KB)

known-size.patch (39.1 KB)

Hi Rafael,

Would you mind waiting for Derek to come back from vacation to discuss this? We do use this code and could improve how it’s used and tested within LLVM. Derek is the best person to discuss this, he’ll be back in mid-January.

Thanks,

JF

What about the known-size patch? Can you check if that would work for you?

The issue is not so much having it tested. It is having a sane and
easy to understand interface.

Cheers,
Rafael

Hi Rafael,

We will try out your patch and check to see how it will fit.

You also talked about “It might be even possible to drop the requirement for the size to be known: Replace the call to AtEndOfStream by just trying to read more and checking if it failed, but that is a bit more than I wanted to do for this.”

That is to remove some calls to getSize()? Is there any expectation that getPointer() will always a pointer within a contiguous region of memory? That is, getPointer() is non-virtual and always refers to the Data field, but the overriding implementation could still dynamically grow some buffer and change Data to point to a new buffer (could that be “protected” instead?)?

*) We do have a fork of the bitcode reader and bitstream reader. That said, we still use the upstream bitcode reader / bitstream reader in the browser w/ PNaCl. This is used for debugging non-guaranteed-to-be-stable temporary copies of apps in the browser (https://developer.chrome.com/native-client/devguide/devcycle/debugging#debugging-pnacl-pexes-pepper-35-or-later). That said, the overlapped compile/download is probably not a big deal for the debugging use case.

*) It looks like gzip encoded files will have content-length set to the size of the gzipped body, not the size of the original bitcode: http://stackoverflow.com/questions/3819280/content-length-when-using-http-compression. So we’ll need to be careful and treat that as “unknown”, which is a bit unfortunate because gzip helps reduce bandwidth requirements.

*) Otherwise, not all HTTP responses include a content-length, for various reasons, but I don’t know how common this is, but I can try to ask chromium-dev or some other mailing list and see how common that is.

*) The HTTP response can lie, or the server can be buggy, and give the wrong content-length. I think we can try to do the safe thing here, but will have to be careful.

That is to remove some calls to getSize()? Is there any expectation that
getPointer() will always a pointer within a contiguous region of memory?
That is, getPointer() is non-virtual and always refers to the Data field,
but the overriding implementation could still dynamically grow some buffer
and change Data to point to a new buffer (could that be "protected"
instead?)?

I think it would fail. Clang does things like

* readRecord (... &Blob);
* Save a pointer to Blob.
* readRecord (...&Blob)

It should be possible to use a list of buffers and add a
getPointerImpl virtual if absolutely necessary, but see bellow.

*) It looks like gzip encoded files will have content-length set to the size
of the *gzipped* body, not the size of the original bitcode:
content-length when using http compression - Stack Overflow.
So we'll need to be careful and treat that as "unknown", which is a bit
unfortunate because gzip helps reduce bandwidth requirements.

You should try removing the need for getSize, but if that fails, this
is so pnacl specific, can't you wrap the bitcode in with a header
including the size (this would be kept in the PNaCl tree. It would
just be another implementation of the DataStream).

Not quite clear to me that this is the same. Do we have a test of llvm-dis
w/ files that have a bitcode wrapper + misc data surrounding the true
bitcode contents?

We have tests that have a wrapper.

The old dropLeadingBytes would track how many bytes were skipped so that
later, reading address N would mean reading N + BytesSkipped. It also
subtracted the skipped bytes from BytesRead, and I haven't looked closely
enough to see if there is similar state being tracked anymore.

There isn't. The point is precisely that it is not needed.

@@ -218,18 +214,13 @@ public:
   void freeState();

   bool canSkipToPos(size_t pos) const {
- // pos can be skipped to if it is a valid address or one byte past the
end.
- return pos == 0 || BitStream->getBitcodeBytes().isValidAddress(
- static_cast<uint64_t>(pos - 1));
+ size_t Size = BitStream->getSize();
+ return pos < Size;
   }

Is the "pos can be skipped ... or one byte past the end" behavior still
needed, or was it some artifact of how isValidAddress worked?

Doesn't show up on check-all or a lto bootstrap.

I have looked at this a bit more, and now I have a more serious issue
with the current state of trunk. Check the small attached patch. It
just changes llc to use lazy reading of function bodies (not even lazy
streaming). With that patch, llc fails with verifier check failures or
assertions if the verifier is disabled.

So, codegen cannot handle not reading all bodies upfront. How can
PNaCl be getting any improvements from this? I understand that codegen
of a function at a time is something you might want to add a some
point, but lazy streaming was added in Feb 6 22:30:29 2012 and we
still don't have it.

It seems the best thing to do is delete this. It can always come back
if it has a clean implementation and is actually useful (even if only
at -O0).

Cheers,
Rafael

t.patch (484 Bytes)

> That is to remove some calls to getSize()? Is there any expectation that
> getPointer() will always a pointer within a contiguous region of memory?
> That is, getPointer() is non-virtual and always refers to the Data field,
> but the overriding implementation could still dynamically grow some
buffer
> and change Data to point to a new buffer (could that be "protected"
> instead?)?

I think it would fail. Clang does things like

* readRecord (... &Blob);
* Save a pointer to Blob.
* readRecord (...&Blob)

It should be possible to use a list of buffers and add a
getPointerImpl virtual if absolutely necessary, but see bellow.

Ah okay, I forgot to look at what Clang does.

> *) It looks like gzip encoded files will have content-length set to the
size
> of the *gzipped* body, not the size of the original bitcode:
>
content-length when using http compression - Stack Overflow
.
> So we'll need to be careful and treat that as "unknown", which is a bit
> unfortunate because gzip helps reduce bandwidth requirements.

You should try removing the need for getSize, but if that fails, this
is so pnacl specific, can't you wrap the bitcode in with a header
including the size (this would be kept in the PNaCl tree. It would
just be another implementation of the DataStream).

Sure, a wrapper is another option. Will have to discuss that with Derek and
the other PNaCl folks when they get back, though it might just be that
taking streaming out of upstream llvm and keeping it in our own reader is
fine.

> Not quite clear to me that this is the same. Do we have a test of
llvm-dis
> w/ files that have a bitcode wrapper + misc data surrounding the true
> bitcode contents?

We have tests that have a wrapper.

> The old dropLeadingBytes would track how many bytes were skipped so that
> later, reading address N would mean reading N + BytesSkipped. It also
> subtracted the skipped bytes from BytesRead, and I haven't looked closely
> enough to see if there is similar state being tracked anymore.

There isn't. The point is precisely that it is not needed.

Okay great, that's certainly much simpler then =)

> @@ -218,18 +214,13 @@ public:
> void freeState();
>
> bool canSkipToPos(size_t pos) const {
> - // pos can be skipped to if it is a valid address or one byte past
the
> end.
> - return pos == 0 || BitStream->getBitcodeBytes().isValidAddress(
> - static_cast<uint64_t>(pos - 1));
> + size_t Size = BitStream->getSize();
> + return pos < Size;
> }
>
> Is the "pos can be skipped ... or one byte past the end" behavior still
> needed, or was it some artifact of how isValidAddress worked?

Doesn't show up on check-all or a lto bootstrap.

I have looked at this a bit more, and now I have a more serious issue
with the current state of trunk. Check the small attached patch. It
just changes llc to use lazy reading of function bodies (not even lazy
streaming). With that patch, llc fails with verifier check failures or
assertions if the verifier is disabled.

So, codegen cannot handle not reading all bodies upfront. How can
PNaCl be getting any improvements from this? I understand that codegen
of a function at a time is something you might want to add a some
point, but lazy streaming was added in Feb 6 22:30:29 2012 and we
still don't have it.

Here's what I think happens (the PNaCl tree has some additional changes to
make it work).

When you switch to getLazyIRFileModule, file reading is suspended once a
function body block is encountered, so the function bodies are not
materialized yet. Something needs to materialize them. Otherwise, were you
seeing that the verifier couldn't find an entry block?

See the attached patch, which modifies the FPPassManager::runOnModule to
materialize functions before running passes on them, as
FunctionPassManager::run(Function &F) would have done. For PNaCl we had
been using a FunctionPassManager directly, instead of a
PassManager/FPPassManager to get a similar effect (see commented out code
in the attached patch).

In general PNaCl's version of llc has been careful about which ModulePasses
run during CodeGen. If ModulePasses touch the function bodies, then it
could materialize the function bodies too early for streaming to be useful.
So far, we only have a limited set of Module passes that we run outside of
a PassManager. We also don't allow blockaddresses in our subset of the IR,
which would have also caused some forward-referenced functions to be
materialized.

t2.patch (1.77 KB)