[PATCH / PROPOSAL] bitcode encoding that is ~15% smaller for large bitcode files...

Hi all,

I’ve been looking into how to make llvm bitcode files smaller. There is one simple change that appears to shrink linked bitcode files by about 15%. See this spreadsheet for some rough data:

https://docs.google.com/spreadsheet/ccc?key=0AjRrJHQc4_bddEtJdjdIek5fMDdIdFFIZldZXzdWa0E

The change is in how operand ids are encoded in bitcode files. Rather than use an “absolute number” given by the ValueEnumerator, which counts up within a function, we can have the id be relative to the current instruction.

I.e., instead of having:

… = icmp eq i32 n-1, n-2
br i1 …, label %bb1, label %bb2

you have

… = icmp eq i32 1, 2
br i1 1, label %bb1, label %bb2

Thus the ids remain relatively small and can be encoded in fewer bits. This counters the fact that the ValueEnumerator starts assigning ids within functions at some large N (where N is the number of module-level values?). It also makes repeated byte sequences more likely, since many instructions in a function may now have similar operands.
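Concretely, the transformation is just a subtraction on each side. A minimal sketch (the helper names here are made up; the patch itself does this inside PushValueAndType on the writer side and getValue on the reader side):

// Writer side: emit how far back the operand was defined, relative to the
// value ID that the instruction currently being written will receive.
unsigned encodeOperand(unsigned InstID, unsigned ValID) {
  return InstID - ValID;  // 1 = previous instruction, 2 = the one before, ...
}

// Reader side: recover the absolute value ID before looking it up.
// The mapping is its own inverse, so decoding is the same subtraction.
unsigned decodeOperand(unsigned InstID, unsigned RelID) {
  return InstID - RelID;
}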

The format of the “.ll” files from llvm-dis is not changed, just the binary format (run llvm-bcanalyzer -dump to see the difference).

Caveats:

  • Forward references will create negative-valued ids (which end up being written out as large 32-bit integers, as far as I could tell). The common case for this is PHI nodes, but even so, in my larger tests fewer bits overall are used for INST_PHI.

  • Doesn’t help with constant operands. Their ids will now constantly change…

  • To retain backward compatibility with old bitcode files, I ended up using a new bitc value, “bitc::FUNCTION_BLOCK_REL_ID”, in addition to the existing “bitc::FUNCTION_BLOCK_ID”.

Are there known problems with this scheme?
Are there other ideas that have been floated around for reducing the size of bitcode files?
In any case, the patch is attached if there is interest… If you want to try it out, you can toggle between the old and new encodings with llvm-as: pass the flag “-enable-old-style-functions” for the old encoding, or no flag for the new one.

- Jan

relative_ids.patch (37.7 KB)

Hi Jan,

I've been looking into how to make llvm bitcode files smaller. There is one
simple change that appears to shrink linked bitcode files by about 15%. See
this spreadsheet for some rough data:

https://docs.google.com/spreadsheet/ccc?key=0AjRrJHQc4_bddEtJdjdIek5fMDdIdFFIZldZXzdWa0E

the improvement is wonderful!

...

In any case, the patch is attached if there is interest... If you want to try
out the patch, you can toggle between the new and old encoding by using the flag
"-enable-old-style-functions" vs no flags, with llvm-as.

I don't know anything much about the bitcode representation, but anyway:

diff --git a/include/llvm/Bitcode/BitstreamReader.h b/include/llvm/Bitcode/BitstreamReader.h
index 3daef78..840f57e 100644
--- a/include/llvm/Bitcode/BitstreamReader.h
+++ b/include/llvm/Bitcode/BitstreamReader.h
@@ -409,7 +409,7 @@ public:
   }

   /// EnterSubBlock - Having read the ENTER_SUBBLOCK abbrevid, enter
- /// the block, and return true if the block is valid.
+ /// the block, and return true if the block has an error.
   bool EnterSubBlock(unsigned BlockID, unsigned *NumWordsP = 0) {
     // Save the current block's state on BlockScope.
     BlockScope.push_back(Block(CurCodeSize));
diff --git a/include/llvm/Bitcode/LLVMBitCodes.h b/include/llvm/Bitcode/LLVMBitCodes.h
index c1dc190..6c81739 100644
--- a/include/llvm/Bitcode/LLVMBitCodes.h
+++ b/include/llvm/Bitcode/LLVMBitCodes.h
@@ -33,9 +33,9 @@ namespace bitc {
     UNUSED_ID1,

     CONSTANTS_BLOCK_ID,
- FUNCTION_BLOCK_ID,
+ FUNCTION_BLOCK_ID, // Deprecated in favor of FUNCTION_BLOCK_ID_REL

FUNCTION_BLOCK_ID_REL -> FUNCTION_BLOCK_REL_ID

Is there any point in having this on a per-function basis? Why not have it be
per-module?

- UNUSED_ID2,
+ FUNCTION_BLOCK_REL_ID,

     VALUE_SYMTAB_BLOCK_ID,
     METADATA_BLOCK_ID,
@@ -257,8 +257,9 @@ namespace bitc {
     SYNCHSCOPE_CROSSTHREAD = 1
   };

- // The function body block (FUNCTION_BLOCK_ID) describes function bodies. It
- // can contain a constant block (CONSTANTS_BLOCK_ID).
+ // The function body block (FUNCTION_BLOCK_ID and FUNCTION_BLOCK_REL_ID)
+ // describes function bodies. It can contain a constant block
+ // (CONSTANTS_BLOCK_ID).
   enum FunctionCodes {
     FUNC_CODE_DECLAREBLOCKS = 1, // DECLAREBLOCKS: [n]

diff --git a/lib/Bitcode/Reader/BitcodeReader.cpp b/lib/Bitcode/Reader/BitcodeReader.cpp
index b69a362..5fd3dfd 100644
--- a/lib/Bitcode/Reader/BitcodeReader.cpp
+++ b/lib/Bitcode/Reader/BitcodeReader.cpp

...

@@ -1495,6 +1497,12 @@ bool BitcodeReader::ParseModule(bool Resume) {
           SeenFirstFunctionBody = true;
         }

+ // Remember the encoding style for the function.
+ if (FunctionsWithBodies.empty())
+ return Error("Insufficient function protos");

Why this new error? Or would it have occurred anyway, later? Maybe it
should be an assertion, if it should be impossible. I think "prototypes"
would be better than "protos".

+ FuncUseRelativeIDs[FunctionsWithBodies.back()] =
+ (SubBlockID == bitc::FUNCTION_BLOCK_REL_ID);

If the use of relative ids were on a per-module basis then this wouldn't be
needed.

+
         if (RememberAndSkipFunctionBody())
           return true;
         // For streaming bitcode, suspend parsing when we reach the function
@@ -1901,8 +1909,15 @@ bool BitcodeReader::ParseMetadataAttachment() {

/// ParseFunctionBody - Lazily parse the specified function body block.
bool BitcodeReader::ParseFunctionBody(Function *F) {
- if (Stream.EnterSubBlock(bitc::FUNCTION_BLOCK_ID))
- return Error("Malformed block record");
+ if (FuncUseRelativeIDs[F]) {
+ if (Stream.EnterSubBlock(bitc::FUNCTION_BLOCK_REL_ID))
+ return Error("Malformed block record");
+ UseRelativeIDs = true;
+ } else {
+ if (Stream.EnterSubBlock(bitc::FUNCTION_BLOCK_ID))
+ return Error("Malformed block record");
+ UseRelativeIDs = false;
+ }

Likewise.

   InstructionList.clear();
   unsigned ModuleValueListSize = ValueList.size();

...

@@ -2260,8 +2275,10 @@ bool BitcodeReader::ParseFunctionBody(Function *F) {
       }
       else {
         BasicBlock *FalseDest = getBasicBlock(Record[1]);
- Value *Cond = getFnValueByID(Record[2], Type::getInt1Ty(Context));
- if (FalseDest == 0 || Cond == 0)
+ Value *Cond;
+ if (getValueConst(Record, 2,
+ NextValueNo, Type::getInt1Ty(Context), Cond) ||
+ FalseDest == 0 || Cond == 0)

This seems rather ugly and complicated... Can't your new method just return
"Cond", and return null in case of an error? I'm not sure the name is the best
either; it doesn't suggest to me at all that it doesn't increment the slot
number, and it sounds like it's for handling constants.

           return Error("Invalid BR record");
         I = BranchInst::Create(TrueDest, FalseDest, Cond);
         InstructionList.push_back(I);
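For illustration, the alternative Duncan suggests above could look roughly like this (a sketch only; getValueNoAdvance is a made-up name, reusing the UseRelativeIDs flag and the getFnValueByID helper that appear elsewhere in the patch):

  // Hypothetical reader helper: return the operand directly and use a null
  // result to signal failure, so call sites stay a single expression.
  Value *getValueNoAdvance(SmallVector<uint64_t, 64> &Record, unsigned Slot,
                           unsigned InstNum, Type *Ty) {
    if (Slot == Record.size()) return 0;  // out-of-range slot: error
    unsigned ValNo = (unsigned)Record[Slot];
    if (UseRelativeIDs)
      ValNo = InstNum - ValNo;            // undo the relative encoding
    return getFnValueByID(ValNo, Ty);     // already returns null for a bad ID
  }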
@@ -2276,9 +2293,10 @@ bool BitcodeReader::ParseFunctionBody(Function *F) {
         Type *OpTy = getTypeByID(Record[1]);
         unsigned ValueBitWidth = cast<IntegerType>(OpTy)->getBitWidth();

- Value *Cond = getFnValueByID(Record[2], OpTy);
+ Value *Cond;
         BasicBlock *Default = getBasicBlock(Record[3]);
- if (OpTy == 0 || Cond == 0 || Default == 0)
+ if (getValueConst(Record, 2, NextValueNo, OpTy, Cond) ||
+ OpTy == 0 || Cond == 0 || Default == 0)
           return Error("Invalid SWITCH record");

Likewise (more instances follow).

...

@@ -2444,7 +2467,8 @@ bool BitcodeReader::ParseFunctionBody(Function *F) {
       InstructionList.push_back(PN);

       for (unsigned i = 0, e = Record.size()-1; i != e; i += 2) {
- Value *V = getFnValueByID(Record[1+i], Ty);
+ Value *V = 0;
+ getValueConst(Record, 1+i, NextValueNo, Ty, V);

You didn't check the return value.

         BasicBlock *BB = getBasicBlock(Record[2+i]);
         if (!V || !BB) return Error("Invalid PHI record");
         PN->addIncoming(V, BB);

...

diff --git a/lib/Bitcode/Reader/BitcodeReader.h b/lib/Bitcode/Reader/BitcodeReader.h
index e7c4e94..249d723 100644
--- a/lib/Bitcode/Reader/BitcodeReader.h
+++ b/lib/Bitcode/Reader/BitcodeReader.h

...

@@ -260,14 +274,30 @@ private:
     ResVal = getFnValueByID(ValNo, getTypeByID(TypeNo));
     return ResVal == 0;
   }
+ /// getValue - Read a value out of the specified record from slot 'Slot'.
+ /// Increment Slot past the number of slots used in the record.
+ /// Return true on failure.
   bool getValue(SmallVector<uint64_t, 64> &Record, unsigned &Slot,
- Type *Ty, Value *&ResVal) {
+ unsigned InstNum, Type *Ty, Value *&ResVal) {
     if (Slot == Record.size()) return true;
     unsigned ValNo = (unsigned)Record[Slot++];
+ // Adjust the ValNo, if it was encoded relative to the InstNum.
+ if (UseRelativeIDs)
+ ValNo = InstNum - ValNo;
+ ResVal = getFnValueByID(ValNo, Ty);
+ return ResVal == 0;
+ }
+ /// getValueConst -- Like getValue, but does not increment the Slot number.
+ bool getValueConst(SmallVector<uint64_t, 64> &Record, unsigned Slot,
+ unsigned InstNum, Type *Ty, Value *&ResVal) {
+ if (Slot == Record.size()) return true;
+ unsigned ValNo = (unsigned)Record[Slot];
+ // Adjust the ValNo, if it was encoded relative to the InstNum.
+ if (UseRelativeIDs)
+ ValNo = InstNum - ValNo;
     ResVal = getFnValueByID(ValNo, Ty);
     return ResVal == 0;
   }

How about having just one method, e.g. getValue, and simply decrementing the
slot number by hand in those places that need it? If you really want two,
implement getValue in terms of getValueConst, for example by calling it and
then incrementing the slot number.
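For example, the second option could be a thin wrapper over getValueConst, keeping the slot-advancing behaviour of the current getValue (a sketch, not part of the patch):

  bool getValue(SmallVector<uint64_t, 64> &Record, unsigned &Slot,
                unsigned InstNum, Type *Ty, Value *&ResVal) {
    if (Slot == Record.size()) return true;  // nothing left to read
    // Read without advancing, then consume the slot we just read.
    return getValueConst(Record, Slot++, InstNum, Ty, ResVal);
  }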

-

   bool ParseModule(bool Resume);
   bool ParseAttributeBlock();
diff --git a/lib/Bitcode/Writer/BitcodeWriter.cpp b/lib/Bitcode/Writer/BitcodeWriter.cpp
index b3f1bb1..e3ebe38 100644
--- a/lib/Bitcode/Writer/BitcodeWriter.cpp
+++ b/lib/Bitcode/Writer/BitcodeWriter.cpp

...

@@ -55,7 +62,7 @@ enum {
   CONSTANTS_CE_CAST_Abbrev,
   CONSTANTS_NULL_Abbrev,

- // FUNCTION_BLOCK abbrev id's.
+ // FUNCTION_BLOCK_REL abbrev id's.

Should this comment have been changed?

   FUNCTION_INST_LOAD_ABBREV = bitc::FIRST_APPLICATION_ABBREV,
   FUNCTION_INST_BINOP_ABBREV,
   FUNCTION_INST_BINOP_FLAGS_ABBREV,

...

@@ -1025,12 +1037,15 @@ static void WriteModuleConstants(const ValueEnumerator &VE,
///
/// This function adds V's value ID to Vals. If the value ID is higher than the
/// instruction ID, then it is a forward reference, and it also includes the
-/// type ID.
+/// type ID. The value ID that is writtne is encoded as relative to the InstID.

writtne -> written
is encoded as relative to -> is encoded relative to

static bool PushValueAndType(const Value *V, unsigned InstID,
                              SmallVector<unsigned, 64> &Vals,
                              ValueEnumerator &VE) {
   unsigned ValID = VE.getValueID(V);
- Vals.push_back(ValID);
+ if (EnableOldStyleFunctions)
+ Vals.push_back(ValID);
+ else
+ Vals.push_back(InstID - ValID);
   if (ValID >= InstID) {
     Vals.push_back(VE.getTypeID(V->getType()));
     return true;

...

@@ -1164,7 +1191,12 @@ static void WriteInstruction(const Instruction &I, unsigned InstID,
       Vals64.push_back(SwitchRecordHeader);

       Vals64.push_back(VE.getTypeID(SI.getCondition()->getType()));
- Vals64.push_back(VE.getValueID(SI.getCondition()));
+ // ValueID for non-BB / literal operands are relative to the InstID.
+ // (e.g., the condition value).
+ if (EnableOldStyleFunctions)
+ Vals64.push_back(VE.getValueID(SI.getCondition()));
+ else
+ Vals64.push_back(InstID - VE.getValueID(SI.getCondition()));

Shouldn't you use a "push" method like PushValue for this too?

       Vals64.push_back(VE.getValueID(SI.getDefaultDest()));
       Vals64.push_back(SI.getNumCases());
       for (SwitchInst::CaseIt i = SI.case_begin(), e = SI.case_end();
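The PushValue helper Duncan refers to presumably looks something like the sketch below (based on PushValueAndType above; the switch path would also need a 64-bit flavour, since it builds Vals64):

// Push a value ID without a type, using the same relative encoding as
// PushValueAndType (and the same EnableOldStyleFunctions escape hatch).
static void PushValue(const Value *V, unsigned InstID,
                      SmallVector<unsigned, 64> &Vals,
                      ValueEnumerator &VE) {
  unsigned ValID = VE.getValueID(V);
  Vals.push_back(EnableOldStyleFunctions ? ValID : InstID - ValID);
}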

...

@@ -1358,8 +1392,13 @@ static void WriteInstruction(const Instruction &I, unsigned InstID,
     PushValueAndType(CI.getCalledValue(), InstID, Vals, VE); // Callee

     // Emit value #'s for the fixed parameters.
- for (unsigned i = 0, e = FTy->getNumParams(); i != e; ++i)
- Vals.push_back(VE.getValueID(CI.getArgOperand(i))); // fixed param.
+ for (unsigned i = 0, e = FTy->getNumParams(); i != e; ++i) {
+ if (FTy->getParamType(i)->isLabelTy()) {

Is this actually possible?

+ Vals.push_back(VE.getValueID(CI.getArgOperand(i)));
+ } else {
+ PushValue(CI.getArgOperand(i), InstID, Vals, VE); // fixed param.
+ }
+ }

     // Emit type/value pairs for varargs params.
     if (FTy->isVarArg()) {

...

@@ -1514,8 +1553,8 @@ static void WriteFunction(const Function &F, ValueEnumerator &VE,
// Emit blockinfo, which defines the standard abbreviations etc.
static void WriteBlockInfo(const ValueEnumerator &VE, BitstreamWriter &Stream) {
   // We only want to emit block info records for blocks that have multiple
- // instances: CONSTANTS_BLOCK, FUNCTION_BLOCK and VALUE_SYMTAB_BLOCK. Other
- // blocks can defined their abbrevs inline.
+ // instances: CONSTANTS_BLOCK, FUNCTION_BLOCK_REL and VALUE_SYMTAB_BLOCK.
+ // Other blocks can defined their abbrevs inline.

Should this comment really have changed?

   Stream.EnterBlockInfoBlock(2);

   { // 8-bit fixed-width VST_ENTRY/VST_BBENTRY strings.
@@ -1603,68 +1642,68 @@ static void WriteBlockInfo(const ValueEnumerator &VE, BitstreamWriter &Stream) {

   // FIXME: This should only use space for first class types!

- { // INST_LOAD abbrev for FUNCTION_BLOCK.
+ { // INST_LOAD abbrev for FUNCTION_BLOCK_REL.

Likewise (occurs many more times).

Ciao, Duncan.

Whenever anyone proposes a custom compression scheme for a data format, the first question that should always be asked is how it compares to a generic off-the-shelf compression algorithm.

For comparison, if we just use the DEFLATE algorithm (via gzip) on bitcode files, we seem to see a saving of about 35% in a quick-and-dirty test. If you apply your compression scheme (effectively a form of delta encoding, applied only to a subset of the values), does this reduce the efficiency of a general-purpose solution? Does it make more sense than just applying DEFLATE to the bitcode when it's written to disk? The other advantage of separating the compression from the encoding, of course, is that it's easier to parallelise, as a fairly coarse-grained dataflow model can be used when streaming to and from compressed bitcode.

David

Can you use an SLEB-like representation?

It's probably not going to be backwards compatible, but if there isn't
an SLEB/ULEB representation in bitcode, I'd say it's a good
improvement. :wink:

If you look at his spreadsheet [1] in the OP, he did test this case.
He even has a column showing the improvement difference uncompressed
vs. compressed. Even after compression with gzip, there is still a
noticeable difference.

--Sean Silva

[1] https://docs.google.com/spreadsheet/ccc?key=0AjRrJHQc4_bddEtJdjdIek5fMDdIdFFIZldZXzdWa0E

Thanks for the review Duncan!

Yes, on second thought, it would be simpler (less bookkeeping)
if this were toggled at the module level (but I haven’t looked into
how that would be done). I had done it at the function level
to begin with because it affected only the function bodies,
so it would be more obvious what changed. Does anyone have a
stronger feeling (or a reason!) about whether it would be better to do
this on a per-function or a per-module basis?

I’ll look into how to do it at the module level and send out a new
patch later, with your comments addressed.

> - Forward references will create negative-valued ids (which end up being
> written out as large 32-bit integers, as far as I could tell).

Can you use an SLEB-like representation?

It's probably not going to be backwards compatible, but if there isn't
an SLEB/ULEB representation in bitcode, I'd say it's a good
improvement. :wink:

I think most of the instruction operands are encoded as VBR(N)
for some number of bits N, and it seems to me that VBR(8) is like ULEB?
I think that the default N is 6 bits, but I haven't tried tweaking these
parameters.
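For context, VBR(N) emits a value in N-bit chunks where the top bit of each chunk says whether another chunk follows, which is why VBR(8) resembles ULEB128. A rough sketch (the helper name is invented; the real BitstreamWriter writes the chunks straight into the bit stream):

#include <cstdint>
#include <vector>

// Split Val into N-bit chunks: (N-1) payload bits plus a continuation bit.
std::vector<uint32_t> encodeVBR(uint64_t Val, unsigned N) {
  const uint64_t PayloadMask = (1ULL << (N - 1)) - 1;
  const uint64_t ContinueBit = 1ULL << (N - 1);
  std::vector<uint32_t> Chunks;
  do {
    uint64_t Chunk = Val & PayloadMask;
    Val >>= (N - 1);
    Chunks.push_back(uint32_t(Chunk | (Val ? ContinueBit : 0)));
  } while (Val != 0);
  return Chunks;
}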

It has a similar purpose, but it's slightly different. SLEB (the signed
version) wouldn't use a whole word for -1, since the sign extension
happens after the number is encoded into 7-bit groups, so any negative
number that can be represented in 7 bits would still be stored in a
single byte.

> I think most of the instruction operands are encoded as VBR(N)
> for some number of bits N, and it seems to me that VBR(8) is like ULEB?
> I think that the default N is 6 bits, but I haven't tried tweaking these
> parameters.

It has a similar purpose, but it's slightly different. SLEB (the signed
version) wouldn't use a whole word for -1, since the sign extension
happens after the number is encoded into 7-bit groups, so any negative
number that can be represented in 7 bits would still be stored in a
single byte.

Ah I see. It looks like LLVM's bitcode writer does have a signed version
of the VBRs, which is used for integer constants. I tried using that for
phis and added the data to the *second sheet* in the original
spreadsheet. See the tab "With SLEB-like Signed VBR for phi",
at the bottom.
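For reference, that signed flavour rotates the sign into the low bit before the usual unsigned VBR step, so small negative offsets stay small. A sketch along the lines of the EmitSignedInt64 helper that shows up in the updated patch below:

#include <cstdint>

// Move the sign into bit 0 ("sign rotation") before unsigned VBR encoding.
uint64_t signRotate(uint64_t V) {
  if ((int64_t)V >= 0)
    return V << 1;        // non-negative: just shift left
  return (-V << 1) | 1;   // negative: store the magnitude, tag with the low bit
}

// Inverse used when reading (the real reader also special-cases INT64_MIN,
// which comes out as the "negative zero" pattern).
int64_t unrotateSign(uint64_t V) {
  return (V & 1) ? -int64_t(V >> 1) : int64_t(V >> 1);
}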

It's certainly a bit better (1.3% uncompressed and 0.7% with gzip too),
but it's not always better, and I think it can still be done in a backward
compatible way.

I added numbers for what would happen if we didn't touch phis at all, too.

Finally, I checked the average size of individual bitcode files from a build of
the llvm-test-suite (vs. the llvm-linked bitcode files), to make sure this isn't
regressing the sizes of small files. The average size of such small bitcode
object files is ~3.5% better. I didn't check compression for that case,
though...

Very cool. This is the “obviously right” way to do it; I’m glad you’ve implemented it. It looks like Duncan gave you a lot of feedback already; I’m looking forward to seeing the updated patch. Thanks for tackling this, Jan!

-Chris

Thanks for taking a look! I wasn't sure of a better way to toggle this
behavior at the module level vs. the function level. The options I toyed
with were:

  (a) bump MODULE_CODE_VERSION from 0 to 1
  (b) add a new optional module code to identify the change
  (c) stick with the approach in the first patch, which used a new function
block ID value, then check that all the function blocks uniformly have the
newer ID instead of a mix of the old and new.

For option (a), there hasn't been a version bump before (as far as I can tell),
and I wasn't sure when that is appropriate.
For option (b), it seems like it's really indicating a version change and
not doing anything else, so it could have been done more directly with (a).
For option (c), it ends up creating a new block ID that overlaps with the
existing function one, and that would be the first one to do that?

Thanks again Duncan for the review, and Renato for the suggestion of
looking into SLEB. Attached is an updated patch which toggles the change at
the module level (done with option (a)) and uses signed VBRs for phis.

- Jan

relative_ids.patch4 (35.6 KB)


This is awesome! I tried it by creating a .ll of all of clang. Some sizes:

* clang.ll 490M
* clang.ll.gz 55M
* clang.ll.bz2 37M
* clang.ll.xz 27M

with the current bitcode:

* clang.bc 82M
* clang.bc.xz 27M

with the new bitcode:

* clang.bc 70M
* clang.bc.xz 26M

It looks like a no-op run of opt also got a bit faster, from 10.451s to 10.176s.

+static void EmitSignedInt64(SmallVectorImpl<uint64_t> &Vals, uint64_t V) {

Please start function names with a lower case letter.

+EnableOldStyleFunctions("enable-old-style-functions",

Maybe something like use-abs-operands is a better name now that this
is a global setting, no?


Cheers,
Rafael

+static void EmitSignedInt64(SmallVectorImpl<uint64_t> &Vals, uint64_t V) {

Please start function names with a lower case letter.

Done -- changed this function and most of the "pushValue" functions. I
left PushValueAndType alone since that is an existing function and it
seemed like that would introduce other unrelated changes.

+EnableOldStyleFunctions("enable-old-style-functions",

Maybe something like use-abs-operands is a better name now that this
is a global setting, no?

Yes, that would be more descriptive. Changed in the new patch.

relative_ids.patch5 (37.7 KB)

Another option for dealing with negatives is to use zig-zag encoding:
https://developers.google.com/protocol-buffers/docs/encoding
This "penalizes" positive ints with an extra bit and requires a bit more
surgery; YMMV.
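For completeness, zig-zag is just a bit-twiddle applied to each value before the unsigned varint/VBR step. A self-contained sketch (assuming arithmetic right shift on signed values, which mainstream compilers provide):

#include <cstdint>

// Interleave signed values as 0, -1, 1, -2, 2, ... -> 0, 1, 2, 3, 4, ...
// so small-magnitude negatives stay small in an unsigned VBR/varint.
uint64_t zigzagEncode(int64_t N) {
  return (uint64_t(N) << 1) ^ uint64_t(N >> 63);  // sign spread by arithmetic shift
}

int64_t zigzagDecode(uint64_t Z) {
  return int64_t(Z >> 1) ^ -int64_t(Z & 1);
}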

This looks good to me.

Just one question: you found that forward references are only common
with phi operands, so it is not profitable to use a signed
representation for other operands, right?

Cheers,
Rafael

Yes, I had about 133K hits for INST_PHI with a negative value, out of 136K hits of any “INST_.*” with a negative-valued operand.

Overall there were 474K INST_PHI and 12 million “INST_.*” in my tests.

- Jan


Cool!

Thanks again for working on this!


Cheers,
Rafael

Thanks for the review Rafael!

Chris, did you want to take a look at the patch too?

Thanks,

- Jan

Your most recent patch looks good to me, with a minor change:

+++ b/lib/Bitcode/Writer/BitcodeWriter.cpp
@@ -33,6 +33,13 @@
using namespace llvm;

static cl::opt<bool>
+UseAbsoluteOperandIDs("use-abs-operands",
+                      cl::desc("Assign instruction operand IDs with the older "
+                               "absolute encoding instead of using "
+                               "relative IDs."),
+                      cl::init(false), cl::Hidden);

Please remove this; the new writer should just always generate the relative encoding. Please feel free to land the patch with this change.

Also, this is a significant enough thing that it would be good to add something to the bitcode docs (in the IR section, http://llvm.org/docs/BitCodeFormat.html#llvm-ir-encoding) that describes the old and new formats.

Thanks again for tackling this!

-Chris


Thanks! Removed the flag and committed as 165739. Will update the docs.

- Jan