Possible bug in DFAPacketizer::ReadTable

Hi all,

I have found what I think it is a bug in DFAPacketizer::ReadTable.

When finding NextStateInTable to cache all transitions belonging to a state into CachedTable, ReadTable does not check bounds:

unsigned ThisState = DFAStateEntryTable[state]; |
unsigned NextStateInTable = DFAStateEntryTable[state+1];

which makes NextStateInTable get a random value when state == . Behaviour changes depending on gcc version / platform / …, but in some cases might lead to segmentation faults.

I have checked the problem happens in Hexagon tests (for example fadd.ll test) but does not break badly there (though CachedTable will get some unneded and random data rows).

Probably making tblgen add an end-of-table marker in DFAStateEntryTable is the easiest solution.

BR

Carlos

Hi Carlos,

Thanks for identifying the bug. I'll confirm and fix. Is there a bug report open for this?

-Anshu

Hi Anshu,

no, I did not fill a bug report. It is not so easy to make the code fail noticeably; during Hexagon CodeGen tests it happens silently and tests pass. I am working on another VLIW backend which uses DFAPacketizer and compiling llvm with gcc-4.4 makes it segfault, but with gcc-4.7 the bug gets hidden again (it still happens, but values after DFAStateEntryTable in memory are such that ReadTable does not break bad).

If you find more info will help you track it down just let me know :slight_smile:

BR

Carlos

Carlos,

I committed a fix in r169783. Thanks for catching this.

However, I could not reproduce an invalid read or a segfault even with fadd.ll. Is there a test case you can check in that reproduces this bug? Even if the segfault occurs intermittently, that's better than no test case at all.

Thanks
-Anshu

Hi Anshu,

I got a testbench which fails (and segfaults) consistently with an environment (gcc + os) conveniently preserved in a virtual machine. I will confirm that it is gone there and report.

Thanks for the fix :slight_smile:

Carlos

Hi again,

I can confirm r169783 fixes the problem. My testbench segfaulted in r169782 but works after your commit.

We can close the issue.

Thanks,

Carlos

Great! Can you please check in that test case or better still, a reduced version of that test.

Thanks
-Anshu

Hi Anshu,

the “test case” I referred to requires the compilation of our whole back-end. It segdaults when using gcc-4.4.3 under Ubuntu 10.04, with other combinations I have tested it still happens (before your patch) but is not noticeable unless using gdb. I have tried making valgrind catch it but no success… so I guess the only way to see it is using the debugger.

I remember seeing it happen (though not segfault) on Hexagon as well but now I seem unable to reproduce it :S

BR

Carlos

I see. I couldn't reproduce the invalid read on Hexagon either. I'll try to reverse engineer a test case. Not having test coverage for a potential segfault makes me extremely wary.

Thanks
-Anshu