Not long ago we found ourselves in need of a robust eh_frame parser. All we wanted was the ability to parse the .eh_frame section emitted by LLVM's MCJIT. Given that, LLVM seems the natural place to implement such a parser.
A previous email thread about the issue showed some interest in the community. Folks seemed to agree that DebugInfo is a good place for the implementation.
Next came a submission by Pete Cooper (http://reviews.llvm.org/D15535). It covered the issue completely. However, it was reverted due to some Windows test failures and has not been resubmitted since. While adapting Pete's parser for our needs I found a couple of small issues in it, but I can't check whether they caused the original failures.
So the question is: what is the right way to move this forward? I can resubmit Pete's original change with my fixes and see whether it causes any failures. Or maybe there are alternative approaches we haven't thought about? Any opinions on this?
I would suggest first trying to reproduce the original issue on
Windows and fixing just that.
Improvements can then be discussed independently.
Unfortunately I can't find any record of the original test failure. Is there a way to search for it in the buildbot archives?
I’m very sorry. You’ve both pinged me more than once about this and I haven’t yet been able to find the cause of the issue.
Igor, the failures I saw on a number of bots were complaining about one of the CHECK lines in the test case (test/tools/llvm-objdump/eh_frame-arm64.test) I adapted in the given review.
The line in question was:
# CHECK: DW_CFA_def_cfa: reg31 +0
but the bots complained that there was no 'DW_CFA_def_cfa: reg31 +0' but there was a 'DW_CFA_def_cfa: reg0 +31'.
That led me to think there was an endian issue, with different bots reading either 0 or 31 in the different parts of this instruction.
That seemed likely given that the first couple of failing bots were big endian, but then an x86 win7 bot failed, so an endian issue is unlikely.
I’m going to take a look at the DWARF instruction parser again right now and see if there’s anything obviously going wrong.
One option may be to just not check the instructions in the test case for now, but commit the code again with all the other check lines in place. Then we can work on the instruction issue independently of everything else.
Note, I did take a look at the fixes you proposed. I need to write a test to ensure your handling of DW_EH_PE_omit is correct, but I think the changes are good. We can work on integrating those after landing the original patch (assuming it's eventually OK to do so).
I finally got my hands on a Windows machine and I think I found the issue. See http://reviews.llvm.org/D16509 for the fix. Thanks for the detailed explanation. It really helped to identify what was going wrong.
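The thread itself does not spell out the root cause, so the following is only an illustration of one well-known way two operands can end up swapped on some compilers regardless of endianness: C++ leaves the evaluation order of function arguments unspecified, so two reads that advance the same offset must not both appear in one argument list. This is a minimal sketch with hypothetical names (DefCfa, readULEB128, parseDefCfa), not the code from the review.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical holder for the two operands of DW_CFA_def_cfa.
struct DefCfa {
  uint64_t Reg;
  uint64_t Offset;
};

// Decode one ULEB128 value starting at Pos and advance Pos past it.
uint64_t readULEB128(const std::vector<uint8_t> &Buf, size_t &Pos) {
  uint64_t Value = 0;
  unsigned Shift = 0;
  while (Pos < Buf.size()) {
    uint8_t Byte = Buf[Pos++];
    if (Shift < 64)
      Value |= uint64_t(Byte & 0x7f) << Shift;
    Shift += 7;
    if ((Byte & 0x80) == 0)
      break;
  }
  return Value;
}

// Fragile form (do not use): both calls mutate Pos, and the compiler may
// evaluate either argument first, so Reg and Offset can come out swapped:
//   use(readULEB128(Buf, Pos), readULEB128(Buf, Pos));

// Robust form: sequence the reads with named locals.
DefCfa parseDefCfa(const std::vector<uint8_t> &Buf, size_t &Pos) {
  assert(Pos <= Buf.size());
  uint64_t Reg = readULEB128(Buf, Pos);    // first operand: register
  uint64_t Offset = readULEB128(Buf, Pos); // second operand: offset
  return DefCfa{Reg, Offset};
}
```

With the operand bytes 0x1f 0x00, the sequenced form yields register 31 and offset 0 on every compiler, which matches the CHECK line quoted earlier.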
There is some eh_frame section parsing going on in the linker which is
required for .eh_frame_hdr creation and probably useful for dead-frame
elimination in the future. It would be great if there was a single
parser used by all llvm tools.
There is some eh_frame section parsing going on in the linker which is
required for .eh_frame_hdr creation and probably useful for dead-frame
elimination in the future.
I think there are two linker parsers (a Mach-O one and an ELF/COFF one), plus RuntimeDyld and dwarfdump.
It would be great if there was a single
parser used by all llvm tools.
Absolutely. We'll get to that stage, I hope.
We'll need to work out if there's a way to abstract what we need from each client without slowing down or complicating any of them too much.
I was very curious so I went ahead and resubmitted Pete’s original change. So far no buildbot failures, looks promising.
+1.
A common parser is definitely a good thing.
Agreed. However, DWARFDebugFrame currently doesn't have any public interface. Would it be a good first step to extract some reasonable set of accessors and back them up with unit tests?
Maybe. For splitting, the only part we need is finding where each CIE and FDE is.
George wrote the code in lld that parses bits of .eh_frame and creates .eh_frame_hdr. I think all it needs to parse is the location of the program pointer for an FDE.
So our needs are fairly small, and it would be nice to have an API that doesn't parse everything upfront.
Cheers,
Rafael
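As a rough illustration of an API that only locates records instead of parsing everything upfront, here is a minimal C++ sketch. The names EhRecord and nextRecord are hypothetical and are not part of LLVM or lld. It assumes a little-endian section and the 32-bit length form only (the 0xffffffff extended-length escape is rejected rather than handled).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical description of one record inside .eh_frame.
struct EhRecord {
  size_t Offset; // start of the record, including its length field
  size_t Size;   // total size, including the 4-byte length field
  bool IsCIE;    // true for a CIE, false for an FDE
};

// Read a little-endian 32-bit value; the caller guarantees Off + 4 <= size.
static uint32_t readU32LE(const std::vector<uint8_t> &B, size_t Off) {
  return uint32_t(B[Off]) | uint32_t(B[Off + 1]) << 8 |
         uint32_t(B[Off + 2]) << 16 | uint32_t(B[Off + 3]) << 24;
}

// Locate the record that starts at Off without parsing its contents.
// On success fills Rec, advances Off past the record and returns true.
// Returns false at the zero-length terminator, at the end of the section,
// or on malformed or unsupported input; Off is left unchanged in that case.
bool nextRecord(const std::vector<uint8_t> &Sec, size_t &Off, EhRecord &Rec) {
  assert(Off <= Sec.size());
  if (Sec.size() - Off < 4)
    return false;
  uint32_t Len = readU32LE(Sec, Off);
  if (Len == 0 || Len == 0xffffffffu)
    return false; // terminator, or 64-bit length (not handled in this sketch)
  if (Len < 4 || Len > Sec.size() - Off - 4)
    return false; // too short to hold the ID field, or runs past the section
  // In .eh_frame the ID field is 0 for a CIE; for an FDE it is a non-zero
  // backwards offset to the owning CIE.
  uint32_t Id = readU32LE(Sec, Off + 4);
  Rec.Offset = Off;
  Rec.Size = size_t(Len) + 4;
  Rec.IsCIE = (Id == 0);
  Off += Rec.Size;
  return true;
}
```

A client that only splits the section can loop on nextRecord and never touch the CFA instructions, while a dumper can decode each record's body on demand.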
For creating .eh_frame_hdr I needed to parse the CIE to find the pointer encoding used for the address pointers in the FDE, and then use it to read the initial PC, which was the main goal. Even the CIE augmentation string is not fully parsed: once we have found the encoding there is nothing more to do there, so we just exit. All unnecessary data was simply skipped during parsing. So I agree with Rafael about the API.
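The early-exit CIE scan described above might look roughly like the following C++ sketch. It is not the lld code; Reader, encodedSize and findFdePointerEncoding are hypothetical names. It assumes a little-endian CIE with a 32-bit length, an 8-byte pointer for DW_EH_PE_absptr, and a 'z'-style augmentation string. A caller could fall back to DW_EH_PE_absptr when the function returns false.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Bounds-checked cursor over a byte buffer; Ok turns false on overrun.
struct Reader {
  const std::vector<uint8_t> &Buf;
  size_t Pos;
  bool Ok;
  explicit Reader(const std::vector<uint8_t> &B) : Buf(B), Pos(0), Ok(true) {}

  uint8_t u8() {
    if (Pos >= Buf.size()) { Ok = false; return 0; }
    return Buf[Pos++];
  }
  uint32_t u32le() {
    uint32_t V = 0;
    for (int I = 0; I < 4; ++I) V |= uint32_t(u8()) << (8 * I);
    return V;
  }
  // Decodes ULEB128; also usable to skip SLEB128, which ends the same way.
  uint64_t uleb() {
    uint64_t V = 0; unsigned Shift = 0; uint8_t B;
    do {
      B = u8();
      if (Shift < 64) V |= uint64_t(B & 0x7f) << Shift;
      Shift += 7;
    } while (Ok && (B & 0x80));
    return V;
  }
  void skip(size_t N) {
    assert(Pos <= Buf.size());
    if (N > Buf.size() - Pos) Ok = false; else Pos += N;
  }
};

// Size in bytes of a value with the given DW_EH_PE_* format (low nibble).
// Returns 0 for formats this sketch does not handle (uleb128, sleb128).
static size_t encodedSize(uint8_t Enc) {
  switch (Enc & 0x0f) {
  case 0x00: return 8;            // absptr, assuming a 64-bit target
  case 0x02: case 0x0a: return 2; // udata2 / sdata2
  case 0x03: case 0x0b: return 4; // udata4 / sdata4
  case 0x04: case 0x0c: return 8; // udata8 / sdata8
  default: return 0;
  }
}

// Scan a CIE only as far as the 'R' augmentation and return its encoding.
bool findFdePointerEncoding(const std::vector<uint8_t> &Cie, uint8_t &Enc) {
  Reader R(Cie);
  R.u32le();                        // length (32-bit form assumed)
  if (R.u32le() != 0) return false; // CIE id must be 0 in .eh_frame
  uint8_t Version = R.u8();
  std::string Aug;
  for (uint8_t C = R.u8(); R.Ok && C != 0; C = R.u8()) Aug.push_back(char(C));
  R.uleb();                         // code alignment factor
  R.uleb();                         // data alignment factor (SLEB128, skipped)
  if (Version == 1) R.u8(); else R.uleb(); // return address register
  if (!R.Ok || Aug.empty() || Aug[0] != 'z') return false;
  R.uleb();                         // augmentation data length
  for (size_t I = 1; I < Aug.size(); ++I) {
    switch (Aug[I]) {
    case 'R': Enc = R.u8(); return R.Ok; // found it: stop parsing here
    case 'L': R.u8(); break;             // LSDA encoding byte, skipped
    case 'P': {                          // personality: encoding + pointer
      size_t N = encodedSize(R.u8());
      if (N == 0) return false;
      R.skip(N);
      break;
    }
    default: return false;               // unknown augmentation character
    }
  }
  return false;
}
```

The initial instructions that follow the augmentation data are never read, which is the "skip everything unnecessary" behaviour described above.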