Machine Level IR text-based serialization

Hi,

I'm Alex, and last summer I worked on the code coverage support
for PGO in LLVM and Clang at Apple.

I'm currently interning at Apple again, and I'm planning to work on a
project that was suggested to me by Bob Wilson.

I plan on developing a text-based, human-readable format that allows LLVM to
serialize the machine-level IR (the data structures like MachineFunction, etc.
from CodeGen).

The motivation for this project is that it will enable easier testing of the
CodeGen passes, and will allow developers to run tests that invoke just a
single CodeGen pass, or that start from a specific pass (llc's -start-after
option).

I would be interested in hearing from other people who might be working on a
similar project or have ideas about which direction this project should go. I
will be posting an initial proposal for the format in a couple of days.

Thanks,

Alex

From: "Alex L" <arphaman@gmail.com>
To: "LLVM Developers Mailing List" <llvmdev@cs.uiuc.edu>
Sent: Tuesday, April 21, 2015 12:12:40 PM
Subject: [LLVMdev] Machine Level IR text-based serialization

> Hi, I'm Alex, and last summer I was working on the code coverage
> support for pgo for LLVM and clang at Apple. I'm currently interning
> at Apple again, and I'm planning to work on a project that was
> suggested to me by Bob Wilson.
> I plan on developing a text-based, human readable format that allows
> LLVM to serialize the machine level IR (The data structures like
> MachineFunction, etc. from CodeGen). The motivation for this project
> is that it will enable easier testing of the CodeGen passes, and
> will allow developers to run tests that invoke just a single CodeGen
> pass, or that start from a specific pass (llc's -start-after
> option).

Great! This is very badly needed.

> I would be interested in hearing from other people who might be
> working on a similar project or have ideas in which direction this
> project should go. I will be posting an initial proposal for the
> format in a couple of days. Thanks, Alex

My one requirement is that it support references to the IR (we still need
IR-level alias analysis during CodeGen, and so the MachineMemOperand data
structures need to be available and point to the right things).
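
To make this requirement concrete, here is a toy sketch in Python (not LLVM
code). A memory operand is written out together with the name of the IR value
it refers to, and the reader resolves that name against the IR module's values
when parsing. The textual syntax, the class names, and the `ir_values` table
are all assumptions made for illustration; none of them are part of the
proposed format.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class IRValue:
    """Hypothetical stand-in for an IR-level value (not llvm::Value)."""
    name: str


@dataclass(frozen=True)
class MemOperand:
    """Toy analogue of MachineMemOperand: access kind, size, IR value."""
    kind: str        # "load" or "store"
    size: int        # access size in bytes
    value: IRValue   # the IR value this memory access refers to


def serialize(op: MemOperand) -> str:
    # Emit the IR value by name so the reference survives the round trip.
    return f"{op.kind} {op.size} %{op.value.name}"


def parse(text: str, ir_values: Dict[str, IRValue]) -> MemOperand:
    kind, size, ref = text.split()
    if not ref.startswith("%"):
        raise ValueError(f"expected an IR value reference, got {ref!r}")
    name = ref[1:]
    if name not in ir_values:
        raise KeyError(f"unknown IR value %{name}")
    # Resolve the name to the existing IR object instead of creating a copy,
    # so IR-level analyses see the same value after deserialization.
    return MemOperand(kind, int(size), ir_values[name])


# Hypothetical IR module contents: two pointer values named %p and %q.
ir_values = {"p": IRValue("p"), "q": IRValue("q")}
original = MemOperand("load", 4, ir_values["p"])
text = serialize(original)
restored = parse(text, ir_values)
assert restored.value is ir_values["p"]
print(text)
```

The point of the sketch is the lookup in `parse`: the machine-level text can
only be read back in the presence of the IR it refers to, which is why the
format has to carry (or accompany) the IR module.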

Thanks again,
Hal

This is much needed. Thank you for working on this.

+1, I’d love this to happen!

Hi Alex,

> I plan on developing a text-based, human readable format that allows LLVM to
> serialize the machine level IR (The data structures like
> MachineFunction, etc. from CodeGen).
>
> The motivation for this project is that it will enable easier testing
> of the CodeGen passes,
> and will allow developers to run tests that invoke just a single CodeGen pass,
> or that start from a specific pass (llc's -start-after option).
>
> I would be interested in hearing from other people who might be
> working on a similar project or have ideas in which direction this
> project should go. I will be posting an initial proposal for the
> format in a couple of days.

I have been working on a textual Machine IR for the past few months to
facilitate both testing and use of the internal Machine IR in external
tools related to code generation.

The solution we've employed isn't nearly feature complete and it's missing
a lot of data, including type data and globals/metadata from the IR, but it's
capable of performing the task you describe using -start-after, as well as
printing the Machine IR after a given pass with -stop-after.
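
As a rough illustration of that workflow, the Python sketch below only builds
the two llc command lines; it does not run them. The file names (`test.ll`,
`test.mir`, `test.s`) and the pass name are placeholders chosen for the
example, and the second command assumes llc can read the serialized machine IR
back in, which is exactly the capability discussed in this thread.

```python
from typing import List


def stop_after_cmd(ir_file: str, pass_name: str, out_file: str) -> List[str]:
    """Command that runs codegen through pass_name and writes the result."""
    return ["llc", f"-stop-after={pass_name}", ir_file, "-o", out_file]


def start_after_cmd(mir_file: str, pass_name: str, out_file: str) -> List[str]:
    """Command that resumes codegen with the passes following pass_name."""
    return ["llc", f"-start-after={pass_name}", mir_file, "-o", out_file]


# Placeholder inputs: dump the machine IR after one pass, then resume from it.
dump = stop_after_cmd("test.ll", "machine-scheduler", "test.mir")
resume = start_after_cmd("test.mir", "machine-scheduler", "test.s")
print(" ".join(dump))
print(" ".join(resume))
```

A test for a single pass would then check the output of the second command
against an expected result, without depending on everything that ran earlier
in the pipeline.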

Unfortunately I can't share any code as this is (currently) a closed
project, but I'd be very interested in hearing about/discussing the approach
to this problem and will be looking forward to hearing more about this in
the future!

That is fantastic!

-Krzysztof

That's great that you've got it working! I would be interested in hearing
more details about the format and the approach that you chose.

Btw, my proposal is now up:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-April/084932.html
It's quite high level right now, but I would be interested in hearing your
thoughts on the proposed format.

Cheers,
Alex.