target-specific assembly printing

I’m trying to retarget LLVM to a TI processor. We plan to use our existing assembler, which is not gas-based. I’m finding a lot of inconsistency in the retargeting hooks available under the AsmPrinter interface. I’ll cite four examples:

  1. The TI assembler’s global directive is called ‘.global’ rather than ‘.globl’. This one’s easy: there is a getGlobalDirective() interface in MCAsmInfo.

  2. The TI assembler’s common directive is called ‘.common’ rather than ‘.comm’. Oops, that’s hardwired into MCAsmStreamer.

  3. The TI assembler’s common directive does not imply global linkage; a separate .global is required. I was able to hack that by overriding AsmPrinter::EmitGlobalVariable().

  4. The TI assembler’s align directive is called ‘.align’ rather than ‘.p2align’ and its argument is absolute bytes, not a power of two. Oops, that’s also hardwired in.
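
To make point 4 concrete: the two align directives differ only in the directive name and in whether the operand is a byte count or its base-2 logarithm. The sketch below is standalone C++, not LLVM code. AlignDialect, log2OfPowerOfTwo and formatAlign are hypothetical names made up for illustration; they are not existing hooks.

```cpp
#include <cassert>
#include <string>

// Hypothetical description of how a dialect spells alignment.
// None of these names exist in LLVM.
struct AlignDialect {
  std::string Directive; // e.g. ".p2align" or ".align"
  bool ArgIsLog2;        // true: operand is log2(bytes); false: operand is bytes
};

// Returns log2 of a non-zero power of two.
inline unsigned log2OfPowerOfTwo(unsigned Bytes) {
  assert(Bytes != 0 && (Bytes & (Bytes - 1)) == 0 &&
         "alignment must be a power of two");
  unsigned Log = 0;
  while (Bytes > 1) {
    Bytes >>= 1;
    ++Log;
  }
  return Log;
}

// Formats one alignment directive for the given dialect.
inline std::string formatAlign(const AlignDialect &D, unsigned Bytes) {
  unsigned Arg = D.ArgIsLog2 ? log2OfPowerOfTwo(Bytes) : Bytes;
  return "\t" + D.Directive + "\t" + std::to_string(Arg);
}
```

With these assumptions, an 8-byte alignment would print as ‘.p2align 3’ for the gas-style dialect and ‘.align 8’ for a byte-count dialect.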

Those are a few of the things I’ve encountered so far; I’m sure there are dozens more. I’m wondering how to handle these types of things in general. I could:

A) Keep extending MCAsmInfo for anything I need. This seems to agree with the intent of that interface, but I’m concerned that it could take quite a few additions. I’m curious why it’s implemented as a bunch of scalar properties with getters rather than a general API.

B) Override much of the default AsmPrinter in the target version. It seems like this could end up duplicating much of what is in the base class. Also, much of the implementation is (currently) private, so the target version cannot access it. Of course, that could be changed to ‘protected’.

C) Create a new implementation of the MCStreamer interface, similar to MCAsmStreamer, that would support the TI assembler. This seems overkill, as the formats are not that different. It also seems like it defeats the purpose of the MCAsmInfo interface.
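
For comparison, here is roughly what option A might look like for points 1 to 3, written as standalone C++ rather than against the real classes. Only the global-directive property corresponds to something MCAsmInfo already has; the CommonDirective and CommonImpliesGlobal fields, the DialectInfo struct, and the ‘name,size,align’ operand layout are assumptions for illustration.

```cpp
#include <string>

// Stand-in for a bag of scalar dialect properties (option A).
// CommonDirective and CommonImpliesGlobal are hypothetical additions.
struct DialectInfo {
  std::string GlobalDirective = "\t.globl\t";
  std::string CommonDirective = "\t.comm\t";
  bool CommonImpliesGlobal = true;
};

// Hypothetical settings for an assembler that spells the directives
// ".global"/".common" and needs an explicit ".global" for common symbols.
inline DialectInfo makeTIDialect() {
  DialectInfo D;
  D.GlobalDirective = "\t.global\t";
  D.CommonDirective = "\t.common\t";
  D.CommonImpliesGlobal = false;
  return D;
}

// Emits the text for one common symbol. The operand layout is assumed.
inline std::string emitCommonSymbol(const DialectInfo &D,
                                    const std::string &Name, unsigned Size,
                                    unsigned AlignBytes) {
  std::string Out;
  if (!D.CommonImpliesGlobal)
    Out += D.GlobalDirective + Name + "\n"; // separate linkage directive
  Out += D.CommonDirective + Name + "," + std::to_string(Size) + "," +
         std::to_string(AlignBytes) + "\n";
  return Out;
}
```

The trade-off is visible even at this size: each dialect difference costs one more scalar field plus one more branch in the shared printer.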

There are no plans to upstream any of this in the near future, but it could happen someday, so I’m looking for the most agreeable way forward. Thoughts?


I have experienced the exact same problem. The class ‘MCAsmStreamer’ does almost everything we need for our custom assembler, but unfortunately this class is privately implemented totally within ‘MCAsmStreamer.cpp’. Ideally this class would be advertised in a header because it is so immensely useful, and then custom assemblers such as yours and mine could simply specialise it with adaptations of the elements that differ.

A few years ago I did separate this into a header and an implementation file, but it proved too difficult to track the ongoing changes in the LLVM head sources and I gave up (one of the downsides of being out-of-tree). Now I have a few “hacks” to the implementation of ‘MCAsmStreamer’ to adapt it to my needs, which goes against the clean separation of interface and implementation that I would prefer.

Writing my own pure custom streamer derived from ‘MCStreamer’ was massive overkill when the existing ‘MCAsmStreamer’ already did 98% of what I needed, and simply not worth the trouble versus using the hacks when tracking ongoing LLVM developments was also required. And because ‘MCAsmStreamer’ is in an anonymous namespace in another file, I can’t even delegate to it.

Incrementally over the years, we have been altering our native assembler to be more conformant to the assembly language syntax supported by ‘MCAsmStreamer’, but this always causes pain to our customers who have to alter their hand-written assembly files and inline assembly to match the breaking changes we make. But these changes have reduced our divergence from the ‘MCAsmStreamer’ expectation from about 20 differences to just 5.

Now we have hacked implementations of ‘EmitAssignment’, ‘EmitBytes’, ‘emitFill’ (inconsistent naming), ‘EmitValueToAlignment’ (for exactly your point #4 - an unexpected breaking change made between LLVM v3.8 and v3.9) and ‘EmitDwarfLocDirective’. In each case these are virtual functions, and a simple override rather than a hack would have been sufficient.

I would be very much in favour of promoting this very useful private class to a first-class citizen within the LLVM infrastructure. It would remove the need for me to make target-specific changes to the core implementation, and would very likely allow you and others to specialise just those elements that differ for your particular assembler dialect.
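
A minimal sketch of what I mean, using stand-in classes rather than the real ones: TextStreamer plays the role of a text streamer that is visible in a header, and TITextStreamer overrides only the virtual functions whose output differs. All names and signatures here are hypothetical and much simpler than the actual ‘MCStreamer’ interface.

```cpp
#include <sstream>
#include <string>

// Stand-in for a text streamer exposed in a header. Not LLVM's class.
class TextStreamer {
public:
  virtual ~TextStreamer() = default;

  virtual void emitGlobal(const std::string &Name) {
    OS << "\t.globl\t" << Name << "\n";
  }

  // Default dialect prints the alignment as a power of two.
  virtual void emitValueToAlignment(unsigned Bytes) {
    unsigned Log2 = 0;
    while ((1u << Log2) < Bytes)
      ++Log2;
    OS << "\t.p2align\t" << Log2 << "\n";
  }

  std::string str() const { return OS.str(); }

protected:
  std::ostringstream OS; // accessible to dialect subclasses
};

// Hypothetical dialect subclass: overrides only what differs.
class TITextStreamer : public TextStreamer {
public:
  void emitGlobal(const std::string &Name) override {
    OS << "\t.global\t" << Name << "\n";
  }
  void emitValueToAlignment(unsigned Bytes) override {
    OS << "\t.align\t" << Bytes << "\n"; // operand is absolute bytes
  }
};

// Shared code drives either streamer through the base interface.
inline void emitSample(TextStreamer &S) {
  S.emitGlobal("foo");
  S.emitValueToAlignment(8);
}
```

Everything not overridden is inherited unchanged, which is the property that makes this cheaper than writing a whole new streamer.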

The class ‘MCAsmInfo’ could also do with some enhancements. A very trivial one that was relevant to me a while ago was the assumption that indentation should use a TAB rather than spaces (this impacted the assembly printer). I had to hack/adapt this too, though an extension to ‘MCAsmInfo’ that lets the target provide the indentation separator would be far better, especially for VLIW targets such as ours.
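
The indentation case is small enough to sketch in a few lines. This is standalone C++ under the assumption that the separator is just one more target-supplied string; IndentInfo and formatLines are hypothetical names, not part of ‘MCAsmInfo’.

```cpp
#include <string>
#include <vector>

// Hypothetical target-supplied indentation setting.
struct IndentInfo {
  std::string Indent; // "\t" by convention, or spaces for another dialect
};

// Prefixes every instruction line with the target's indentation.
inline std::string formatLines(const IndentInfo &I,
                               const std::vector<std::string> &Lines) {
  std::string Out;
  for (const std::string &L : Lines)
    Out += I.Indent + L + "\n";
  return Out;
}
```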