llvm-mc & Microsoft's MASM

Hi all,

I’m working on a project that uses clang-cl & lld-link to build for Windows, along with some tools out of the Windows SDK… but we’re currently pre-building some pieces of MASM assembly code using Microsoft’s ml.exe & ml64.exe. Unfortunately, it’s not all inline assembly, which clang can already handle, and Microsoft’s file-level directives are a bit unusual.

I plan to work on getting llvm-mc to compile (relatively simple) MASM files when targeting a Windows x86-based platform, with the goal of matching the output of ml.exe and ml64.exe. I’ve already drafted a proof-of-concept patch that lets llvm-mc handle MASM’s variants of conditional assembly macros (including the idiomatic use of “ifdef rax” to check if a build is targeting x86-64)… but macro functions & structs are of course looking a bit harder.

A few questions:

  1. Should all of the changes be locked behind an equivalent to clang’s -fms-compatibility flag, or would it be good if some subset of the functionality were shared? [e.g., should .ifdef rax be a valid way to check if the rax register exists?]

  2. Is there anyone around who would be willing to answer questions regarding the intended architecture of llvm-mc and the AsmParser classes? I’d like to make sure my proposals fit well into the design… and I’m starting to have trouble finding where these extensions should go. (Also, I’ve had some trouble getting used to the recursive-descent parser conventions being used. For example, how should one handle “try parsing this identifier as a register, and if that fails, check if it’s defined as a symbol” while not emitting Errors from the first attempt?)
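To make the second question concrete, here is a minimal toy sketch of the "try as a register, then fall back to a symbol" pattern. None of these types or functions are LLVM's; ToyParser, classify, isDefined and parseOperand are hypothetical names chosen for illustration. The idea is that the speculative lookups return a result without reporting anything, and only the outermost caller decides whether a failure deserves a diagnostic.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; these are NOT LLVM classes.
enum class IdentKind { Register, Symbol, Unknown };

struct ToyParser {
  std::set<std::string> registers{"eax", "rax", "rbx"};  // assumed register set
  std::set<std::string> symbols;                          // user-defined symbols
  std::vector<std::string> diagnostics;                   // errors actually reported

  // Speculative step: answers yes/no and never records a diagnostic.
  bool tryParseRegister(const std::string &name) const {
    return registers.count(name) != 0;
  }

  // Register lookup first, symbol lookup second, still silent on failure.
  IdentKind classify(const std::string &name) const {
    if (tryParseRegister(name))
      return IdentKind::Register;
    if (symbols.count(name) != 0)
      return IdentKind::Symbol;
    return IdentKind::Unknown;
  }

  // "ifdef"-style query: an unknown identifier is a normal outcome, not an error.
  bool isDefined(const std::string &name) const {
    return classify(name) != IdentKind::Unknown;
  }

  // Operand-style use: report once, and only after every alternative failed.
  bool parseOperand(const std::string &name) {
    if (classify(name) != IdentKind::Unknown)
      return true;
    diagnostics.push_back("unknown identifier '" + name + "'");
    return false;
  }
};
```

With this shape, something like “ifdef rax” maps onto isDefined("rax"), which can fail without any error being emitted from the register attempt. Whether the real AsmParser classes can be arranged this way is exactly what I’m unsure about.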


  • Eric

I don’t think it’s a good idea to alter what kind of syntax is accepted or not for files that aren’t explicitly opting into MASM behavior. The usual pattern we have done in the past for MS compatibility tools is to make a separate tool which is named the same as the Microsoft tool, but prefixed with lld-, llvm- or clang-. E.g. clang-cl, llvm-rc, llvm-mt, lld-link. So in this vein, I think it makes the most sense to have any future LLVM MASM compiler be called llvm-ml or llvm-ml64.

For your second question, I think many people on the list would be both willing and able to answer questions. Just post your RFC and someone will look at it (though it sometimes takes a few bumps & pings to get people to stop what they’re doing).

Agreed, I won’t plan to change syntax for anything that hasn’t opted in.

However… Am I mistaken in thinking clang-cl (for example) is just clang with a different name, which triggers some variant behaviors including parsing cl.exe-style command lines and taking certain flags as implicit?

I was hoping to build llvm-ml similarly, by building the features into llvm-mc behind target selection and/or flags, then providing a different driver for implicit ml.exe compatibility.

(Also, are there guidelines for writing an RFC that people will bother with, or should I just try not to make it TOO long?)

Yes, that is correct. lld-link works the same way (it is the same binary as ld.lld and the other lld flavors), and so does llvm-lib (which is a synonym of llvm-ar). All of them work by looking at the executable name in argv[0] and enabling the matching functionality based on it.
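As a rough sketch of that argv[0] dispatch (not the actual code from clang, lld or llvm-ar; the Flavor enum and both function names are made up for illustration), a shared binary could pick its personality like this:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical flavors for illustration; not taken from any LLVM tool.
enum class Flavor { Generic, MasmCompatible };

// Reduce argv[0] to a bare tool name: drop the directory part,
// lowercase it, and strip a trailing ".exe" if present.
std::string toolName(const std::string &argv0) {
  std::size_t slash = argv0.find_last_of("/\\");
  std::string name =
      (slash == std::string::npos) ? argv0 : argv0.substr(slash + 1);
  std::transform(name.begin(), name.end(), name.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  const std::string ext = ".exe";
  if (name.size() > ext.size() &&
      name.compare(name.size() - ext.size(), ext.size(), ext) == 0)
    name.erase(name.size() - ext.size());
  return name;
}

// Choose behavior from the name the binary was invoked under.
// The names "llvm-ml" and "llvm-ml64" are the ones suggested in this thread.
Flavor flavorFromArgv0(const std::string &argv0) {
  const std::string name = toolName(argv0);
  if (name == "llvm-ml" || name == "llvm-ml64")
    return Flavor::MasmCompatible;
  return Flavor::Generic;
}
```

A main() would call flavorFromArgv0(argv[0]) once at startup and then select the command-line parser and default flags for that flavor, so a symlink or copy named llvm-ml would get ml.exe-style behavior from the same executable.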

I don’t have a strong opinion on whether it’s reasonable to implement llvm-ml this way. It seems to make sense on the surface, but I’m not well versed in this area so someone else may have stronger opinions than I do.

Without question though, having MASM support somewhere in LLVM would be great, because this and (maybe) the MIDL compiler are the two big missing pieces I’m aware of for having a full cross-compilation toolkit for Windows.

The main issue is that llvm-mc has been written as a tool for developers to write tests with, rather than as a tool for end users. The user-facing assembler so far is clang. I suppose he'll be implementing an entirely new CLI anyway, though, which removes the main point of tension, so it's probably not fatal.