Driver: Adding a different file extension for assembler source

We have a custom codegen and custom assembler. The custom codegen takes as input an s-expression variant of some of LLVM IR and produces our current assembly language. I would like to avoid bundling everything in one action/job to make moving away from this model easier.

Is there a relatively simple way to have the driver generate a different file extension for its “assembly” files? Then I can have the driver check the extension and make a (save-able for debugging) intermediate ‘.s’ file while fitting into the assembler tool paradigm and saving temps. We also do have ‘.s’ and ‘.S’ files to manage.

My goal is making incremental motion to LLVM codegen paths more sensible. We can twiddle our codegen to take MIR-style inputs piecemeal, I hope. Our codegen is BURG-based and has a considerable number of rules to translate. Finding steps on the way would be terribly useful.

I’ve been looking at the DirectX/DXIL backend (thank you @beanz, and the thread How could I tackle a simple "pass-through" code generator? - #4 by beanz!). But I’m lost on simple things like file extension translation. I’m OK with adding our own file extension to Types.def, etc. as an intermediate step. I would prefer that ‘.s’ be our assembly textual form rather than our intermediate textual form. Similarly, I can trick out our assembler to translate a magic extension into a ‘.s’ file (declared as a save-able temporary in the driver somehow) while producing the ‘.o’ output.

But does anyone know how I can massage cc1 into producing a different assembly file extension?

Likely the answer is somewhere in clang/lib/Driver/ToolChains/, but I haven’t found it. I’m hoping the response is far, far shorter than my question.


For mapping file extensions to input types, you probably want ToolChain::LookupTypeForExtension, which you can override in your toolchain implementation (Darwin.cpp has an example of this). Although if you’re defining a new input type, rather than leveraging the existing assembly type, you might need to fiddle with something deeper down.
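To make the override idea concrete, here is a minimal standalone sketch of the pattern. It does not use the real clang headers: `TypeID`, `BaseToolChain`, `CustomToolChain`, the `CustomIR` type and the `sexp` extension are all hypothetical stand-ins, and the real hook is ToolChain::LookupTypeForExtension, whose exact signature you should check in your tree.

```cpp
#include <cassert>
#include <string>

// Simplified stand-ins for the driver's input types. CustomIR is a
// hypothetical type for the s-expression intermediate form.
enum class TypeID { Invalid, Assembly, AssemblyWithCpp, CustomIR };

// Stand-in for the generic toolchain: maps an extension (without the dot)
// to an input type.
struct BaseToolChain {
  virtual ~BaseToolChain() = default;
  virtual TypeID lookupTypeForExtension(const std::string &Ext) const {
    if (Ext == "s") return TypeID::Assembly;        // plain assembly
    if (Ext == "S") return TypeID::AssemblyWithCpp; // needs preprocessing
    return TypeID::Invalid;
  }
};

// Stand-in for a custom toolchain: claim one extra extension and defer to
// the base class for everything else.
struct CustomToolChain : BaseToolChain {
  TypeID lookupTypeForExtension(const std::string &Ext) const override {
    if (Ext == "sexp") return TypeID::CustomIR; // hypothetical extension
    return BaseToolChain::lookupTypeForExtension(Ext);
  }
};

// Small self-check showing the intended behaviour.
inline void checkLookup() {
  CustomToolChain TC;
  assert(TC.lookupTypeForExtension("sexp") == TypeID::CustomIR);
  assert(TC.lookupTypeForExtension("s") == TypeID::Assembly);
  assert(TC.lookupTypeForExtension("S") == TypeID::AssemblyWithCpp);
}
```

Deferring to the base class keeps ‘.s’ and ‘.S’ handling unchanged, so only the new extension takes the custom path.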

As for setting the assembler extension on output, I suggest grepping for OPT_S and seeing where that leads you. I see mentions in Driver.cpp and ToolChains/Clang.cpp.

There’s a table in clang/include/clang/Driver/Types.def, used in clang/lib/Driver/Types.cpp, and the suffix stored in it is what getTypeTempSuffix returns. The output asm extension seems to come from the getTypeTempSuffix call in GetNamedOutputPath in Driver.cpp. But this is just from a quick experiment.
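For anyone following along, the table-plus-suffix mechanism can be modelled in a few lines. This is only a standalone sketch of the idea, not the clang code: the `MY_TYPES` table, its entries, `tempSuffix`, `namedOutputPath` and the `sexp` suffix are all made up for illustration, and the real Types.def entries carry more fields than shown here.

```cpp
#include <cassert>
#include <string>

// Hypothetical X-macro table in the spirit of Types.def:
// (enum name, user-visible name, temp-file suffix).
#define MY_TYPES(X)                                                           \
  X(Assembly, "assembler", "s")                                               \
  X(CustomIR, "custom-ir", "sexp")                                            \
  X(Object, "object", "o")

enum class TypeID {
#define X(Id, Name, Suffix) Id,
  MY_TYPES(X)
#undef X
};

// Analogue of a temp-suffix lookup: return the suffix column for a type.
inline const char *tempSuffix(TypeID Id) {
  switch (Id) {
#define X(IdName, Name, Suffix)                                               \
  case TypeID::IdName:                                                        \
    return Suffix;
    MY_TYPES(X)
#undef X
  }
  return "";
}

// Analogue of naming an output: strip the input's extension (if any) and
// append the suffix registered for the output type.
inline std::string namedOutputPath(const std::string &Input, TypeID Out) {
  const std::string::size_type Slash = Input.find_last_of('/');
  const std::string::size_type Dot = Input.find_last_of('.');
  const bool HasExt = Dot != std::string::npos &&
                      (Slash == std::string::npos || Dot > Slash);
  const std::string Base = HasExt ? Input.substr(0, Dot) : Input;
  return Base + "." + tempSuffix(Out);
}

// Small self-check showing the intended behaviour.
inline void checkNaming() {
  assert(namedOutputPath("foo.c", TypeID::CustomIR) == "foo.sexp");
  assert(namedOutputPath("foo.c", TypeID::Assembly) == "foo.s");
}
```

In this model, giving the compile step a different output type is enough to change the extension of the intermediate file, which is roughly the effect the quick experiment above suggests for the real driver.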

Thank you! Gives me more starting places.

I wish the compilation chain framework were a bit more flexible, but I don’t have time to tackle that right now.