[RFC] Allow recursive macros as extension

I propose adding a new preprocessor directive. The only difference between #define and the new directive is that the new directive
does not prohibit recursive macro expansion.

Behavior:

  • The extension does not change any behavior of the #define directive.
  • Recursion depth is limited, but the limit is so large that it will be exceeded only by infinite recursion or by code that should never exist; for that reason I did not add an option to configure it.
  • It is debatable what #define2 A A should do: recurse infinitely, or replace the macro with just A (since any other use is pointless). The current implementation diagnoses it as infinite recursion.
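
For reference, here is a minimal sketch of the standard behavior that the new directive relaxes: with plain #define, a macro's own name is left alone when it reappears during its own expansion, so #define A A simply yields A. The names STR, XSTR, COUNTDOWN, self_reference and nested_call are illustrative only and do not come from the pull request.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringizing: XSTR expands its argument, STR then quotes the result. */
#define STR(x) #x
#define XSTR(x) STR(x)

/* Self-referential macros defined with plain #define. */
#define A A
#define COUNTDOWN(n) n COUNTDOWN(n)

/* Returns the text that A expands to. */
const char *self_reference(void) { return XSTR(A); }

/* Returns the text that COUNTDOWN(3) expands to. */
const char *nested_call(void) { return XSTR(COUNTDOWN(3)); }
```

With a conforming preprocessor, self_reference() should return "A" and nested_call() should return "3 COUNTDOWN(3)": the inner COUNTDOWN is not expanded a second time.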

It’s very small and fits well into the existing macro expansion implementation, so it should be easy to maintain.

At this point the implementation is ready, but the spelling of the directive still needs to be discussed.

The motivation and implementation are in the pull request.


I’ve been thinking about how we could make this work better as an extension.
#define2 is not a great name, and having multiple ways to define function-like macros seems a bit invasive and confusing.

And the only thing #define2 changes is that the macro’s own name is no longer exempt from expansion.

So I think there is another design option that is a bit easier to justify as an extension and is probably easier to implement.
We can introduce a magic preprocessor identifier, e.g., __CURRENT_MACRO__ (name subject to bikeshedding, of course), that, inside a macro, would denote that macro.

Adapting your example:


#define FOLD_RIGHT(op, head, ...) ( head __VA_OPT__(op __CURRENT_MACRO__(op, __VA_ARGS__)) )
static_assert(FOLD_RIGHT(+, 1, 2, 3) == 6);

That way we could model __CURRENT_MACRO__ on __VA_ARGS__ and __VA_OPT__, in that it would only be meaningful in function-like macros.
Making it testable with __has_extension would improve portability for now.
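
A minimal sketch of what such a guard could look like in user code. The feature name current_macro is made up for this sketch (no compiler defines it today), and the fixed three-operand fallback is only an assumption about how a project might degrade when the extension is missing.

```c
#include <assert.h>

/* Compilers without __has_extension get a fallback that reports "unsupported". */
#ifndef __has_extension
#define __has_extension(x) 0
#endif

/* "current_macro" is a hypothetical feature name used only in this sketch. */
#if __has_extension(current_macro)
#define FOLD_RIGHT(op, head, ...) ( head __VA_OPT__(op __CURRENT_MACRO__(op, __VA_ARGS__)) )
#else
/* Portable fallback: exactly three operands, no recursion needed. */
#define FOLD_RIGHT(op, a, b, c) ( (a) op ( (b) op (c) ) )
#endif

static_assert(FOLD_RIGHT(+, 1, 2, 3) == 6, "right fold of 1 + 2 + 3");
```

On current compilers the fallback branch is selected, so the recursive definition is skipped rather than rejected.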

It’s a design that is probably also easier to standardize in WG14/WG21 (because it has a smaller surface area).

I’d like other vendors’ opinions too!

But overall, I think this is a problem worth solving!


@reinterpretcast @AaronBallman @shafik

But if there is a second layer of recursion (two macros at once…), then __CURRENT_MACRO__ will not be expanded? That would break the recursion, although in many cases it might still work.
I like the idea. I would also like # __CURRENT_MACRO__(args) and ## __CURRENT_MACRO__(args) to always be expanded; the behavior of the C preprocessor in these places is annoying and forces you to create many helper macros.
I hope it won’t be impossible to implement.
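
For context on the helper-macro complaint, here is a minimal sketch of the usual two-level idiom in standard C. Operands of # and ## are not macro-expanded, so an extra forwarding layer is needed. All names (VERSION, PREFIX, STR, CAT, item_42 and the three functions) are illustrative.

```c
#include <assert.h>
#include <string.h>

#define VERSION 42
#define PREFIX item_

/* Direct forms: # and ## see the argument tokens exactly as written. */
#define STR_DIRECT(x) #x
#define CAT_DIRECT(a, b) a##b

/* Helper layer: arguments are macro-expanded here, then forwarded. */
#define STR(x) STR_DIRECT(x)
#define CAT(a, b) CAT_DIRECT(a, b)

int item_42 = 7; /* the variable that CAT(PREFIX, VERSION) names */

const char *direct_text(void) { return STR_DIRECT(VERSION); }   /* "VERSION" */
const char *expanded_text(void) { return STR(VERSION); }        /* "42" */
int pasted_value(void) { return CAT(PREFIX, VERSION); }         /* reads item_42 */
```

Every macro that wants to stringize or paste an expanded argument needs such a forwarding pair, which is the overhead the post refers to.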

Thank you for the RFC! This is an interesting idea.

I have some high-level concerns:

  • The name define2 conveys nothing to the user as to how the feature works; we’ll have to pick a more descriptive name at some point.
  • I think macro expansion behavior should be predictable for the user, and I think we lose that property with this specific design. The user using a macro now has to know whether that macro was defined with #define or with #define2 to understand the expansion properties of the macro, and that’s a pretty high cognitive burden for users. The suggestion from @cor3ntin helps in this regard, but another approach that might work (or might be a terrible idea) would be to wrap the macro name with a directive at the point you want to force recursive expansion. e.g., recursively_expand(MACRO) (where recursively_expand is a preprocessor operator).

Also, we have a list of criteria for adding extensions to Clang. The items I have concern with are:

  • Evidence of a significant user community. Macros have existed in C for a long time and you can achieve non-infinite recursion with macros using the existing preprocessor functionality (and there are libraries which help you with this, such as P99). I’d appreciate more details on why these libraries are insufficient and this requires a language feature; infinite recursion is not possible (that’s why we have to add a recursion depth limit) and I believe it is rarely necessary, so this seems like a very specific feature for a pretty uncommon problem.
  • A specification. The preprocessor has some very curious properties that have led to implementation divergence over the years; we should nail down the behavior of any new preprocessor extension so that it’s clear how it behaves. This also helps with the next part…
  • Representation within the appropriate governing organization. We expect the preprocessor to remain broadly the same between C and C++ and the standards committees typically ask for that as well. So this feature will need some sort of proposal to both WG14 and WG21. That’s a tall order, but I think it’s critical if this extension is to be adopted by users – preprocessor differences between compilers can be a source of pain for users, and standardization helps to avoid that pain. I’ve not seen a proposal like this in my time on the committees, and I can’t find evidence that someone else has already proposed this idea. Getting feedback from the committees can be tricky though – WG14 (the C committee) wants to see implementation experience, but as implementers, we do not want to implement an extension to the preprocessor and have the committee(s) “tweak” the design such that we break users, so we want some sign from the committee that the design approach is correct. I think starting a high-level discussion on both the committee reflectors could at least kick-start getting that design feedback. But even that is tricky because ISO has been far more strictly enforcing their rules about who can participate in standardization. I think we should circle back to figure out how best to interface with the standards committees once we think we’ve got a roughly final design for the feature.
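
To make the first criterion concrete, here is a minimal sketch of how bounded recursion can be emulated with existing preprocessor functionality by forcing extra rescans. It relies on __VA_OPT__ (C23/C++20, and accepted by recent GCC and Clang in older modes as an extension). The names PARENS, EXPAND, FOLD_HELPER and FOLD_AGAIN are illustrative and are not taken from P99 or from the pull request.

```c
#include <assert.h>

/* Each EXPAND layer forces additional rescans of its argument;
 * the nesting below bounds the recursion at roughly sixteen levels. */
#define EXPAND(...)  EXPAND2(EXPAND2(EXPAND2(EXPAND2(__VA_ARGS__))))
#define EXPAND2(...) EXPAND1(EXPAND1(EXPAND1(EXPAND1(__VA_ARGS__))))
#define EXPAND1(...) __VA_ARGS__

/* PARENS keeps FOLD_AGAIN from being invoked until the next rescan. */
#define PARENS ()

/* FOLD_HELPER never names itself directly; it goes through FOLD_AGAIN. */
#define FOLD_RIGHT(op, ...) EXPAND(FOLD_HELPER(op, __VA_ARGS__))
#define FOLD_HELPER(op, head, ...) \
  ( head __VA_OPT__(op FOLD_AGAIN PARENS (op, __VA_ARGS__)) )
#define FOLD_AGAIN() FOLD_HELPER

static_assert(FOLD_RIGHT(+, 1, 2, 3) == 6, "right fold of 1 + 2 + 3");
static_assert(FOLD_RIGHT(*, 2, 3, 4) == 24, "right fold of 2 * 3 * 4");
```

The depth limit here comes from how many EXPAND layers are written out, and a list longer than that limit is left partially unexpanded instead of producing a clear diagnostic.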

I’d like to push back on that.
If we assume recursion in macros is useful (I think the original PR had some examples, and there is the mere existence of boost PP, P99, and similar facilities; I have needed it a few times myself even though I try to limit my use of the preprocessor), then I think “to call a macro recursively you need a library” is a tough sell.
But assuming you find a library with a suitable license and it gets blessed to be included in your company code base, or you reimplement it yourself, you end up with 2 issues:

  • The best interface you can get is APPLY(F, args), which kind of works, but it would be more natural for F(args) to work.
  • The libraries that exist work by generating (either manually or through a script in Python, Perl, CMake, etc.) a list of N macros, N being how many nested calls they want to support.
    That produces unnecessarily large files that need to be preprocessed, making the
    compiler do more work and leading to bad diagnostics if there is an error in your pile of macros.
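
As an illustration of the second point, here is a hand-written sketch of the “list of N macros” approach for N = 4; real libraries generate many more levels. FOLD_RIGHT_1 through FOLD_RIGHT_4 and PICK_5TH are illustrative names, not the API of boost PP or P99.

```c
#include <assert.h>

/* One macro per supported operand count. */
#define FOLD_RIGHT_1(op, a)          (a)
#define FOLD_RIGHT_2(op, a, b)       (a op FOLD_RIGHT_1(op, b))
#define FOLD_RIGHT_3(op, a, b, c)    (a op FOLD_RIGHT_2(op, b, c))
#define FOLD_RIGHT_4(op, a, b, c, d) (a op FOLD_RIGHT_3(op, b, c, d))

/* Selects the fifth argument; the operands shift the right name into place. */
#define PICK_5TH(_1, _2, _3, _4, N, ...) N

/* Dispatches on the number of operands (1 to 4 supported). */
#define FOLD_RIGHT(op, ...)                                          \
  PICK_5TH(__VA_ARGS__, FOLD_RIGHT_4, FOLD_RIGHT_3, FOLD_RIGHT_2,    \
           FOLD_RIGHT_1, unused)(op, __VA_ARGS__)

static_assert(FOLD_RIGHT(+, 1, 2, 3) == 6, "three operands");
static_assert(FOLD_RIGHT(*, 2, 3, 4, 5) == 120, "four operands");
```

Supporting a fifth operand means adding another FOLD_RIGHT_n line and widening PICK_5TH, which is why these lists are usually produced by a script.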

So the feature does sound well motivated to me, but I agree the design needs more exploration, and starting a conversation with the different vendors and committees seems like the best next step.


Okay, I can see that logic, thank you! There’s a natural tension between “you can do this already and in fact people have provided libraries to do it” and “why not make this part of the language?” and the line is a bit fuzzy. I guess I see folks wanting to do less macro programming in C++ (e.g., C++'s treatment of macros in modules), so I tend to think the bar is higher here because this is in the preprocessor. However, existence of those libraries is a sign of a need in practice.

I don’t think the user will need to know more about #define2/#define than they do now; they can simply think ‘it just works somehow, maybe they generated 100,000 lines with a Python script’.

I will list the problems with the current state of recursion in the preprocessor:

  • It complicates the implementation, which forces the user into a worse solution that is less readable or less usable.
  • If you use a library like boost PP or P99, you first need to learn how to use it, and this can be very difficult, because such an implementation complicates the interface too.
  • Generated recursion is usually very limited in depth: 10 to 100 levels or so.
  • It greatly increases the volume of the source code and the likelihood of errors.
    Imagine you forgot to remove a ‘,’ in one line, or the generator script inserts an extra comma in one edge case: when and how will you find this error?
  • If you use a script to generate code, then that script is now part of your project, because changing the implementation in the future means changing the script.

I agree with @AaronBallman that having two kinds of function-like macros could be a source of surprise and confusion. That being said, there are languages that support multiple kinds of macros with success; e.g., GNU make and its simply expanded and recursively expanded variables.

I’m more inclined towards solutions like those suggested by @cor3ntin and @AaronBallman, assuming they suffice for the desired use cases.

It might be useful to look at the features offered by the various preprocessor libraries, like P99 and boost PP, and survey how the proposed recursive macro would simplify or replace these features; that might help inform the design.


I think a better approach is to redefine macros with the _Pragma operator. My code isn’t complete, but feel free to peruse it; it’s in the _Pragma(redefine_macro) branch, and it’s written in the spirit of _Pragma(push_macro/pop_macro).

As for the recursive expansion aspect, I’m working on a directive, #repeat, that allows expansion a specified number of times; it’s in the #Repeat branch of my LLVM fork.

My primary use case is registering test cases and test suites in C, and also getting around the issues with ‘COUNTER’.

Opinions? @tahonermann @cor3ntin @AaronBallman

GitHub fork here: MarcusJohnson91/llvm-project

The design has changed from a new preprocessor directive to a special token, __THIS_MACRO__, which denotes a recursive call of the enclosing macro.
It is enabled only in function-like macros. It’s ready, and all changes are in the linked pull request.