Question about storing data in an AST node

Hi all,
I have a question about storing data in an AST node.

Specifically, I’m wondering if it’s acceptable to directly store collections of custom structs parsed from a string within an AST node.

For context, I noticed that GccAsmStmt stores the complete AsmString instead of the AsmStringPiece. This leads to redundant parsing (AnalyzeAsmString) later, in the Clang code generation phase, to obtain the AsmStringPiece again.
That gives me the impression that storing complex data in an AST node is discouraged.

I’m currently working on supporting root signatures in HLSL (see “Specifying Root Signatures in HLSL” on Microsoft Learn), which involves generating collections of custom structs from a string.
My plan is to create a new AST node for the root signature, but I’m unsure what to store in it.

One solution is to store the custom structs generated from parsing the input string directly in the AST node.
The string then only needs to be parsed once, since the result is saved in the AST node for later use.

Another solution is to do the same thing as GccAsmStmt: save only the whole string and parse it twice, once for diagnostics and once for Clang CodeGen.
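
To make the two options concrete, here is a rough sketch in plain C++. None of these names are Clang or HLSL APIs: RootElement, parseRootSignature, RootSigNodeParsed, and RootSigNodeString are hypothetical stand-ins, and the toy parser only handles a flat list like "CBV(b0), SRV(t1)", not the real root signature grammar.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one parsed root signature element; not a Clang type.
struct RootElement {
  std::string Kind; // e.g. "CBV", "SRV"
  std::string Arg;  // raw argument text, e.g. "b0"
};

// Toy parser: splits "CBV(b0), SRV(t1)" into elements. It does not handle
// nested parentheses; it only illustrates the data flow.
std::vector<RootElement> parseRootSignature(const std::string &Text) {
  std::vector<RootElement> Out;
  std::istringstream In(Text);
  std::string Item;
  while (std::getline(In, Item, ',')) {
    auto Open = Item.find('(');
    auto Close = Item.rfind(')');
    if (Open == std::string::npos || Close == std::string::npos || Close < Open)
      continue; // a real parser would emit a diagnostic here
    auto Begin = Item.find_first_not_of(' ');
    assert(Begin <= Open && "first non-space character cannot follow '('");
    Out.push_back({Item.substr(Begin, Open - Begin),
                   Item.substr(Open + 1, Close - Open - 1)});
  }
  return Out;
}

// Option 1: parse once and keep the result in the node. The source string is
// kept as well so the node could still be printed back as written.
struct RootSigNodeParsed {
  std::string Source;
  std::vector<RootElement> Elements; // filled once at construction
  explicit RootSigNodeParsed(std::string S)
      : Source(std::move(S)), Elements(parseRootSignature(Source)) {}
};

// Option 2: keep only the string; every consumer (diagnostics, codegen)
// re-parses it on demand.
struct RootSigNodeString {
  std::string Source;
  std::vector<RootElement> analyze() const { return parseRootSignature(Source); }
};
```

With option 1 the cost is extra memory in the node; with option 2 it is one extra call to the parser per consumer.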

Which solution is closer to the correct Clang approach?

Thanks
Xiang

The latter is definitely our common approach. The reason for keeping those strings around is that we need to be able to ‘recreate’ the AST node as close to the original code as possible for things like AST printing.

That said, some level of ‘caching’ of the parse is ALSO something we do, see how constant expression evaluation can ‘store’ its value.

Whether caching is valuable, though, depends heavily on how expensive the parsing is. GCC Asm Statements are really cheap to parse, so there is no reason to store their result (also, given the way they were developed, we never considered doing so, as the semantic analysis was added significantly after the fact).
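
As a rough illustration of the ‘keep the string, cache the parse’ idea, here is a minimal C++17 sketch. It is not how Clang implements constant expression caching or GCC asm statements; CachedStringNode, getPieces, and the ‘;’-separated toy format are all made up for the example.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node, not a Clang class: the source string stays the ground
// truth, and the parsed form is computed on first use and then reused.
class CachedStringNode {
  std::string Source;
  mutable std::optional<std::vector<std::string>> Pieces; // lazily filled
  mutable unsigned ParseCount = 0; // only here to show how often we parse

  // Toy "parser": splits the string on ';'.
  static std::vector<std::string> split(const std::string &S) {
    std::vector<std::string> Out;
    std::string Cur;
    for (char C : S) {
      if (C == ';') {
        Out.push_back(Cur);
        Cur.clear();
      } else {
        Cur.push_back(C);
      }
    }
    Out.push_back(Cur);
    return Out;
  }

public:
  explicit CachedStringNode(std::string S) : Source(std::move(S)) {}

  // The original text is always available, e.g. for printing the node back.
  const std::string &getSource() const { return Source; }

  // Parses on the first call only; later calls return the cached result.
  const std::vector<std::string> &getPieces() const {
    if (!Pieces) {
      Pieces = split(Source);
      ++ParseCount;
    }
    assert(Pieces.has_value() && "cache must be populated here");
    return *Pieces;
  }

  unsigned getParseCount() const { return ParseCount; }
};
```

If the parse is cheap, the cache is just extra state to maintain; it only pays off when getPieces would otherwise do expensive work more than once.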

Thanks for the answer.
That is really helpful.

I’ll weigh the cost to determine whether to cache the result or not.