Error "expected relocatable expression" with MCTargetExpr

What a coincidence! I happen to be looking at MCExpr as well and jotted down some notes ([VE] Move MCSymbolRefExpr::VK_VE_ to VEMCExpr:: · Issue #130003 · llvm/llvm-project · GitHub).

I think MCSymbolRefExpr::VariantKind is not an appropriate way to encode relocations.

  • other expressions, like MCConstantExpr (e.g., PPC 4@l) and MCBinaryExpr (e.g., PPC (a+1)@l), also need it
  • semantics become unclear (e.g., folding expressions with @).
  • The generic interface MCSymbolRefExpr offers no target-specific extension point. Any target-specific logic will pollute the generic interface.

Consider what happens with addition or subtraction:

MCBinaryExpr
  LHS(MCSymbolRefExpr): VariantKind, SymA
  RHS(MCSymbolRefExpr): SymB

Here, the specifier attaches only to the LHS, leaving the full result uncovered. This awkward design demands workarounds.

  • Parsing a+4@got exposes clumsiness. After AsmParser::parseExpression processes a+4, it detects @got and retrofits it onto MCSymbolRefExpr(a), which feels hacked together.
  • PowerPC’s @l @ha optimization needs PPCAsmParser::extractModifierFromExpr and PPCAsmParser::applyModifierToExpr to convert a MCSymbolRefExpr to a PPCMCExpr.
  • Many targets (e.g., X86) use MCValue::getAccessVariant to grab LHS’s specifier, though MCValue::RefKind would be cleaner.

Worse, leaky abstractions that MCSymbolRefExpr is accessed widely in backend code introduces another problem:
while MCBinaryExpr with a constant RHS mimics MCSymbolRefExpr semantically, code often handles only the latter.


MCTargetExpr subclasses, as used by AArch64 and RISC-V, offer a cleaner approach to encode relocations.
(MIPS migrated from MCSymbolRefExpr::VariantKind to MCTargetExpr in 2016 (⚙ D19716 [mips] Use MipsMCExpr instead of MCSymbolRefExpr for all relocations.).)
Ideally, limit MCTargetExpr to top-level use to encode one single relocation and avoid its inclusion as a subexpression.

MCSymbolRefExpr::VariantKind as the legacy way to encode relocations should be completely removed (probably in a distant future as many cleanups are required).

  1. Is there another way to handle such a relocation?
  2. If not, do you have any idea how to fix that condition?

The generic llvm/lib/MC/MCParser/AsmParser.cpp might be where MCSymbolRefExpr::VariantKind gets created. One fix could be tweaking the target-specific XXXAsmParser::parseExpression to replace it with a top-level MCTargetExpr. As for handling cases like x@gotpcrel+1 vs. x+1@gotpcrel, we could simplify by treating them the same—ignoring where @ sits in the expression. This feels similar to how the GNU assembler works, storing relocations in a relocs[] list and keeping the expression structure itself relocation-agnostic.

The generic llvm/lib/MC/MCParser/AsmParser.cpp creates expressions with MCSymbolRefExpr::VariantKind.
We should consider it apply to the whole expression, not just the subexpression.
This feels similar to how the GNU assembler works, storing relocations in a relocs[] list associated with the instruction and keeping the expression structure itself relocation-agnostic.

The target-specific XXXAsmParser::parseExpression can be customized to remove MCSymbolRefExpr::VariantKind and add a top-level MCTargetExpr.
As for handling cases like x@gotpcrel+1 vs. x+1@gotpcrel (x86), we could simplify by treating them the same-ignoring where @ sits in the expression.

You might look at PPCAsmParser::parseExpression for inspiration.
I am working on a refactoring to eliminate MCSymbolRefExpr::VariantKind::VK_PPC_* (GitHub - MaskRay/llvm-project at ppc-mcexpr ; will still use the VariantKind member in MCSymbolRefExpr; should switch to MCTargetExpr in the future, but I might not have energy to do that)

MCBinaryExpr(SystemZMCExpr(MCSymbolRefExpr(sym)), MCConstantExpr(8))

We should limit MCTargetExpr to top-level use to encode one single relocation and avoid its inclusion as a subexpression.

1 Like