[RFC] Per-operator granularity for BreakBinaryOperations

Motivation

BreakBinaryOperations (introduced in clang-format 20) provides OnePerLine and RespectPrecedence modes that control how binary operations are broken across lines. However, both modes apply to every binary operator at once, which is too coarse for many use cases.

In practice, different operator families have very different readability needs. Developers may want one-per-line formatting for &&/|| condition chains, but the same treatment applied to + concatenation or arithmetic may disrupt the logical grouping of their code. Today, the only workaround is wrapping blocks in // clang-format off / // clang-format on.

Example 1: Equality comparison chains

A common pattern in operator== implementations:

// With BreakBinaryOperations: RespectPrecedence — good:
return id == other.id &&
       version == other.version &&
       packetNumber == other.packetNumber &&
       operatingMode == other.operatingMode;

But enabling RespectPrecedence to get this layout for && also forces one-per-line breaking on other operators such as + and | throughout the same codebase, which may not be desired for arithmetic and string concatenation.

Example 2: Stream extraction/insertion chains

C++ stream operator chains (>> for input, << for output) are a common idiom for serialization. clang-format currently packs them to the column limit:

// Current clang-format output — hard to read and review:
in >> packet.id >> packet.version >> packet.packetNumber >> packet.rangeScale >>
    packet.rangeDiscreteSize >> packet.numberOfRangeDiscretes >>
    packet.wsrStatusWord >> packet.currentAzimuth >> packet.currentElevation;

One-per-line is often preferred for readability and code review:

// Desired — each field on its own line:
in >> packet.id
   >> packet.version
   >> packet.packetNumber
   >> packet.rangeScale
   >> packet.rangeDiscreteSize
   >> packet.numberOfRangeDiscretes
   >> packet.wsrStatusWord
   >> packet.currentAzimuth
   >> packet.currentElevation;
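
Until such an option exists, the layout above only survives formatting with the clang-format off/on workaround mentioned in the motivation. A minimal sketch of that workaround follows; the Packet type (reduced to four fields) and the parsePacket helper are hypothetical and exist only to make the example self-contained:

```cpp
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical packet type, reduced to four fields for illustration.
struct Packet {
  std::uint32_t id = 0;
  std::uint32_t version = 0;
  std::uint32_t packetNumber = 0;
  std::uint32_t rangeScale = 0;
};

std::istream &operator>>(std::istream &in, Packet &packet) {
  // Today the one-per-line layout has to be shielded from the formatter.
  // clang-format off
  in >> packet.id
     >> packet.version
     >> packet.packetNumber
     >> packet.rangeScale;
  // clang-format on
  return in;
}

// Hypothetical helper: parses whitespace-separated fields from text.
Packet parsePacket(const std::string &text) {
  std::istringstream in(text);
  Packet packet;
  in >> packet;
  return packet;
}
```

With a per-operator rule for >> the two marker comments would become unnecessary, assuming the rule produces the layout shown above.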

Example 3: Bitfield assembly with | and <<

Composing hardware register values or bitmasks:

// Current: packed to column limit, hard to verify each field
std::uint32_t a = byte_buffer[0] | byte_buffer[1] << 8 | byte_buffer[2] << 16 |
                  byte_buffer[3] << 24;

// Desired: one component per line for easy verification
std::uint32_t a = byte_buffer[0] |
                  byte_buffer[1] << 8 |
                  byte_buffer[2] << 16 |
                  byte_buffer[3] << 24;
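
The one-per-line form is easy to verify because << binds tighter than |, so every line is a complete term of the result. The sketch below wraps the desired layout in a function; the function name, the std::array parameter, and the widening casts are assumptions added for illustration (the casts keep the shift by 24 well defined if the buffer holds 8-bit values):

```cpp
#include <array>
#include <cstdint>

// Assemble a little-endian 32-bit value from four bytes.
// Assumption: the buffer holds std::uint8_t. Each byte is widened to
// std::uint32_t before shifting so the shift by 24 cannot overflow int.
std::uint32_t
assembleLittleEndian(const std::array<std::uint8_t, 4> &byte_buffer) {
  // Each line is one term: (byte << shift). The | operators combine them.
  return static_cast<std::uint32_t>(byte_buffer[0]) |
         static_cast<std::uint32_t>(byte_buffer[1]) << 8 |
         static_cast<std::uint32_t>(byte_buffer[2]) << 16 |
         static_cast<std::uint32_t>(byte_buffer[3]) << 24;
}
```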

The core problem

All these cases share the same issue: different operators need different breaking policies, but BreakBinaryOperations only offers a single global setting. Enabling OnePerLine for the sake of one operator family forces it on all binary operators.

A related need is minimum chain length gating: short boolean expressions like a && b should stay on one line, but longer chains of 3+ conditions should break one-per-line. Today, OnePerLine applies uniformly regardless of chain length.

Proposal

Extend BreakBinaryOperations to accept a structured YAML configuration alongside the existing scalar form. The new form adds two capabilities:

  1. Per-operator rules (PerOperator): specify break style for specific operator groups
  2. Minimum chain length (MinChainLength): only trigger breaking when a chain has N or more operands (a && b && c is a chain of 3)

YAML syntax

The simple scalar form remains fully backward-compatible:

# Existing syntax — unchanged behavior
BreakBinaryOperations: OnePerLine

The new structured form:

BreakBinaryOperations:
  Default: Never
  PerOperator:
    - Operators: ['&&', '||']
      Style: OnePerLine
      MinChainLength: 3
    - Operators: ['|']
      Style: OnePerLine

Fields:

Field             Description
Default           Break style for operators not matched by any PerOperator rule. Accepts the same values as the scalar form: Never, OnePerLine, RespectPrecedence.
PerOperator       List of rules, each with the following fields:
  Operators       List of operator token strings, e.g. ['&&', '||'].
  Style           Break style for these operators (defaults to OnePerLine).
  MinChainLength  Minimum number of operands in a chain before the rule triggers (a && b is a chain of 2). 0 (default) means always break when the line is too long.
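
To make the lookup semantics concrete, here is a small standalone model of the fields above. It is only a sketch: the type and function names are hypothetical and are not taken from the patch. It also assumes that the first rule listing an operator wins, that chain length is counted in operands (as in the gating example further down), and that a chain below MinChainLength falls back to Default.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical mirror of the proposed option; names follow the YAML fields,
// not necessarily the types used in the actual patch.
enum class BreakStyle { Never, OnePerLine, RespectPrecedence };

struct OperatorRule {
  std::vector<std::string> Operators;        // e.g. {"&&", "||"}
  BreakStyle Style = BreakStyle::OnePerLine; // default stated in the table
  unsigned MinChainLength = 0;               // 0 disables the length gate
};

struct BreakOptions {
  BreakStyle Default = BreakStyle::Never;
  std::vector<OperatorRule> PerOperator;
};

// Returns the style for a chain of `Op` with `ChainLength` operands.
// Assumptions: first matching rule wins; a chain shorter than the rule's
// MinChainLength falls back to Default.
BreakStyle resolveStyle(const BreakOptions &Options, const std::string &Op,
                        unsigned ChainLength) {
  for (const OperatorRule &Rule : Options.PerOperator) {
    if (std::find(Rule.Operators.begin(), Rule.Operators.end(), Op) ==
        Rule.Operators.end())
      continue; // this rule does not mention the operator
    if (ChainLength < Rule.MinChainLength)
      return Options.Default; // assumed fallback for short chains
    return Rule.Style;
  }
  return Options.Default; // no rule matched
}

// The structured configuration shown in the YAML syntax section.
BreakOptions exampleOptions() {
  BreakOptions Options;
  Options.Default = BreakStyle::Never;

  OperatorRule Logical;
  Logical.Operators = {"&&", "||"};
  Logical.MinChainLength = 3;
  Options.PerOperator.push_back(Logical);

  OperatorRule BitOr;
  BitOr.Operators = {"|"};
  Options.PerOperator.push_back(BitOr);
  return Options;
}
```

Under these assumptions, a three-operand && chain resolves to OnePerLine, a two-operand && chain and any + chain resolve to Never, and any | chain resolves to OnePerLine.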

Behavior examples

Configuration:

BreakBinaryOperations:
  Default: Never
  PerOperator:
    - Operators: ['&&', '||']
      Style: OnePerLine

Logical chains break one-per-line when they exceed the column limit, while other operators (like +) use the default Never and wrap naturally at the column limit:

// && chains — one-per-line
return id == other.id &&
       version == other.version &&
       packetNumber == other.packetNumber;

int sum = a + b + c + d;  // + uses default Never — no forced break

Multiple operator groups — combining &&/|| with |:

BreakBinaryOperations:
  Default: Never
  PerOperator:
    - Operators: ['&&', '||']
      Style: OnePerLine
    - Operators: ['|']
      Style: OnePerLine

// Both && and | break one-per-line, but + stays as default
int flags = FLAG_READ |
            FLAG_WRITE |
            FLAG_EXECUTE;

bool ok = isValid &&
          isReady &&
          isEnabled;

int sum = a + b + c + d;  // + uses default Never — no forced break

MinChainLength gating:

BreakBinaryOperations:
  Default: Never
  PerOperator:
    - Operators: ['&&', '||']
      Style: OnePerLine
      MinChainLength: 3

// Chain of 2 operands: below MinChainLength, so the rule does not apply and it stays on one line
bool ok = conditionA && conditionB;

// Chain of 3 or more operands: meets MinChainLength, triggers one-per-line
bool ok = conditionA &&
          conditionB &&
          conditionC;

Implementation

A working implementation is available as a PR: "[clang-format] Add per-operator granularity for BreakBinaryOperations" by ssubbotin (Pull Request #181051, llvm/llvm-project on GitHub).

The approach follows the AlignConsecutiveStyle pattern for dual-mode YAML parsing (scalar enum + structured mapping), so the simple scalar form remains fully backward-compatible. For example, the scalar BreakBinaryOperations: OnePerLine produces {Default: OnePerLine, PerOperator: []} and behaves identically to the current enum value. All existing unit tests pass.
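
The scalar-to-structured normalization can be sketched outside of LLVM as follows. This is not the YAML trait code from the patch: std::variant stands in for "the parser saw either a scalar or a mapping", and all type and function names are hypothetical. It requires C++17.

```cpp
#include <string>
#include <variant>
#include <vector>

enum class BreakStyle { Never, OnePerLine, RespectPrecedence };

// Hypothetical rule and option types mirroring the YAML fields.
struct OperatorRule {
  std::vector<std::string> Operators;
  BreakStyle Style = BreakStyle::OnePerLine;
  unsigned MinChainLength = 0;
};

struct StructuredOptions {
  BreakStyle Default = BreakStyle::Never;
  std::vector<OperatorRule> PerOperator;
};

// Stand-in for the parsed YAML value: a bare enum or a full mapping.
using ParsedValue = std::variant<BreakStyle, StructuredOptions>;

// Both input forms end up as the same structured representation.
StructuredOptions normalize(const ParsedValue &Value) {
  if (const BreakStyle *Scalar = std::get_if<BreakStyle>(&Value)) {
    // Scalar form: {Default: <value>, PerOperator: []}
    StructuredOptions Options;
    Options.Default = *Scalar;
    return Options;
  }
  // Structured form is used as written.
  return std::get<StructuredOptions>(Value);
}
```

Because the scalar form maps to an empty rule list, every operator falls through to Default, which is why existing configurations keep their current behavior.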

Open questions

  1. Should MinChainLength default to 0 or 2? Currently 0 (always break when line is too long). Defaulting to 2 would skip trivial a && b chains, which some users might prefer.
  2. RespectPrecedence in per-operator rules: The current implementation supports it, but the interaction between per-operator RespectPrecedence and the precedence grouping system could be surprising. Should we restrict per-operator rules to Never/OnePerLine only?
  3. Naming: PerOperator vs Rules vs Overrides — open to suggestions.