LLVM GPU Working Group Meeting – Friday, July 15, 2022

Hello,

The next LLVM GPU WG meeting is scheduled for this Friday at 11am ET / 3pm UTC.

The proposed agenda with detailed meeting information (video call link, calendar links, etc.) is available in the "LLVM GPU Working Group Meeting Agenda / Notes" Google Doc.

Currently, the main discussion item is:

  • opaque types / typed pointers for DXIL/SPIR-V/WebAssembly (context)

Feel free to add comments / suggestions to the document to propose new agenda items.

Hope to see you there,
Jakub

The meeting starts in one hour.

Follow-up on the noundef discussion from yesterday’s meeting: I believe that instead of removing the noundef attribute from certain function arguments, we may want to freeze the values passed to them.

Reasoning: At a source language level, the problematic rule is:

Loading from an uninitialized variable is undefined behavior.

LLVM does not follow this rule exactly: a load from an uninitialized variable yields an undef/poison value instead, which can delay the undefined behavior until that value is used in a way that triggers it.

The question is, what would a relaxed source language rule look like? To avoid introducing the concepts of undef/poison to the source language, I’d expect a rule along the lines of:

Loading from an uninitialized variable is undefined behavior, except if the loaded value is immediately used as a “maybeundef” function argument. In that case, an arbitrary value is loaded (and each such load from the uninitialized variable may produce a different value).

Using freeze is a more faithful translation of this rule to LLVM IR. It also avoids the following problem:

int clamped_broadcast(maybeundef int x) {
  if (x < 0)
    x = 0;
  return __shfl(x, 0);
}

void kernel() {
  ...
  int x;
  if (lane == 0)
    x = ...;
  x = clamped_broadcast(x);
  ...
}

With the tentative source language rules, this snippet has defined behavior.

Lowering to LLVM IR by only dropping the noundef attribute still has UB: for lane != 0, x is undef/poison, so the x < 0 check in clamped_broadcast becomes a branch on undef/poison.

Lowering to LLVM IR by freezing the argument to clamped_broadcast has the same defined behavior as the source language rules.
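
To make the difference between the two lowerings concrete, here is a minimal host-side C++ sketch. It is only a model under stated assumptions, not LLVM's actual semantics: std::nullopt stands in for undef/poison, a branch on it is reported as an exception, freeze picks a caller-supplied arbitrary value, and the wavefront is assumed to have 4 lanes. The names Value, kLanes, branch_on, clamp and run_kernel are hypothetical and exist only for this illustration.

```cpp
#include <array>
#include <optional>
#include <stdexcept>

// Hypothetical model: std::nullopt stands in for an undef/poison value.
using Value = std::optional<int>;

constexpr int kLanes = 4;  // assumed tiny wavefront, for illustration only

// Models `freeze`: a defined value passes through unchanged; undef/poison
// becomes some arbitrary but fixed value (supplied by the caller here).
int freeze(Value v, int arbitrary) { return v.value_or(arbitrary); }

// Models a conditional branch: branching on undef/poison is UB, which this
// sketch reports as an exception.
bool branch_on(std::optional<bool> cond) {
  if (!cond) throw std::runtime_error("UB: branch on undef/poison");
  return *cond;
}

// Per-lane part of clamped_broadcast that runs before the shuffle.
Value clamp(Value x) {
  std::optional<bool> is_negative;
  if (x) is_negative = (*x < 0);      // comparing undef/poison stays undef/poison
  if (branch_on(is_negative)) x = 0;  // if (x < 0) x = 0;
  return x;
}

// Models kernel(): only lane 0 initializes x, then every lane calls
// clamped_broadcast. `freeze_argument` selects between the two lowerings.
std::array<int, kLanes> run_kernel(int lane0_value, bool freeze_argument) {
  std::array<Value, kLanes> clamped;
  for (int lane = 0; lane < kLanes; ++lane) {
    Value x;                          // int x;  (uninitialized)
    if (lane == 0) x = lane0_value;
    if (freeze_argument) x = freeze(x, /*arbitrary=*/-1000 - lane);
    clamped[lane] = clamp(x);         // throws if x is still undef/poison
  }
  std::array<int, kLanes> result;
  result.fill(*clamped[0]);           // __shfl(x, 0): all lanes read lane 0
  return result;
}
```

In this model, run_kernel(7, true) gives 7 in every lane and run_kernel(-3, true) gives 0 in every lane, regardless of which arbitrary values the inactive lanes were frozen to. With freeze_argument set to false, the first lane other than lane 0 hits the modeled branch on undef/poison.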

In any case, this shows that it is worth putting a bit more thought into what the source language rule should be. I do like my proposal above, but it’s literally the first thing I thought of, so who knows :)

cc @jdoerfert @arsenm