Why does LLVM IR change a variable's name slightly?

Source code:

    switch (m_check_state)  // process_read
    {
      case CHECK_STATE_REQUESTLINE:
      {
        break;
      }
      case CHECK_STATE_HEADER:
      {   
        break;
      }
      default:
        return INTERNAL_ERROR;
    }
  

IR:

  %m_check_state8 = getelementptr inbounds %class.http_conn, %class.http_conn* %this1, i32 0, i32 12, !dbg !7000
  %6 = load i32, i32* %m_check_state8, align 8, !dbg !7000
  switch i32 %6, label %sw.default [
    i32 0, label %sw.bb
    i32 1, label %sw.bb13
    i32 2, label %sw.bb22
  ], !dbg !7001

I generated the IR with -O0 -g -fno-discard-value-names.

The variable m_check_state appears several times in the source code. Its IR name is m_check_state everywhere except here, where it is named m_check_state8.

Why does LLVM change a variable’s name slightly?

There is no guarantee about the names of LLVM values; they can be arbitrary, depending on how the frontend (in this case, Clang) chooses to generate them. In a release build (of LLVM), those names can even be just numbers.

Just in case you are wondering, value names are not how LLVM passes debug information down from the source code. LLVM uses special metadata to pass source information, such as line numbers and source variable names, to later stages of the compiler pipeline.

Thank you! But why does LLVM use arbitrary names rather than the original names? To me, it’s a little weird to append unexpected numbers to form a new name.

LLVM IR uses SSA form, which means that each named register can only be defined in a single place (per function).

So Clang is probably suggesting “%m_check_state” as the name, but that name has already been used in this function, so LLVM keeps incrementing a numeric suffix until it lands on a name that is still unused: “%m_check_state8”.
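
To make the renaming concrete, here is a minimal sketch of that kind of name uniquing. It is a simplified model and not LLVM's actual code: the class NameTable and its method makeUnique are hypothetical names made up for illustration. It also assumes a single counter shared by all names in a function, which would be consistent with the suffixes 8, 13 and 22 increasing together in the IR above.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical, simplified model of per-function value naming.
// This is NOT LLVM's real symbol table, only an illustration.
class NameTable {
  std::unordered_set<std::string> used_;  // names already taken in this function
  unsigned last_unique_ = 0;              // assumption: one counter shared by all names

public:
  // Returns `suggested` if it is still free; otherwise appends the next
  // counter value to `suggested` until the result is unused.
  std::string makeUnique(const std::string &suggested) {
    std::string name = suggested;
    while (used_.count(name) != 0) {
      name = suggested + std::to_string(++last_unique_);
    }
    bool inserted = used_.insert(name).second;
    assert(inserted && "name must be unused at this point");
    (void)inserted;  // silence unused-variable warnings when asserts are disabled
    return name;
  }
};
```

In this model, the first request for "m_check_state" gets the plain name, and every later request for an already used name gets a numbered variant. Because the counter is shared, the number depends on how many collisions happened earlier in the function, not on how often that particular name was requested.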
