Tracking a source-code variable's value in LLVM IR

Hi, I am new to both LLVM and its IR. My problem is this: given the variable b and its line number 6 in demo.cpp, how can I get its actual value with an LLVM pass?

Source code:

#include <iostream>

int main() {
    int b;
    int a = 123;
    b = a;  // line 6
    return 0;
}

I used the command clang -O0 -g -S -emit-llvm demo.cpp -o demo.ll to produce demo.ll.

The IR of the main function:

; Function Attrs: mustprogress noinline norecurse nounwind optnone uwtable
define dso_local i32 @main() #4 !dbg !857 {
  %1 = alloca i32, align 4
  %2 = alloca i32, align 4
  %3 = alloca i32, align 4
  store i32 0, i32* %1, align 4
  call void @llvm.dbg.declare(metadata i32* %2, metadata !858, metadata !DIExpression()), !dbg !859
  call void @llvm.dbg.declare(metadata i32* %3, metadata !860, metadata !DIExpression()), !dbg !861
  store i32 123, i32* %3, align 4, !dbg !861
  %4 = load i32, i32* %3, align 4, !dbg !862
  store i32 %4, i32* %2, align 4, !dbg !863  ; HERE is the assignment
  ret i32 0, !dbg !864
}

The metadata at the bottom of the IR (I don't know what these entries are, but I think they are important):

My pass.cpp is incomplete:

virtual bool runOnFunction(Function &F) {
  LLVMContext &Ctx = F.getContext();
  // errs() << "Function: ";
  // errs().write_escaped(F.getName()) << '\n';

  for (auto &B : F) {
    for (auto &I : B) {
      if (auto *op = dyn_cast<LoadInst>(&I)) {
        // Insert *after* `op`.
        IRBuilder<> builder(op);
        builder.SetInsertPoint(&B, ++builder.GetInsertPoint());
        // builder.CreateLoad();  // HERE I don't know how to write its type.
      }
    }
  }

  return false;
}

Could you tell me how to finish pass.cpp? Or give me some advice on these questions:

  • If I have variable name b and line number 6, how should I do to get the true value 123?
  • How can I link line number of source code to IR instructions by LLVM pass?

Please help me! Any document or material or comment will help a lot! Thank you!

  1. Regarding the comment // HERE I don't know how to write its type. in your source code:

    You can use your IRBuilder<> builder(op); to get types. For example, builder.getInt32Ty() returns the LLVM i32 type (see LLVM: llvm::IRBuilderBase Class Reference).

  2. to “How can I link line number of source code to IR instructions by LLVM pass?”:

    Are you aware that the IR instructions in your example are already linked to the line numbers of the source code? The instruction %4 = load i32, i32* %3, align 4, !dbg !862, for example, carries the metadata !862, and from your screenshot I can conclude that it corresponds to line 6 of the source code. You can get the metadata of an instruction with the getMetadata method, which takes a metadata kind, e.g. op->getMetadata("dbg") or op->getMetadata(LLVMContext::MD_dbg) inside your pass.cpp (a bare number such as op->getMetadata(1) selects a different kind, not !dbg). It returns an MDNode (see LLVM: llvm::MDNode Class Reference), which for !dbg is a DILocation, and from this you can get the information that is dumped in your screenshot. op->getDebugLoc() is a shortcut that gives you the line and column directly.

  3. to “If I have variable name b and line number 6, how should I do to get the true value 123?”:

    To get the int value 123 you have to find the instruction store i32 123, i32* %3, align 4, !dbg !861 in your IR example. Assuming you found this instruction and hold it as StoreInst *store, you can use

    if (auto *constant_int = dyn_cast<ConstantInt>(store->getValueOperand())) {
      int64_t number = constant_int->getSExtValue();
    }
    

    to get the integer value 123. The dyn_cast returns null when the stored value is not a constant, so it has to be checked. For exactly your example, the following should find this store instruction:

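    int64_t number = 0;

    // op is the LoadInst; its pointer operand is the alloca of a (%3).
    if (auto *alloca = dyn_cast<AllocaInst>(op->getPointerOperand())) {
      for (auto *user : alloca->users()) {
        // Consider only stores *to* the alloca.
        auto *store = dyn_cast<StoreInst>(user);
        if (!store || store->getPointerOperand() != alloca)
          continue;
        // The stored value is only readable here if it is a constant.
        if (auto *constant_int = dyn_cast<ConstantInt>(store->getValueOperand()))
          number = constant_int->getSExtValue();
      }
    }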
    

    This works only for this exact example, because there is only one load. If there are multiple loads and you know that the load you are looking for corresponds to line 6, you need to read the metadata of each instruction and check whether it refers to that line.


Thank you so much! I really cannot express my gratitude for your kind and patient answer. I will read your reply carefully later.
Thanks a lot!

Sorry to disturb you again. I spent a whole day on your reply; your code works well and helps a lot. But I still have three problems:

  • getting the metadata correctly;
  • getting a string type;
  • getting a dynamic value.

The demo.cpp example in my first post is too simple. What I actually want to express is that the value of variable b cannot always be read statically from the IR, as the code below shows:

#include <iostream>

int main() {
    int b;
    int a;
    std::cin >> a;
    b = a;  // line 7
    return 0;
}

IR of the main function:

; Function Attrs: mustprogress noinline norecurse optnone uwtable
define dso_local i32 @main() #4 !dbg !857 {
  %1 = alloca i32, align 4
  %2 = alloca i32, align 4
  %3 = alloca i32, align 4
  store i32 0, i32* %1, align 4
  call void @llvm.dbg.declare(metadata i32* %2, metadata !858, metadata !DIExpression()), !dbg !859
  call void @llvm.dbg.declare(metadata i32* %3, metadata !860, metadata !DIExpression()), !dbg !861
  %4 = call nonnull align 8 dereferenceable(16) %"class.std::basic_istream"* @_ZNSirsERi(%"class.std::basic_istream"* nonnull align 8 dereferenceable(16) @_ZSt3cin, i32* nonnull align 4 dereferenceable(4) %3), !dbg !862
  %5 = load i32, i32* %3, align 4, !dbg !863
  store i32 %5, i32* %2, align 4, !dbg !864
  ret i32 0, !dbg !865
}

; METADATA
!857 = distinct !DISubprogram(name: "main", scope: !8, file: !8, line: 11, type: !539, scopeLine: 11, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !7, retainedNodes: !9)
!858 = !DILocalVariable(name: "b", scope: !857, file: !8, line: 12, type: !20)
!859 = !DILocation(line: 12, column: 9, scope: !857)
!860 = !DILocalVariable(name: "a", scope: !857, file: !8, line: 13, type: !20)
!861 = !DILocation(line: 13, column: 9, scope: !857)
!862 = !DILocation(line: 14, column: 14, scope: !857)
!863 = !DILocation(line: 15, column: 9, scope: !857)
!864 = !DILocation(line: 15, column: 7, scope: !857)
!865 = !DILocation(line: 16, column: 5, scope: !857)

I need to read the dynamic value, so I want to instrument the code with calls to a logvar function:

#include <iostream>
#include <string>

extern "C" void logvar(int i, std::string name) {
    std::cout << "Num: " << i << "; Name: " << name << std::endl;
}
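
A side note on the signature above: an extern "C" function that takes std::string by value is awkward to call from instrumented IR, because there is no simple IR type for std::string. One possible alternative, sketched here under the assumption that the logvar signature may be changed, is to pass the name as a const char *, which is just a pointer at the IR level. formatLog is a hypothetical helper introduced only so that the formatting can be checked without capturing stdout.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper: builds the log line for one variable.
static std::string formatLog(int value, const char *name) {
  std::ostringstream out;
  // Guard against a null name so the runtime never dereferences nullptr.
  out << "Num: " << value << "; Name: " << (name ? name : "<unnamed>");
  return out.str();
}

// C-ABI entry point for the instrumented code: an int and a C string.
extern "C" void logvar(int value, const char *name) {
  std::cout << formatLog(value, name) << std::endl;
}
```

With this signature the pass only needs an i32 and a pointer type for the two parameters.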

And now, my pass.cpp is as follows:

virtual bool runOnFunction(Function &F) {
      // Get the function to call from our runtime library.
      LLVMContext &Ctx = F.getContext();
      std::vector<Type*> paramTypes = {
        Type::getInt32Ty(Ctx),
        // Here I need a string type to match the logvar function,
        // but I don't know how to write it.
      };
      Type *retType = Type::getVoidTy(Ctx);
      FunctionType *logFuncType = FunctionType::get(retType, paramTypes, false);
      FunctionCallee logFunc = 
         F.getParent()->getOrInsertFunction("logvar", logFuncType);

      for (auto &B : F) {
        for (auto &I : B) {      
          if (auto *op = dyn_cast<LoadInst>(&I)) {  // timo_schalachter
            int number;
            auto alloca = dyn_cast<AllocaInst>(op->getOperand(0));
            for (auto user: alloca->users()) {
              if (auto store = dyn_cast<StoreInst>(user)) {
                auto constant_int = dyn_cast<ConstantInt>(store->getOperand(0));
                number = constant_int->getSExtValue();
                errs() << number << "\n";
              }
            }
          }

          if (auto *op = dyn_cast<StoreInst>(&I)) {            
            errs() << *op << ".StoreInst\n";
            Value *val = op->getValueOperand();
            if (auto constant_int = dyn_cast<ConstantInt>(val)) {
              int number = constant_int->getSExtValue();
              errs() << number << ".\n\n";
            } else if (auto constant_fp = dyn_cast<ConstantFP>(val)) {
              float number = constant_fp->???;
              // I cannot find the function to get the floating-point value here.
            } // And how do I deal with a constant string?

            // metadata: 
            // store i32 %4, i32* %2, align 4, !dbg !863
            Value *arg1 = op->getOperand(0);  // %4 = xxx
            Value *arg2 = op->getOperand(1);  // %2 = xxx

            unsigned mk = op->getContext().getMDKindID("dbg");
            MDNode *mdn = op->getMetadata(mk);

            if (mdn) {
              Metadata *mds = mdn->getOperand(0);
              StringRef str;
              if (MDString::classof(mds)) {
                str = (cast<MDString>(*mds)).getString();
                errs() << str;
              }
            } else {
              errs() << "no dbg!\n";
              // When I run this code, it always prints "no dbg!".
              // I don't know why...
            }

            // instrumentation
            IRBuilder<> builder(op);
            builder.SetInsertPoint(&B, ++builder.GetInsertPoint());

            Value* args[] = {arg1, ???};
            // I cannot find the name of the variable.
            // I think it should be the str from the metadata,
            // but I cannot get it.
            builder.CreateCall(logFunc, args);
          }
        }
      }
      return false;
    }

The problems are marked in the code comments. I find it hard to explain myself clearly in English.

If you know the answer, please help me when you are available.
I am really new to LLVM, so when I read the LLVM documentation I often miss the key points or cannot find the function I need. For example, I spent a lot of time searching for how to use the method op->getMetadata(1) mentioned in the reply, and unfortunately it seems I still cannot get the metadata properly.

Any links, code snippets, or even key phrases that I can search for myself are appreciated! Thank you very much!