I want to measure the execution time of one specific library function, for example a CUDA library function.
My idea is: 1) use clang to compile the library function to LLVM IR;
2) instrument the IR. This is the part I am unsure about: I have no idea how to insert instructions that measure execution time. Is this approach even feasible? If there is a function that returns a timestamp, I would like to insert a call to it at the beginning of the library function to record the start time, insert another call at the end to record the stop time, and then compute the interval between the two timestamps.
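To make the start/stop timestamp idea concrete, here is a minimal source-level sketch of what the instrumentation would need to do, written as a wrapper around the call rather than as an IR pass. Everything named here is an assumption for illustration: `library_function` is a hypothetical stand-in for the real library call, `time_call_ns` is a helper made up for this sketch, and `std::chrono::steady_clock` is just one possible timestamp source on the host.

```cpp
#include <chrono>
#include <utility>

// Hypothetical stand-in for the library function to be timed.
static long long library_function(int n) {
    long long sum = 0;
    for (int i = 0; i < n; ++i) sum += i;
    return sum;
}

// Hypothetical helper: takes a start timestamp, calls f(args...),
// takes a stop timestamp, and returns the interval in nanoseconds.
template <typename F, typename... Args>
long long time_call_ns(F&& f, Args&&... args) {
    auto start = std::chrono::steady_clock::now();   // start-time timestamp
    std::forward<F>(f)(std::forward<Args>(args)...); // the call being measured
    // Note: a GPU library call may return before the device work finishes,
    // so a synchronization would be needed here before taking the stop time.
    auto stop = std::chrono::steady_clock::now();    // stop-time timestamp
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
        .count();
}

// Example use: long long ns = time_call_ns(library_function, 1000000);
```

An IR-level version would presumably insert the same two timestamp calls at the function entry and before each return. The wrapper above only measures host wall-clock time around the call, and an optimizing compiler may shorten or remove work whose result is unused, so the numbers are only meaningful when the measured call has observable effects.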