Can LLVM or Clang work backward to modify C source code?

Hi all,

I have some source code:
int* p = malloc(…);

After translating it to an llvm::Module, we can tell that the source code never calls free.
I know we can patch the missing call into the Module.

But my question is:
could we propagate that fix back to the source code?
After fixing, the source would become:
int *p = malloc(…);
free(p);

Is this feasible?

Thanks all very much.

Forgot to CC the list.

OK. Thank you.

What I want to do is fix the source code in place, then feed it to the compiler as usual.
I assumed I could fix the source in place using clang::SourceManager, but I could not find the appropriate API.
Now I know that approach is infeasible.

Thanks again.

2011/5/5 Joshua Warner <joshuawarner32@gmail.com>

OK. Thank you.

What I want to do is fix the source code in place, then feed it to the compiler as usual.

Yes, this is possible.

I assumed I could fix the source in place using clang::SourceManager, but I could not find the appropriate API.

You will want to use the Rewriter API in Clang: modify the code in a buffer, then tell Clang that the file on disk is out of date and that it should use the in-memory version.
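
To make the "modify the code in a buffer" step concrete, here is a minimal sketch in plain C. It is not Clang's Rewriter API; it only illustrates the underlying idea of inserting text at a known offset in an in-memory copy of the source. The helpers insert_text and offset_after_line are hypothetical names invented for this illustration, and a real tool would get its offsets from Clang source locations rather than from a string search.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not a Clang API: return a newly allocated copy of
   src with text inserted at byte offset off. The caller frees the result. */
char *insert_text(const char *src, size_t off, const char *text)
{
    size_t n = strlen(src);
    size_t m = strlen(text);
    assert(off <= n);

    char *out = malloc(n + m + 1);
    if (out == NULL)
        return NULL;

    memcpy(out, src, off);                          /* text before the edit */
    memcpy(out + off, text, m);                     /* the inserted text    */
    memcpy(out + off + m, src + off, n - off + 1);  /* the rest, plus NUL   */
    return out;
}

/* Hypothetical helper: byte offset just past the end of the line that
   contains the first occurrence of needle, or (size_t)-1 if it is absent.
   This string search stands in for a real source location. */
size_t offset_after_line(const char *src, const char *needle)
{
    const char *hit = strstr(src, needle);
    if (hit == NULL)
        return (size_t)-1;

    const char *nl = strchr(hit, '\n');
    return nl != NULL ? (size_t)(nl - src) + 1 : strlen(src);
}
```

For example, calling offset_after_line(src, "malloc(") and passing the result to insert_text together with "free(p);\n" yields a buffer with the free call on the line after the allocation. Deciding whether that is the correct place for it is the hard part.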

Now I know that approach is infeasible.

It isn't infeasible, and APIs exist for doing it. However, the general problem of inserting missing free() calls is hard. You will need to perform escape analysis on the Clang AST, which requires checking code flow across modules. If p is passed as an argument to a function, does the function store it somewhere? If you have an allocator function, then it will malloc() some memory, initialise it, and return it. Where do you insert the free() call?
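
The cases above can be sketched in a few lines of C. Every name here (stash, keep, make_counter, local_sum, demo) is hypothetical and chosen only for illustration. Only in the last function is "insert free(p) before the return" clearly correct; in the other two the right place depends on code outside the function.

```c
#include <assert.h>
#include <stdlib.h>

/* All names here are hypothetical; none of them come from the thread. */

static int *stash; /* a global that can keep a pointer alive */

/* Case 1: the callee stores its argument, so the pointer escapes.
   Freeing p right after calling keep(p) would leave stash dangling. */
void keep(int *q)
{
    stash = q;
}

/* Case 2: an allocator function. The memory must outlive this call,
   so no free() can be inserted here; ownership passes to the caller. */
int *make_counter(int start)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = start;
    return p;
}

/* Case 3: purely local use. The pointer never leaves the function,
   so freeing it before the return is safe. */
int local_sum(int a, int b)
{
    int *p = malloc(sizeof *p);
    if (p == NULL)
        return a + b;
    *p = a + b;
    int result = *p;
    free(p);
    return result;
}

/* Exercise the three cases; the only safe free() for the counter is the
   one issued after the last use through stash. */
void demo(void)
{
    int *c = make_counter(7);
    assert(c != NULL);
    keep(c);
    assert(stash == c && *stash == 7);
    free(stash);
    stash = NULL;
    assert(local_sum(2, 3) == 5);
}
```

If keep and make_counter lived in different translation units, a tool looking at one file at a time could not tell which of these situations it was in, which is why the analysis has to cross module boundaries.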

David

Thanks again.

Wen-Han,

It sounds like there are two problems here. The first is detecting when free is not properly called and figuring out where to insert the call, which seems like an intractable problem by itself. The second is a reversible translation from C to LLVM IR. LLVM does (did?) have support for generating C code again on the back end, but the result doesn't look much like the original code. In order to translate changes from the IR back to the source code, detailed information would have to be kept on which IR structures resulted from which source statements and how they interact. There has been work done on such reversible transformations (I can't recall where atm), but only in very simple cases that could be described by regular expressions.

So the short answer is "no."

-Joshua

--
Best regards,
Wen-Han

_______________________________________________
LLVM Developers mailing list
LLVMdev@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

_______________________________________________
cfe-dev mailing list
cfe-dev@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-dev

-- Sent from my Difference Engine