Using Lisp to refactor C++

Hi,

I’m writing a general C++ refactoring tool in Lisp that I’m currently calling “Improve”.

It combines the dynamic, interactive language Lisp with the most excellent clang C++ matcher/refactoring library - it’s like a digital Reese’s peanut butter cup! (Apologies to those with nut allergies.)

An example Improve script is below - anyone who uses the C++ ASTMatcher library should recognize what I’m up to.
The session generated from the script is available at: https://dl.dropboxusercontent.com/u/6229900/session.log
Search for “REPL echo” to jump from command to command within the output.

Improve has the capabilities of “clang-query” but it also lets you write short scripts to do source-to-source transformation.
It lets you quickly and interactively write ASTMatchers that bind nodes, and then access those nodes from small blocks of code that display information about them or generate replacements. It’s not pretty at the moment - the idea is a workshop/test bench for interactively querying and modifying C++ source code.

It’s based on clang’s ASTMatcher/Refactoring library (thanks klimek, chandlerc, sbenza, pcc - I haven’t met you guys but hopefully I’ll make it to an LLVM meeting this year) with a bunch of Lisp code to eliminate the C++ boilerplate required to write refactoring tools - and did I mention it’s interactive!

I’ll be open-sourcing it as soon as I finish using it to implement moving garbage collection in the Lisp system/compiler that I wrote and that hosts Improve.

Thoughts and comments are welcome.

Best,

.Chris.

Christian Schafmeister
Associate Professor
Chemistry Department
Temple University

;;
;; Load the tooling code
;; Currently use load so we can edit/reload the code during development
;; Later just (require 'clang-tool)
;;
(load "src:lisp;clang-tool.lsp")

;;
;; Load the JSONCompilationDatabase
;; This will fill the global variable $* with a list of all the source files in the database
;;
(load-compilation-database "src:main;compile_commands.json")

;;
;; Set up a subset of 10 source filenames in $TEST to search over interactively
;; You can set up any number of source filename lists to run matchers over
;; for interactive matcher development
(lclear $test)
(ladd $test (subseq $* 0 10))

;;
;; Load the C++ ASTs for the filenames in $TEST
(load-asts $test)

#|
A demo ASTMatcher.
I’m looking for fields in classes/structs where the class/struct does not inherit from StackBoundClass or GCObject
and the field has a type that contains smart_ptr’s that would be stored on the heap,
such as vector<XXX> where XXX is a struct/class that contains smart_ptr’s.
The smart_ptr’s in question will be garbage collected when I don’t want them to be because
they won’t be connected to the root.
|#

(defparameter heap-smart-ptr-matcher
  '(:field-decl
    (:has-decl-context
     (:record-decl
      (:bind :outer-decl (:record-decl))
      (:unless (:any-of
                (:is-derived-from (:matches-name ".*StackBoundClass.*"))
                (:is-derived-from (:matches-name ".*GCObject.*"))))))
    (:has-type
     (:has-declaration
      (:class-template-specialization-decl
       (:bind :named-decl (:named-decl))
       (:has-any-template-argument
        (:refers-to-type
         (:has-declaration
          (:record-decl
           (:bind :recdecl (:record-decl))
           (:is-derived-from
            (:record-decl
             (:for-each
              (:field-decl
               (:bind :leaf (:field-decl))
               (:has-type
                (:has-declaration
                 (:class-template-specialization-decl
                  (:matches-name ".*smart_ptr.*")))))))))))))))))

;;
;; Run the matcher on the currently loaded subset of ASTs
;; Using just the loaded subset of ASTs allows the matching to be fast and
;; enables interactive development of matchers. Once a matcher is written
;; it can be run on all source files using BATCH-MATCH-RUN (see below).
;;
;; Run code on each match, extracting info on the bound nodes using
;; mtag-xxx functions that take a node TAG that corresponds to a (:bind :TAG (NODE))
;; command in the matcher.
;; Print info on each match

(match-run heap-smart-ptr-matcher
           :code #'(lambda ()
                     (format t "MATCH: ------------------~%~a~% :whole source-> ~a~%:leaf ~a~%~a~%"
                             (mtag-loc-start :whole)
                             (mtag-source :whole)
                             (mtag-source :leaf)
                             (list ":named-decl" (get-name (mtag-node :named-decl)))))
           ;; :limit 10 ; This would limit the max number of matches processed to 10
           )

;;
;; Run the matcher on source files one at a time - this allows us to run
;; matchers on lots of source files without having to load all of their ASTs into
;; memory at one time

(batch-match-run heap-smart-ptr-matcher
                 :filenames $* ; $* is a global variable that contains a list of all source files
                 :code #'(lambda ()
                           (format t "MATCH: ------------------~%~a~% :whole source-> ~a~%:leaf ~a~%~a~%"
                                   (mtag-loc-start :whole)
                                   (mtag-source :whole)
                                   (mtag-source :leaf)
                                   (list ":named-decl" (get-name (mtag-node :named-decl)))))
                 )
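
To make the demo matcher's target concrete, here is a minimal C++ sketch of the kind of code it is meant to flag. Everything in it is a hypothetical stand-in: the real smart_ptr, GCObject and StackBoundClass are not shown in this post, and Symbol, BindingBase, Binding, Flagged, Traced, OnStack, is_rooted and demo are names invented for illustration. Binding inherits its smart_ptr field from a base class because the matcher, as written, applies :is-derived-from to the template argument's record.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-ins for the real classes (assumptions, not the actual code).
struct GCObject { virtual ~GCObject() = default; };
struct StackBoundClass {};

template <typename T>
struct smart_ptr { T* raw = nullptr; };  // stand-in for the GC-managed pointer

struct Symbol : GCObject { int id = 0; };

// A record with a smart_ptr field (the kind of field bound as :leaf).
struct BindingBase { smart_ptr<Symbol> key; };

// The XXX of the description: it contains a smart_ptr via its base class.
struct Binding : BindingBase { int value = 0; };

// The case of interest: not derived from GCObject or StackBoundClass, yet the
// vector stores Binding objects (and their smart_ptr's) on the heap.
struct Flagged { std::vector<Binding> bindings; };

// Excluded by the :unless clause: derived from GCObject or StackBoundClass.
struct Traced : GCObject { std::vector<Binding> bindings; };
struct OnStack : StackBoundClass { std::vector<Binding> bindings; };

// Compile-time check that roughly mirrors the :unless (:any-of ...) clause.
template <typename T>
constexpr bool is_rooted() {
    return std::is_base_of<GCObject, T>::value ||
           std::is_base_of<StackBoundClass, T>::value;
}

// Builds a Flagged value and returns how many bindings it holds.
inline std::size_t demo() {
    Flagged f;
    f.bindings.push_back(Binding{});
    assert(!is_rooted<Flagged>());
    return f.bindings.size();
}
```

In this sketch, Flagged::bindings is the field the description is after, while the same field in Traced and OnStack would be skipped because of their base classes.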