AST transformations for deterministic parallel extensions

I could use a little advice....
I am implementing some parallel extensions using clang, one of which is a parallel foreach of the form:

void doParallelForeach() {
    foreach (int i in 0, 10) {
        printf("Parallel %d\n", i); // runs in parallel
    }
}

I want to use Intel TBB as the runtime to accomplish this. A TBB parallel for looks like:

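struct parallel {
    void operator()( const tbb::blocked_range<int>& range ) const {
        // TBB hands each task a sub-range, so iterate over all of it
        for (int i = range.begin(); i != range.end(); ++i)
            printf("Parallel %d\n", i);
    }
};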

void doParallelTBB() {
    tbb::parallel_for(tbb::blocked_range<int>(0, 10), parallel() ); // standard tbb
}

So I closely followed ForStmt and made a

class ForeachStmt : public Stmt

that captures the index and the body etc.

Then I modified the Parser and added Sema::ActOnForeachStmt

So my question is this: how do I transform this ForeachStmt at this point
in order to:
     1. Inject the struct and its operator() definition into the proper (global?) scope, with the body of
the foreach *moved* there?
     2. Replace the foreach with code that creates the range and calls the parallel_for template that is defined in a TBB header?
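
For concreteness, the output of such a transformation might look roughly like the sketch below. It is only an illustration under stated assumptions: IntRange and parallel_for_stub are hypothetical stand-ins for tbb::blocked_range<int> and tbb::parallel_for so that it builds without TBB, the names foreach_body_0 and doParallelForeachLowered are made up, and the foreach body writes into a local array instead of calling printf so that the result can be checked.

```cpp
#include <vector>

// Hypothetical stand-in for tbb::blocked_range<int>: the half-open range [lo, hi).
struct IntRange {
    int lo, hi;
    int begin() const { return lo; }
    int end() const { return hi; }
};

// Hypothetical stand-in for tbb::parallel_for. It splits the range in two and
// runs both halves serially, so the sketch compiles without TBB.
template <typename Body>
void parallel_for_stub(const IntRange& r, const Body& body) {
    int mid = r.lo + (r.hi - r.lo) / 2;
    body(IntRange{mid, r.hi});  // upper half first: the body must not rely on order
    body(IntRange{r.lo, mid});
}

// What the lowering would emit at namespace scope: the foreach body moved into
// operator(), with the referenced local `out` turned into a member.
struct foreach_body_0 {
    int* out;
    void operator()(const IntRange& range) const {
        for (int i = range.begin(); i != range.end(); ++i)
            out[i] = i * i;  // the original foreach body
    }
};

// What the enclosing function would look like once the foreach is replaced.
std::vector<int> doParallelForeachLowered() {
    std::vector<int> out(10);
    // was: foreach (int i in 0, 10) { out[i] = i * i; }
    parallel_for_stub(IntRange{0, 10}, foreach_body_0{out.data()});
    return out;
}
```

The point the sketch tries to bring out is that any local the foreach body refers to (here `out`) has to be carried into the generated struct, so the transformation needs to know which variables the body uses, not just where the body is.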

I should also ask, as a strong systems programmer without a compiler background: am I on the right path and working in the right
parts of clang, or are there other, better ways to get this done?

Thanks...

D