This seems better than removing the OutlineableOpenMPOpInterface. I am still not sure whether hoisting the allocas outside the omp.distribute would work correctly. But it seems you have this all worked out.
I think that shouldn’t be an issue, because the distribute construct only distributes the iterations of its associated loop across the teams spawned by the enclosing teams construct. That is the same relationship that do/for has with the threads created by parallel. As I understand it, distribute and do constructs don’t really imply a new data scope, but teams and parallel do. Since we wouldn’t be reordering these parallelism-generating constructs, allocations would still happen at the expected thread scope.
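To make the scoping argument concrete at the source level, here is a minimal C sketch. It is not the actual lowering or anything taken from the patch; the names `sum_squares`, `scratch`, and `N` are made up for illustration. The temporary is declared inside the teams region but outside the distribute loop, which is roughly where a hoisted alloca would end up:

```c
#define N 16

/* Hypothetical example: sums the squares of 0..N-1. */
int sum_squares(void) {
    int total = 0;

    /* teams creates the league; each team's initial thread runs this block. */
    #pragma omp teams reduction(+:total)
    {
        /* Declared inside teams but outside distribute, so each team's
         * initial thread gets its own copy. This stands in for an
         * allocation hoisted out of the distribute region. */
        int scratch;

        /* distribute only divides the iterations among the existing teams;
         * it spawns no new threads, so scratch is still the team's copy. */
        #pragma omp distribute
        for (int i = 0; i < N; ++i) {
            scratch = i * i;
            total += scratch;
        }
    }
    return total;
}
```

This assumes a compiler that accepts a teams construct outside of target (host teams, OpenMP 5.0). If OpenMP is not enabled, the pragmas are ignored and the loop simply runs serially with the same result.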