[cfe-commits] r154321 - in /cfe/trunk: include/clang/Driver/CC1Options.td include/clang/Driver/Options.td include/clang/Frontend/CodeGenOptions.h lib/CodeGen/CGObjCGNU.cpp lib/Driver/Tools.cpp lib/Frontend/CompilerInvocation.cpp test/CodeGe

As requested...

r154321 contained a diff implementing a feature, requested by the Pajé developers, for visualising code flow in Objective-C programs on the GNU runtimes and tested by them after some discussion on the design.

Does anyone have any comments or objections?

David

It seems like this is unnecessary. Why not just provide a custom libobjc, interpose objc_msgSend, or such? I suppose this way you only get logs from /your/ code and not framework code...

(Also see http://www.dribin.org/dave/blog/archives/2006/04/22/tracing_objc/ -- there are a couple ways to trace ObjC messages /already/ in Apple's runtime...circa 2006, i.e. GNU runtime. Presumably these can be extended to non-Apple platforms as well.)

Jordy

New features should meet the criteria outlined in http://clang.llvm.org/get_involved.html#criteria . It is not at all obvious how this extension meets those criteria.

Technically, why isn't Pajé simply interposing an instrumentation library to gather this information, either at link time or at run time? That would be less invasive for both users and for Clang developers, and far more likely to work correctly in general.

At the larger level, why should we be adding a separate code-generation mode for one visualization library for one runtime? It's not even the dominant runtime by far, so we're serving a very tiny user base with this new feature.

  - Doug

It seems like this is unnecessary. Why not just provide a custom libobjc, interpose objc_msgSend, or such?
I suppose this way you only get logs from /your/ code and not framework code...

You're answering your own question. Producing visualisations of a subset of the program is useful. The performance hit from tracing everything can affect the semantics of complex code, so turning it on and off at a finer granularity than per-process is useful. It's also useful for the usability of the resulting visualisations if you only want to see how a few classes interact.

(Also see http://www.dribin.org/dave/blog/archives/2006/04/22/tracing_objc/ -- there are a couple ways to trace ObjC messages /already/ in Apple's runtime...circa 2006, i.e. GNU runtime. Presumably these can be extended to non-Apple platforms as well.)

The Apple runtime is not the GNU runtime; the two implement different ABIs.

David

New features should meet the criteria outlined in http://clang.llvm.org/get_involved.html#criteria . It is not at all obvious how this extension meets those criteria.

As I understand it, and understood it when they were being drafted, those are the criteria for a new language feature. This is not a new language feature. It has no impact on parsing or semantic analysis. It merely provides some profiling hooks, which were requested by the community, designed based on feedback from the community, and implemented in the LLVM coding style.

Technically, why isn't Pajé simply interposing an instrumentation library to gather this information, either at link time or at run time? That would be less invasive for both users and for Clang developers, and far more likely to work correctly in general.

Not true. With the traditional GNU two-stage method dispatch mechanism it is not possible to interpose anything between message sends at the library level (the runtime simply returns a method pointer, so you can't run code after the message send, only at some arbitrary time before it). Even with a NeXT-style dispatch mechanism it is difficult to get enter and exit events, because the objc_msgSend() functions tail-call the real implementation.

This would also make it very difficult to selectively trace part of the program. This can easily be done with the current implementation; it would be impossible with a library-based implementation (we had a prototype that worked that way, and it had numerous limitations even when integrated into the runtime).
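To make the two-stage dispatch point concrete, here is a toy model in plain C. It is only a sketch: every name in it (toy_msg_lookup, toy_send_plain, toy_send_traced and so on) is made up for illustration and is neither the GNU runtime's API nor what r154321 emits. It assumes a lookup function that hands back a method pointer which the caller then invokes itself, so code placed in the library can only run during the lookup, while hooks written into the call site can bracket the actual call.

```c
#include <assert.h>
#include <string.h>

/* Toy model only: none of these names are real runtime or Clang APIs. */
typedef struct toy_object { int value; } toy_object;
typedef int (*toy_imp)(toy_object *self, const char *sel);

/* Small event log so the ordering of events can be inspected. */
enum { TOY_LOG_MAX = 16 };
static const char *toy_log[TOY_LOG_MAX];
static int toy_log_len = 0;

static void toy_log_event(const char *what) {
    if (toy_log_len < TOY_LOG_MAX)
        toy_log[toy_log_len++] = what;
}

/* The method implementation that the lookup resolves to. */
static int toy_get_value(toy_object *self, const char *sel) {
    (void)sel;
    toy_log_event("method body");
    return self->value;
}

/* Stage 1: the "runtime" only returns a method pointer. Anything
 * interposed at the library level runs here, before the call, and
 * never sees the method return. */
static toy_imp toy_msg_lookup(toy_object *receiver, const char *sel) {
    (void)receiver;
    toy_log_event("lookup");
    return strcmp(sel, "value") == 0 ? toy_get_value : NULL;
}

/* Stage 2 happens in the caller: an uninstrumented call site. */
static int toy_send_plain(toy_object *obj) {
    toy_imp imp = toy_msg_lookup(obj, "value");
    assert(imp != NULL);
    return imp(obj, "value");
}

/* The same call site with hooks written around the call, which is
 * the kind of bracketing only the code at the call site can do. */
static int toy_send_traced(toy_object *obj) {
    toy_imp imp = toy_msg_lookup(obj, "value");
    assert(imp != NULL);
    toy_log_event("enter");
    int result = imp(obj, "value");
    toy_log_event("exit");
    return result;
}
```

Calling toy_send_plain() logs only "lookup" and "method body"; calling toy_send_traced() adds "enter" and "exit" around the body. Because the hooks live in the call site, they appear only in code that was compiled with them, which is where the selective tracing comes from.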

At the larger level, why should we be adding a separate code-generation mode for one visualization library for one runtime?

The patch adds hooks that any consumer can use. Pajé is the first to use them, but they can also be used by any other consumer of this information, including gprof-style tools.
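As a rough idea of what a second, gprof-style consumer might look like, here is a sketch in plain C. The hook name toy_trace_enter and its single selector argument are assumptions made for this example; the real hook names and signatures are whatever the patch defines. The sketch just counts how often each selector is entered.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical consumer: a fixed-size table of per-selector call counts. */
enum { PROFILE_MAX = 32 };

struct profile_entry {
    const char *selector;
    unsigned calls;
};

static struct profile_entry profile[PROFILE_MAX];
static int profile_len = 0;

/* Hypothetical "enter" hook that instrumented code would call.
 * It bumps the counter for the selector, adding a row on first use.
 * Selectors beyond PROFILE_MAX distinct entries are silently dropped. */
static void toy_trace_enter(const char *selector) {
    assert(selector != NULL);
    for (int i = 0; i < profile_len; i++) {
        if (strcmp(profile[i].selector, selector) == 0) {
            profile[i].calls++;
            return;
        }
    }
    if (profile_len < PROFILE_MAX) {
        profile[profile_len].selector = selector;
        profile[profile_len].calls = 1;
        profile_len++;
    }
}

/* Report how many times a selector was entered (0 if never seen). */
static unsigned profile_calls(const char *selector) {
    for (int i = 0; i < profile_len; i++) {
        if (strcmp(profile[i].selector, selector) == 0)
            return profile[i].calls;
    }
    return 0;
}
```

A visualisation tool and a profiler would differ only in what they do inside the hook; the instrumented program stays the same.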

It's not even the dominant runtime by far, so we're serving a very tiny user base with this new feature.

We're also not affecting any users of Apple runtimes, or, indeed, anyone who doesn't explicitly turn it on. This was presented in the GAP presentation at FOSDEM this year and there was a lot of interest from Objective-C developers in seeing it working.

Taking that argument to its logical conclusion, we're only serving a tiny userbase by supporting the GCC and GNUstep Objective-C runtimes at all, or by supporting, for example, platforms like NetBSD.

David

New features should meet the criteria outlined in http://clang.llvm.org/get_involved.html#criteria . It is not at all obvious how this extension meets those criteria.

As I understand it, and understood it when they were being drafted, those are the criteria for a new language feature. This is not a new language feature. It has no impact on parsing or semantic analysis. It merely provides some profiling hooks, which were requested by the community, designed based on feedback from the community, and implemented in the LLVM coding style.

I'll be happy to clarify the wording if it is actually unclear, but those criteria apply to any extension. Language extensions are called out specifically in several places because that's what I tend to hear most.

Technically, why isn't Pajé simply interposing an instrumentation library to gather this information, either at link time or at run time? That would be less invasive for both users and for Clang developers, and far more likely to work correctly in general.

Not true. With the traditional GNU two-stage method dispatch mechanism it is not possible to interpose anything between message sends at the library level (the runtime simply returns a method pointer, so you can't run code after the message send, only at some arbitrary time before it). Even with a NeXT-style dispatch mechanism it is difficult to get enter and exit events, because the objc_msgSend() functions tail-call the real implementation.

This would also make it very difficult to selectively trace part of the program.

Filtering for selective tracing seems like a rather critical feature for *any* tool that purports to provide whole-program analysis or visualization.

This can easily be done with the current implementation; it would be impossible with a library-based implementation (we had a prototype that worked that way, and it had numerous limitations even when integrated into the runtime).

The cost, of course, is that you have to recompile your application with a weird flag just to try out this one tool. That is a *horrible* workflow, even if it does provide modest gains.

There is clearly precedent for doing tracing of Objective-C applications without needing to hack up the compiler. You'll have to demonstrate conclusively why this is the best approach, because it feels far more like a hack to demonstrate that something can be done (which we don't want in the tree) than a feature that will benefit the Clang user community at large.

At the larger level, why should we be adding a separate code-generation mode for one visualization library for one runtime?

The patch adds hooks that any consumer can use. Pajé is the first to use them, but they can also be used by any other consumer of this information, including gprof-style tools.

This is very much an "if you build it, they will come" argument. That doesn't happen very often in the world of software, particularly with special-purpose compiler flags.

It's not even the dominant runtime by far, so we're serving a very tiny user base with this new feature.

We're also not affecting any users of Apple runtimes, or, indeed, anyone who doesn't explicitly turn it on. This was presented in the GAP presentation at FOSDEM this year and there was a lot of interest from Objective-C developers in seeing it working.

As Chandler noted, the audience at FOSDEM does not control what goes into Clang. That's the responsibility of the Clang community.

Taking that argument to its logical conclusion, we're only serving a tiny userbase by supporting the GCC and GNUstep Objective-C runtimes at all, or by supporting, for example, platforms like NetBSD.

Honestly, what portion of the GNUstep community is likely to ever make use of this flag, much less actually use it on a regular basis? 0.1% at most? Should we really be adding compiler features for such a small group? I don't think we should.

  - Doug

Put like that, you're probably right.

David