[RFC] [tools] Changing Behavior of LLVM binutils When No File Is Specified

Some binutils (nm comes to mind) default to a.out when no input file is specified. Others do not and read from stdin by default. The rest of this email refers specifically to the tools that read from stdin, not to the tools that, for various reasons, behave differently.

I propose that we change the behavior of these tools to use a.out when appropriate. By appropriate I mean that no input file is specified and stdin is not redirected. These are the file types of stdin in each scenario:

$ writes-to-stdout | prog # pipe (fifo)
$ prog < file # regular file
$ prog # reading from tty, character device

Perhaps if stdin is a pipe or a regular file, the default behavior should stay as it always was, reading from stdin, but if stdin is a tty we should use a.out as the default file. This lets these tools act the same as their GNU counterparts (when meaningful) and also adds what I think is a convenience: not having to specify a.out.
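
To make the three cases concrete, here is a minimal POSIX-only sketch that classifies stdin by its file type using fstat. The helper names classifyMode and classifyStdin are hypothetical and are not part of LLVM; this only illustrates the distinction and is not a proposed implementation.

```cpp
#include <string>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: map an st_mode value to the three cases listed above.
std::string classifyMode(mode_t Mode) {
  if (S_ISFIFO(Mode))
    return "fifo";    // writes-to-stdout | prog
  if (S_ISREG(Mode))
    return "regular"; // prog < file
  if (S_ISCHR(Mode))
    return "chardev"; // prog (stdin is the terminal)
  return "other";
}

// Hypothetical helper: classify this process's own stdin.
std::string classifyStdin() {
  struct stat St;
  if (fstat(STDIN_FILENO, &St) != 0)
    return "unknown"; // e.g. stdin was closed by the parent
  return classifyMode(St.st_mode);
}
```

Note that a character device is not necessarily a terminal: `prog < /dev/null` also reports a character device, so isatty(STDIN_FILENO) is the more precise test for "reading from a tty".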

This proposal would look something like this:

$ llvm-strings # not meaningful to read from stdin here, look for a.out
$ llvm-strings < file # use stdin
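
As a rough sketch of the decision being proposed, the logic could look like the following. InputKind and chooseInput are hypothetical names used only for illustration, and the caller is assumed to supply the "stdin is a terminal" flag (for example from isatty(STDIN_FILENO) on POSIX).

```cpp
#include <string>
#include <vector>

// Hypothetical names for illustration; not LLVM API.
enum class InputKind { NamedFiles, Stdin, DefaultAOut };

// Decide where input comes from under the proposal:
//  - explicit file arguments always win;
//  - with no arguments and redirected stdin, keep reading stdin;
//  - with no arguments and stdin on a terminal, fall back to a.out.
InputKind chooseInput(const std::vector<std::string> &Files,
                      bool StdinIsTerminal) {
  if (!Files.empty())
    return InputKind::NamedFiles;
  return StdinIsTerminal ? InputKind::DefaultAOut : InputKind::Stdin;
}
```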

I may have got the behavior of stream redirection wrong here, or missed a situation when reading from the terminal is useful. I would love some feedback.

Best,
Alex

Does anyone actually use the default to a.out behavior? I think it would be much friendlier to just print “file or pipe expected” and then print the help.

- Michael Spencer

Does anyone actually use the default to a.out behavior?
This is a good point. What bugs me is consistency across the tools. I agree with you that defaulting to a.out isn’t particularly useful, but we are kind of stuck with the weird way that GNU’s binutils do things. I am not in favor of llvm-objdump defaulting to a.out while llvm-readelf gives this warning message. My guess is that moving away from a.out as the default for llvm-objdump, llvm-nm, and the others whose GNU counterparts do this would be more disruptive than my proposal.

As Jake pointed out, we use “-” to denote stdin/stdout and GNU’s tools do not, so I think there is some precedent for us slightly modifying behavior when we can reasonably say a situation will not arise, like a file named “-” existing.

GNU addr2line, nm, objdump, and size default to a.out when no input file is specified.
Among LLVM binary utilities, llvm-nm, llvm-objdump, llvm-size, and llvm-dwarfdump default to a.out.

I agree with Michael that the a.out behavior may not be used by many people. If people don’t care too much about these utilities’ compatibility with GNU, deleting the a.out default LGTM.

(I am opposed to making more utilities default to a.out.)

Sounds good. Shall I work on removing these in favor of defaulting to stdin, then?

I think it would be much friendlier to just print “file or pipe expected” and then print the help.

Do you have thoughts on this? I’m not sure there is a clean way to do it. The cleanest would be through getFileOrSTDIN(), but I’m not sure all of its users want this behavior. Personally I don’t think we need it, but if you think it’s a good quality-of-life change to the tools and worth working on, I’d be happy to do so.

I agree with others that the a.out behaviour is weird (I’ve even thought this about the linker output being called a.out since I started programming, but perhaps that’s a different story). The use-case I can imagine is something like:

$ ld.lld test.o
$ llvm-objdump -d

I.e. using a tool immediately after generating the linker output. However, I don’t think anybody is likely to actually do this, and I think it’s probably safe to change the behaviour here. Frankly, I’d rather be consistent across the tools (e.g. always read from stdin) than do this. An error message about a missing input file might be nice in some cases (llvm-readelf, llvm-objdump, llvm-nm), but I think we definitely want to support stdin redirection, if for nothing else than test purposes. If it’s complicated to achieve both the error in the non-redirection case and no error in the redirection case, then I’d prefer the latter.

Regards,

James

As James and Michael suggested, I would prefer the default behavior to be consistent across tools (either always read from stdin, or have no default at all and give an error message).

I think consistency is key and agree that people are likely not using this anywhere. It’s better to go after the ideal first, but we need to be willing to offer some path forward on these tools if it turns out that large code bases that can’t be easily modified use this trick. So I suppose we should do it but be willing to switch back.

I have been working towards this in D63859. My current route is to add an optional callback to MemoryBuffer::getFileOrSTDIN, which is executed if stdin has not been redirected. James and I were talking over there and are starting to think this might not be the best solution. The alternative in my mind is that the tools which want this behavior explicitly test Process::StandardInIsUserInput themselves rather than letting getFileOrSTDIN do it. This might be better than passing a callback to getFileOrSTDIN that perhaps not all tools will use anyway. I am leaning towards doing it outside of getFileOrSTDIN. Does anyone have a preference either way, or a better solution than these two?
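
To compare the two shapes side by side, here is a self-contained sketch that deliberately uses no LLVM headers. loadFileOrStdin is a stand-in for a helper like MemoryBuffer::getFileOrSTDIN, and loadWithCallback, toolOpenInput, and the StdinIsTerminal flag are hypothetical names; in a real tool that flag would presumably come from Process::StandardInIsUserInput. The signatures here are assumptions for illustration, not the ones in the patch.

```cpp
#include <functional>
#include <string>

// Stand-in for a buffer-loading helper. It performs no I/O and only reports
// which source it would read; "-" is treated as stdin.
std::string loadFileOrStdin(const std::string &Name) {
  return Name == "-" ? "<stdin>" : Name;
}

// Option A: the helper itself takes an optional callback and invokes it when
// it is about to read from a stdin that has not been redirected.
std::string loadWithCallback(const std::string &Name, bool StdinIsTerminal,
                             const std::function<void()> &OnInteractiveStdin) {
  if (Name == "-" && StdinIsTerminal && OnInteractiveStdin)
    OnInteractiveStdin();
  return loadFileOrStdin(Name);
}

// Option B: the helper stays unchanged and each tool that wants the check
// performs it explicitly before calling the helper.
bool toolOpenInput(const std::string &Name, bool StdinIsTerminal,
                   std::string &Source, std::string &Err) {
  if (Name == "-" && StdinIsTerminal) {
    Err = "file or pipe expected";
    return false;
  }
  Source = loadFileOrStdin(Name);
  return true;
}
```

In option B, tools that do not care about the terminal case pay nothing and see no new parameter, which is the main argument for keeping the check outside the shared helper.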

Best,
Alex

I created a patch for just llvm-nm for now on D64290.

One topic that came up at the EuroLLVM round table was configure script compatibility: even if no human ever runs “ld foo.o; objdump -d”, there might be a configure script that tests for binutils that way. However, I don’t often use configure scripts, so I can’t say how real this concern is.

In Alex’s attempts to update llvm-objdump to remove a.out support, Saleem raised some objections. @Saleem, could you give some more details here about your objections, and where you think these use cases occur in practice, please?

@Alex, given Saleem’s objections, I think you should revert r365889 for now, so that there’s no risk of it getting into the release branch whilst this is still under discussion. I think the worst case is for us to end up with a mixture of behaviour (i.e. some tools keeping GNU compatibility here, and others not), in a release.

James