I was doing something similar last year and tried writing my own Fortran
lexer/parser and reusing some of the existing ones. I found it so hard that I
ended up rewriting the 800kLOC of Fortran code in a more modern language by
hand. Basically, the Fortran-related open-source tools are so poorly written
and unreliable that they are not worth using. AFAIK, the llvm-gfortran
compiler is just an LLVM backend on GCC's Fortran front-end. GCC is awful, so
I would not recommend trying to get anything sensible out of it.
One project I did have limited success with was g95-xml, a hacked version of
the GCC-based g95 compiler that can output the nearest thing Fortran has to
an AST as XML. The "First attempts" version that I used was a Perl
programmer's idea of a parse tree, though:
<statement id="0xbdf7b30" type="PROGRAM" loc="[0,6,0,18]"/>
<statement id="0xbdf8420" type="TYPE_DECLARATION" loc="[1,6,1,23]"
decl_type="0x705820" decl_kind="0xbdf7fe0" decl_symbols="0xbdf8290"/>
<statement id="0xbdf8f90" type="ASSIGNMENT" loc="[2,6,2,12]"
<expr id="0xbdf8100" type="VARIABLE" loc="[2,6,2,7]" symbol="0xbdf8290"/>
<expr id="0xbdf8b00" type="CONSTANT" loc="[2,10,2,12]" value="1.E+0"/>
<statement id="0xbdf9550" type="END_PROGRAM" loc="[3,6,3,17]"/>
The edges between nodes in the AST are represented by those hexadecimal values
(!). IIRC, after a lot of effort writing OCaml code to decipher that "XML", I
discovered that it did not, in fact, contain all of the information from the
source code and could not be used to perform the automated transformation
that I wanted.
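To give a rough idea of what deciphering those hexadecimal edges involves, here is a minimal sketch, in Python rather than OCaml, and not the code I used. It indexes every element by its id and then treats any other attribute whose whole value looks like a hex id as an edge. The <dump> wrapper and the <symbol> element are invented so that the fragment is well-formed and one reference actually resolves; real g95-xml output may be structured differently.

```python
import re
import xml.etree.ElementTree as ET

# Well-formed subset of the dump above. The <dump> wrapper and the <symbol>
# element are hypothetical, added only so that one reference can resolve.
SAMPLE = """
<dump>
  <statement id="0xbdf8420" type="TYPE_DECLARATION" loc="[1,6,1,23]"
             decl_type="0x705820" decl_kind="0xbdf7fe0" decl_symbols="0xbdf8290"/>
  <expr id="0xbdf8100" type="VARIABLE" loc="[2,6,2,7]" symbol="0xbdf8290"/>
  <expr id="0xbdf8b00" type="CONSTANT" loc="[2,10,2,12]" value="1.E+0"/>
  <symbol id="0xbdf8290" name="x"/>
</dump>
"""

# Assumption: a reference is an attribute whose entire value is a hex id.
HEX_ID = re.compile(r"0x[0-9a-f]+")


def index_by_id(root):
    """Map each hexadecimal id to the element that carries it."""
    return {el.get("id"): el for el in root.iter() if el.get("id")}


def resolve_edges(root):
    """Return (source id, attribute, target element or None) per reference."""
    nodes = index_by_id(root)
    edges = []
    for el in root.iter():
        for attr, value in el.attrib.items():
            if attr != "id" and HEX_ID.fullmatch(value):
                # nodes.get returns None when the target is not in the dump.
                edges.append((el.get("id"), attr, nodes.get(value)))
    return edges


root = ET.fromstring(SAMPLE)
edges = resolve_edges(root)
for source, attr, target in edges:
    label = target.tag if target is not None else "dangling"
    print(f"{source} --{attr}--> {label}")
```

Any edge that comes back dangling points at something the dump never defines, which is the kind of missing information that made the format unusable for my transformation.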
So my advice is definitely to compile your Fortran to LLVM IR, because that is
a far saner and more malleable format.