Advice on architecture research project?

I am interested in working on a little architecture project that
involves modifying an ISA in some non-trivial ways and seeing what
impact it has on instruction frequencies (and other such metrics).
Clearly I'll need to hack on a compiler backend, and I thought that
LLVM might be a good choice since among mature compiler
infrastructures it's fairly young and presumably relatively clean.

You are at the right place. It's very clean compared to GCC and
Open64, for sure. I am quite new to LLVM but can already see the
clarity that comes from its modular design.

I will also need to choose an ISA and a functional simulator (which I
will also need to hack) for the evaluation. I'm not particularly
interested in micro-architecture level accuracy, so I'd rather avoid
that complexity if possible. I think I'd rather start with an ISA
more in the RISC family.
Does anyone have a suggestion about ISAs for which there is a good
LLVM backend and an open source/customizable functional simulator?

I agree that ARM is a good choice. MIPS is less well supported in LLVM,
I think.
But don't count out venerable X86. There are simulators available for
it, both open- and closed-source.

I have a similar problem. I need a performance-based study, so the only options
seem to be a "cycle-accurate" simulator or real hardware. Because cycle-accurate
simulators are so poor in terms of accuracy, I am not considering them currently.
Among real hardware, X86 is a favorable choice because it is widely available
to me for experimenting with different configurations. However, its CISCity
scares me too.
Is it possible to choose a small subset of the X86 ISA, without the scary
addressing modes, and emit only those instructions from LLVM? Will generally
available hardware profiling tools work fine in such a case? Is there any
information available on whether such a subset has already been defined and
used by someone?

Note that things like instruction frequencies are highly ISA-dependent.
If possible, it is best to evaluate your ideas on more than one target,
just to see what the effects are. What other sorts of things do you
want to study?
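
As a rough illustration of the kind of measurement being discussed, static instruction frequencies can be tallied directly from the assembly a compiler emits (for example from `clang -S`). The Python sketch below assumes GNU-style assembly with `#` comments, as in x86 AT&T output. The function name `mnemonic_counts`, the `comment` parameter, and the sample text are made up for illustration. Other targets use other comment characters (ARM uses `@`), so that has to be passed in when comparing targets.

```python
import re
from collections import Counter

# One or more leading "label:" prefixes (hypothetical, simplified pattern).
LABEL = re.compile(r"^(?:[A-Za-z_.$][\w.$]*:\s*)+")

def mnemonic_counts(asm_text, comment="#"):
    """Count instruction mnemonics in GNU-style assembly text.

    Assumption: `comment` starts a line comment and never appears as an
    operand prefix on this target. Instruction prefixes such as `lock`
    are counted as if they were the mnemonic.
    """
    counts = Counter()
    for raw in asm_text.splitlines():
        line = raw.split(comment, 1)[0].strip()  # drop trailing comment
        line = LABEL.sub("", line)               # drop leading labels
        if not line or line.startswith("."):     # blank line or directive
            continue
        counts[line.split()[0].lower()] += 1
    return counts

# Made-up sample, not real compiler output.
sample = """
        .text
main:                       # entry point
        pushq   %rbp
        movq    %rsp, %rbp
        movl    $0, %eax
.LBB0_1:
        addl    $1, %eax
        cmpl    $10, %eax
        jne     .LBB0_1
        popq    %rbp
        retq
"""

for mnemonic, n in mnemonic_counts(sample).most_common():
    print(f"{mnemonic:8s} {n}")
```

Note that this counts static occurrences only. Dynamic frequencies would need an execution trace from a simulator or an instrumentation tool, and AT&T size suffixes make `movl` and `movq` count as separate mnemonics unless they are normalized first.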
If, long-term, you are planning to do serious studies of performance
impacts, I very highly recommend you not rely on simulators if at all
possible. I have never met a simulator ("cycle-accurate" or not) that
even comes close to giving reasonable performance predictions.
I admit this is rather difficult to do if you are exposing some new
hardware magic to the ISA. In cases like this I have long believed that
availability of compiler and simulator source code should be a bare
minimum prerequisite for publication. Unfortunately, I seem to be in a
rather small minority.
                            -Dave
My worries about the same come from studies like [1].

[1] http://www.csl.cornell.edu/~vince/papers/wddd08/wddd08_sim.pdf

Enjoy,
Ankur

ankur deshwal <a.s.deshwal@gmail.com> writes:

I have a similar problem. I need a performance-based study, so the only
options seem to be a "cycle-accurate" simulator or real hardware.
Because cycle-accurate simulators are so poor in terms of accuracy, I am
not considering them currently.

Hooray! :)

Among real hardware, X86 is a favorable choice because it is widely
available to me for experimenting with different configurations.
However, its CISCity scares me too.

CISC is your friend, do not fear the CISC. :)

All jesting aside, x86 is a fairly regular/orthogonal ISA, even to a
fault, providing extra instructions that aren't really necessary simply
to maintain orthogonality. There are some nasty corner cases, of
course, but x86 gets a bad rap far beyond what is justified, IMHO.

Is it possible to choose a small subset of the X86 ISA, without the scary
addressing modes, and emit only those instructions from LLVM? Will
generally available hardware profiling tools work fine in such a
case? Is there any information available on whether such a subset has
already been defined and used by someone?

I think this is not a good way to go. Those addressing modes are a
major win for x86. If you don't use them, performance will suffer
badly. It is certainly possible to restrict what's used in LLVM but
since the support is already there, why not use them?

I believe several people have done studies that show only a small
percentage of the entire x86 ISA is actually used in real code. But
again, I would not worry about this kind of thing because LLVM already
supports all of it.
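
The "small percentage of the ISA is actually used" observation can be checked against one's own code with a coverage calculation over mnemonic counts. The sketch below is only an illustration: the function name `mnemonics_for_coverage` is hypothetical, and the counts in the example are invented numbers, not measurements.

```python
from collections import Counter

def mnemonics_for_coverage(counts, fraction=0.9):
    """Return the most frequent mnemonics that together account for at
    least `fraction` of all counted instructions."""
    total = sum(counts.values())
    chosen, covered = [], 0
    for mnemonic, n in Counter(counts).most_common():
        if total == 0 or covered >= fraction * total:
            break
        chosen.append(mnemonic)
        covered += n
    return chosen

# Invented counts, for illustration only.
example = {"mov": 50, "add": 30, "jmp": 15, "xlat": 5}
print(mnemonics_for_coverage(example, 0.9))
```

Comparing the length of the returned list with the number of distinct mnemonics the target defines gives a rough sense of how concentrated instruction usage is for a given workload.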

If you can describe what is scary and why (is it something to do with
ISA enhancements you're considering?), we might be able to help you out
a bit.

> If, long-term, you are planning to do serious studies of performance
> impacts, I very highly recommend you not rely on simulators if at
> all possible. I have never met a simulator ("cycle-accurate" or
> not) that even comes close to giving reasonable performance
> predictions.

My worries about the same come from studies like [1].

[1] http://www.csl.cornell.edu/~vince/papers/wddd08/wddd08_sim.pdf

Thanks! That's a must-read for any architecture or compiler researcher.
Architectures are so complex today that it is nearly impossible to model
all of the effects correctly. And on an architecture where a 5%
performance shift is very significant, that lack of accuracy is deadly.

Even validating a simulator is nearly impossible because the
~10-instruction snippets one uses to test the model against hardware are
going to behave completely differently in the context of the millions of
instructions around them in the real code.

I wish we had a good solution to this problem. :(

                                 -Dave