[NOTE: This is MY opinion and not reflective of what people at Cray may
or may not think, nor does it in any way imply a company-wide position.]
This is actually part of a larger problem within the computing research
community. Unlike the hard sciences, we don't require reproducibility
of results.
I would guess that this problem occurs in other areas of research as well
(e.g., just look at the sample sizes of some studies). And even if you can
reproduce something, it might still not be a meaningful model of the real
system.
This leads to endless hours of researchers fruitlessly trying
to improve upon results that they can't even verify in the first place.
The result is haphazard guesses and lower-quality publications that
are of little use in the real world.
You can reproduce the implementation too, based on the description in the
paper. This kind of N-version programming can be useful, but I agree that the
subsequent comparison would require less effort if the original implementation
were available.
Now, the ideas are the most valuable part of a publication, but it is
important to be able to validate the idea.
I agree. However, it's hard to enforce. If one were to require an open-source
implementation plus the experimental setup for every paper, there would be
fewer or no papers from industry research. But their input is valuable. And I
guess you can't require it just for academia (e.g., some universities want to
sell their research results).
I disagree with the current
practice of not accepting papers that don't show a 10% improvement in
whatever area is under consideration. I believe we should also publish
papers that show negative results. This would save researchers an enormous
amount of time. Moreover, ideas that were wrong 20 years ago may very
well be right today.
I agree. Some communities/conferences also accept negative results, but I
think the quality requirements for these papers are often much stricter than
for the 10% papers. One issue with negative results, though, is how to
determine which negative results are actually useful to report.
The combination of the current "10% or better" practice with no requirement
for reproducibility means there's very little incentive to release tools
and code used for the experiments. In fact, there is a disincentive: we
wouldn't want some precocious student to demonstrate that the experiment was
flawed. This is another problem, in that researchers view challenges as
personal threats rather than a chance to advance the state of the art, and
students are encouraged to combatively challenge published research rather
than work with the original publishers to improve it.
I agree. What I wanted to point out in my previous mail is that even though
this change would help, it's difficult to achieve. I mean, you have this
problem _everywhere_. Even if the end result would be better, you need a
solution that can compete within the current system.
Torvald