Promoting open and robust science
TLDR: The Unjournal promotes research replicability and robustness
Unjournal evaluations aim to support the "Reproducibility/Robustness-Checking" (RRC) agenda. We are directly engaging with the Institute for Replication (I4R) and the repliCATS project (RC), and building connections to Replication Lab/TRELiSS and Metaculus.
We will support this agenda by:
Promoting data and code sharing: We ask preprint authors to share their code and data, and we reward them for their transparency.
Promoting 'Dynamic Documents' and 'Living Research Projects': Breaking out of "PDF prisons" to increase transparency, for example through formats such as Quarto, R Markdown, or Jupyter notebooks that regenerate results directly from code and data.
Encouraging detailed evaluations: Unjournal evaluators are asked to:
highlight the key/most relevant research claims, results, and tests;
propose possible robustness checks and tests (RRC work); and
make predictions for these tests.
Implementing computational replication and robustness checking: We aim to work with I4R and other organizations to facilitate and evaluate computational replication and robustness checking (a minimal sketch of such a check follows this list).
Advocating for open evaluation: We prioritize making the evaluation process transparent and accessible to all.
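To make the robustness-checking points above concrete, here is a minimal, hypothetical sketch of the kind of computational check an evaluator might propose: re-estimating a headline regression under alternative control sets and comparing the key coefficient. The data, variable names, and specifications are illustrative assumptions, not drawn from any actual Unjournal evaluation.

```python
# Hypothetical robustness-check sketch: re-estimate a "headline" regression
# under alternative specifications and compare the treatment coefficient.
# All data and variable names are made up for illustration.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "treatment": rng.integers(0, 2, n),          # binary treatment indicator
    "age": rng.normal(40, 10, n),                # illustrative control
    "income": rng.normal(50_000, 12_000, n),     # illustrative control
})
df["outcome"] = 0.5 * df["treatment"] + 0.01 * df["age"] + rng.normal(0, 1, n)

# Alternative specifications an evaluator might flag as relevant checks.
specs = {
    "baseline": "outcome ~ treatment",
    "with_age": "outcome ~ treatment + age",
    "full_controls": "outcome ~ treatment + age + income",
}

for name, formula in specs.items():
    # Heteroskedasticity-robust (HC1) standard errors for each specification.
    fit = smf.ols(formula, data=df).fit(cov_type="HC1")
    print(f"{name:14s} treatment coef = {fit.params['treatment']:+.3f} "
          f"(SE {fit.bse['treatment']:.3f})")
```

If the estimated treatment effect is stable across such specifications, that is (modest) evidence of robustness; if it moves sharply, the evaluator has identified a concrete target for further RRC work.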
Research credibility
While the replication crisis in psychology is well known, economics is not immune. Some very prominent and influential work contains blatant errors, depends on dubious econometric choices or faulty data, is not robust to simple checks, or relies on likely-fraudulent data. Roughly 40% of experimental economics results fail to replicate. Prominent commentators have argued that the traditional journal peer-review system does a poor job of spotting major errors and identifying robust work.
Supporting the RRC agenda through Unjournal evaluations
My involvement with the SCORE replication market project shed light on a key challenge (see Twitter posts): the effectiveness of replication depends on which claims are chosen for reproduction and how they are approached. It was common for the chosen claim to miss the essence of the paper, or to focus on a statistical result that, while likely to reproduce, didn't truly convey the authors' central message.
At the same time, I noticed that many papers had methodological flaws, such as a lack of causal identification or the presence of important confounders in experiments. Yet these studies, if repeated, would likely yield similar results: they would "replicate" even though the underlying inferences were weak. These insights emerged from only a quick review of hundreds of papers and claims, which suggests that a more thorough reading and analysis could identify the most impactful claims and elucidate the necessary RRC work.
Indeed, detailed, high-quality referee reports for economics journals frequently contain such suggestions. However, these valuable insights are often overlooked and rarely shared publicly. Unjournal aims to change this paradigm by focusing on three main strategies:
Identifying vital claims for replication:
We plan to have Unjournal evaluators highlight key "claims to replicate" and propose replication goals and methodologies. We will also flag papers in specific areas that particularly need replication.
Public evaluation and author responses will provide additional insight, giving future replicators more than just the original published paper to work with.
Encouraging author-assisted replication:
The Unjournal's platform and metrics promote dynamic documents and transparency, simplifying reproduction and replication.
By emphasizing replicability and transparency at the working-paper stage (the current focus of Unjournal evaluations), we make authors more amenable to facilitating replication work at later stages, such as after traditional publication.
Predicting replicability and recognizing success:
We aim to ask Unjournal evaluators to predict replicability, and we can offer recognition when evaluated work is later successfully replicated. The same holds for repliCATS aggregated/IDEA group evaluations: to know whether we are credibly assessing replicability, we need to compare these assessments to at least some actual replication outcomes. Such comparisons would let us gauge the credibility of our replicability evaluations, and the prospect of influencing replication efforts may itself motivate people to become Unjournal evaluators.
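As an illustration of how such a comparison could work, below is a minimal sketch of scoring predicted replication probabilities against binary replication outcomes using the Brier score. This is not a description of any existing Unjournal or repliCATS system; the numbers are purely hypothetical.

```python
# Illustrative sketch: score evaluators' predicted replication probabilities
# against binary replication outcomes with the Brier score
# (lower is better; 0.25 corresponds to always guessing 0.5).
import numpy as np

# Hypothetical data: predicted probability of replication for five papers,
# and the observed outcome (1 = replicated, 0 = did not replicate).
predictions = np.array([0.80, 0.35, 0.60, 0.90, 0.20])
outcomes = np.array([1, 0, 1, 1, 1])

brier = np.mean((predictions - outcomes) ** 2)
print(f"Brier score: {brier:.3f}")

# Simple baseline for comparison: an uninformative forecaster who always says 0.5.
baseline = np.mean((0.5 - outcomes) ** 2)
print(f"Uninformative baseline: {baseline:.3f}")
```

Evaluators whose scores consistently beat the baseline would be demonstrating genuine skill at anticipating replication outcomes, which is one way recognition could be grounded.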
By concentrating on NBER papers, we increase the likelihood of overlap with the journals targeted by the Institute for Replication, enhancing the usefulness of our evaluations for replication efforts.