Promoting open and robust science

TLDR: Unjournal promotes research replicability/robustness

Unjournal evaluations aim to support the "Reproducibility/Robustness-Checking" (RRC) agenda. We are directly engaging with the Institute for Replication (I4R) and the repliCATS project (RC), and building connections to Replication Lab/TRELiSS and Metaculus.

We will support this agenda by:

  1. Promoting data and code sharing: We ask preprint authors to share their code and data, and we reward them for their transparency.

  2. Promoting 'Dynamic Documents' and 'Living Research Projects': Breaking out of "PDF prisons" to achieve increased transparency.

  3. Encouraging detailed evaluations: Unjournal evaluators are asked to:

    • highlight the key/most relevant research claims, results, and tests;

    • propose possible robustness checks and tests (RRC work); and

    • make predictions for these tests.

  4. Implementing computational replication and robustness checking: We aim to work with I4R and other organizations to facilitate and evaluate computational replication and robustness checking.

  5. Advocating for open evaluation: We prioritize making the evaluation process transparent and accessible for all.

Research credibility

While the replication crisis in psychology is well known, economics is not immune. Some very prominent and influential work has blatant errors, depends on dubious econometric choices or faulty data, is not robust to simple checks, or uses likely-fraudulent data. Roughly 40% of experimental economics work fails to replicate. Prominent commenters have argued that the traditional journal peer-review system does a poor job of spotting major errors and identifying robust work.

Supporting the RRC agenda through Unjournal evaluations

My involvement with the SCORE replication market project shed light on a key challenge (see my Twitter posts): The effectiveness of replication depends on the claims chosen for reproduction and how they are approached. I observed that it was common for the chosen claim to miss the essence of the paper, or to focus on a statistical result that, while likely to reproduce, didn't truly convey the author's message.

Simultaneously, I noticed that many papers had methodological flaws (for instance, a lack of causal identification or the presence of important confounding factors in experiments), yet I suspected that these studies, if repeated, would likely yield similar results. These insights emerged from only a quick review of hundreds of papers and claims, which suggests that a more thorough reading and analysis could identify the most impactful claims and elucidate the necessary RRC work.

Indeed, detailed, high-quality referee reports for economics journals frequently contain such suggestions. However, these valuable insights are often overlooked and rarely shared publicly. Unjournal aims to change this paradigm by focusing on three main strategies:

  1. Identifying vital claims for replication:

    • We plan to have Unjournal evaluators help highlight key "claims to replicate," along with proposing replication goals and methodologies. We will flag papers that particularly need replication in specific areas.

    • Public evaluation and author responses will provide additional insight, giving future replicators more than just the original published paper to work with.

By concentrating on NBER papers, we increase the likelihood of overlap with journals targeted by the Institute for Replication, thus enhancing the utility of our evaluations in aiding replication efforts.

  2. Encouraging author-assisted replication:

    • The Unjournal's platform and metrics, promoting dynamic documents and transparency, simplify the process of reproduction and replication.

    • By emphasizing replicability and transparency at the working-paper stage (Unjournal evaluations’ current focus), we make authors more amenable to facilitating replication work in later stages, such as after traditional publication.

  3. Predicting replicability and recognizing success:

    • We aim to ask Unjournal evaluators to make predictions about replicability, and to offer recognition when work successfully replicates. The same holds for repliCATS aggregated/IDEA group evaluations: to know whether we are credibly assessing replicability, we need to compare these assessments to at least some "replication outcomes."

    • The potential to compare these predictions to actual replication outcomes allows us to assess the credibility of our replicability evaluations (see the sketch after this list). It may also motivate individuals to become Unjournal evaluators, attracted by the possibility of influencing replication efforts.
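
To make this concrete, here is a minimal, hypothetical sketch of how evaluators' replicability predictions could be scored against replication outcomes. The scoring rule (a simple Brier score), the function, and the numbers are illustrative assumptions, not a description of The Unjournal's actual tooling.

```python
# Hypothetical sketch: scoring evaluators' replicability predictions against
# observed replication outcomes. Predictions are probabilities that a claim
# will replicate; outcomes are recorded as 1 (replicated) or 0 (did not).

from statistics import mean

def brier_score(predictions: list[float], outcomes: list[int]) -> float:
    """Mean squared error between predicted probabilities and binary outcomes.
    Lower is better; always guessing 50% scores 0.25."""
    return mean((p - o) ** 2 for p, o in zip(predictions, outcomes))

# Illustrative (made-up) numbers: three claims, each with an evaluator's
# predicted probability of replication and the eventual replication result.
predicted = [0.8, 0.3, 0.6]
replicated = [1, 0, 0]

print(f"Brier score: {brier_score(predicted, replicated):.3f}")  # 0.163
# A score well below 0.25 would suggest the predictions carry real
# information about replicability.
```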

Other mutual benefits/synergies

We can rely on and build a shared talent pool: UJ evaluators may be well-suited—and keen—to become robustness-reproducers (of these or other papers) as well as repliCATS participants.

We see the potential for synergy and economies of scale and scope in other areas, e.g., through:

  • sharing of IT/UX tools for capturing evaluator/replicator outcomes, and statistical or information-theoretic tools for aggregating these outcomes (one possible approach is sketched after this list);

  • sharing of protocols for data, code, and instrument availability (e.g., Data and Code Availability Standard);

  • communicating the synthesis of "evaluation and replication reports"; or

  • encouraging institutions, journals, funders, and working paper series to encourage or require engagement.
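
As one illustration of what such an aggregation tool might do, here is a minimal, hypothetical sketch that combines several evaluators' estimates of a claim's replication probability by averaging in log-odds space. The function and the numbers are assumptions for illustration; this is not an existing Unjournal or repliCATS tool.

```python
# Hypothetical sketch: aggregating several evaluators' probability estimates
# that a claim will replicate, by averaging in log-odds space (equivalent to
# a geometric mean of odds) rather than averaging the raw probabilities.

import math

def aggregate_log_odds(probs: list[float]) -> float:
    """Convert probabilities to log-odds, average them, and map back to a probability."""
    log_odds = [math.log(p / (1 - p)) for p in probs]
    mean_log_odds = sum(log_odds) / len(log_odds)
    return 1 / (1 + math.exp(-mean_log_odds))

# Illustrative (made-up) estimates from three evaluators for one claim:
print(round(aggregate_log_odds([0.6, 0.75, 0.9]), 3))  # 0.774
```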

Broader synergies in the medium term

As a "journal-independent evaluation" gains career value, as replication becomes more normalized, and as we scale up:

  • This changes incentive systems for academics, making it easier to reward replication/replicability than under the traditional journals’ system of "accept/reject, then start again elsewhere."

  • More ambitiously, we may jointly interface with prediction markets. We may also jointly integrate into platforms like OSF as part of an ongoing process of preregistration, research, evaluation, replication, and synthesis.

    The Unjournal could also evaluate I4R replications, giving them status.

  • Public communication of Unjournal evaluations and responses may encourage demand for replication work.

  • In a general sense, we see cultural spillovers in the willingness to try new systems for reward and credibility, and for the gatekeepers to reward this behavior and not just the traditional "publication outcomes".

