What research to target?

(for pilot and beyond)

Our focus is quantitative work that informs global priorities (see linked discussion), especially in the social sciences. We want to see better research leading to better outcomes in the real world (see our 'Theory of Change').

See earlier discussion in our public call and the related EA Forum discussion HERE.

To reach these goals, we need to select "the right research" for evaluation. We want to choose papers and projects that are highly relevant, methodologically promising, and that will benefit substantially from our evaluation work. We need to optimize how we select research so that our efforts remain mission-focused and useful. We also want to make our process transparent and fair. To do this, we are building a coherent set of criteria and goals, and a specific approach to guide this process. We explore several dimensions of these criteria below.

Management access only: general discussion of prioritization is in the Gdoc HERE; private discussion of specific papers is in Airtable and linked documents (e.g., HERE). We incorporate some of this discussion below.

High-level considerations for prioritizing research

When deciding whether to commission an evaluation of a piece of research, we can start by looking at its general relevance as well as the value of evaluating and rating it.

Our prioritization of a paper for evaluation should not be seen as an assessment of its quality, nor of its 'vulnerability'.

1. Why is it relevant and worth engaging with?

We consider (and prioritize) the importance of the research to global priorities; its relevance to crucial decisions; the attention it is getting and the influence it is having; its direct relevance to the real world; and the potential value of the research for advancing other impactful work. We de-prioritize work that has already been credibly (and publicly) evaluated. We also consider the fit of the research with our scope (social science, etc.), and the likelihood that we can commission experts to meaningfully evaluate it. As noted below, some 'instrumental goals' (sustainability, building credibility, driving change, ...) also play a role in our choices.

Some features we value, and that might raise the probability that we consider a paper or project, include a commitment and contribution to open science, the authors' engagement with our process, and the logic, communication, and transparent reasoning of the work. However, if a prominent research paper is within our scope and seems to have strong potential for impact, we will prioritize it highly whether or not it has these qualities.

2. Why does it need (more) evaluation, and what are some key issues and claims to vet?

We ask the people who suggest particular research, and experts in the field:

  • What are (some of) the authors’ key/important claims that are worth evaluating?

  • What aspects of the evidence, argumentation, methods, and interpretation are you unsure about?

  • What particular data, code, proofs, and arguments would you like to see vetted? If it has already been peer-reviewed in some way, why do you think more review is needed?

Ultimate goals: what are we trying to optimize?

Put broadly, we need to consider how this research allows us to achieve our own goals, in line with our Global Priorities Theory of Change flowchart. The research we select and evaluate should meaningfully drive positive change. One way we might see this process: “better research & more informative evaluation” → “better decision-making” → “better outcomes” for humanity and for non-human animals (i.e., the survival and flourishing of life, human civilization, and values).

Prioritizing research to achieve these goals

As we weigh research to prioritize for evaluation, we need to balance directly having a positive impact against building our ability to have an impact in the future.

A. Direct impact (‘score goals now’)

Below, we adapt the "ITN" cause prioritization framework (popular in effective altruism circles) to assess the direct impact of our evaluations.

Importance

What is the direct impact potential of the research?

This is a massive question many have tried to address (see sketches and links below). We respond to uncertainty around this question in several ways, including:

  • Consulting a range of sources, not only EA-linked sources.

  • Scoping what other sorts of work are representative inputs to GP-relevant work.

    • Get a selection of seminal GP publications; look back to see what they are citing, and categorize the citations by journal/field/keywords/etc. (see the sketch below).
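As one illustration of this 'look back' exercise, the sketch below tallies the venues cited by a set of seed publications using the OpenAlex API. This is a minimal sketch, not our actual pipeline: the seed DOI list is a hypothetical placeholder, and categorization by field/keywords would be layered on top of the venue counts.

```python
# Minimal sketch: tally the venues cited by seed "seminal GP" publications,
# via the OpenAlex API. SEED_DOIS is a hypothetical placeholder list.
from collections import Counter

import requests

OPENALEX = "https://api.openalex.org/works"
SEED_DOIS = ["10.0000/example-doi"]  # placeholder; substitute real DOIs


def referenced_work_ids(doi: str) -> list[str]:
    """Fetch one work by DOI; return the OpenAlex IDs of the works it cites."""
    resp = requests.get(f"{OPENALEX}/doi:{doi}", timeout=30)
    resp.raise_for_status()
    return resp.json().get("referenced_works", [])


def venue_of(work_id: str) -> str:
    """Return the display name of the venue hosting a cited work."""
    resp = requests.get(f"{OPENALEX}/{work_id.rsplit('/', 1)[-1]}", timeout=30)
    resp.raise_for_status()
    source = (resp.json().get("primary_location") or {}).get("source") or {}
    return source.get("display_name") or "unknown venue"


venue_counts = Counter()
for doi in SEED_DOIS:
    for ref_id in referenced_work_ids(doi):
        venue_counts[venue_of(ref_id)] += 1

for venue, n in venue_counts.most_common(15):
    print(f"{n:4d}  {venue}")
```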

Neglectedness

Where is the current journal system failing GP-relevant work the most . . . in ways we can address?

Tractability

  1. “Evaluability” of research: Where does the UJ approach yield the most insight or value of information?

  2. Existing expertise: Where do we have field expertise on the UJ team? This will help us commission stronger evaluations.

  3. "Feedback loops": Could this research influence concrete intervention choices? Does it predict near-term outcomes? If so, observing these choices and outcomes and getting feedback on the research and our evaluation can yield strong benefits.

Consideration/discussion: How much should we include research with indirect impact potential (theoretical, methodological, etc.)?
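To make this framework concrete, here is a toy sketch (not our actual procedure) of how importance, neglectedness, and tractability ratings might be combined into a single priority score. All names, weights, and ratings below are hypothetical.

```python
# Toy ITN scoring: combine hypothetical 0-10 ratings on importance,
# neglectedness, and tractability into one weighted priority score.
from dataclasses import dataclass


@dataclass
class Candidate:
    title: str
    importance: float     # direct impact potential
    neglectedness: float  # how badly the current journal system serves it
    tractability: float   # evaluability, in-house expertise, feedback loops


WEIGHTS = {"importance": 0.5, "neglectedness": 0.25, "tractability": 0.25}


def priority(c: Candidate) -> float:
    return (WEIGHTS["importance"] * c.importance
            + WEIGHTS["neglectedness"] * c.neglectedness
            + WEIGHTS["tractability"] * c.tractability)


candidates = [  # hypothetical examples
    Candidate("Cash-transfer RCT follow-up", 8, 4, 9),
    Candidate("Game theory of AI governance", 9, 8, 5),
]
for c in sorted(candidates, key=priority, reverse=True):
    print(f"{priority(c):5.2f}  {c.title}")
```

In practice our weighting is qualitative and case-by-case; a fixed linear formula like this is only a device for making the trade-offs explicit.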

B. Sustainability: funding, support, participation

Moreover, we need to consider how the research evaluation might support the sustainability of The Unjournal and the broader project of open evaluation. We may need to strike a balance between work informing the priorities of various audiences, including:

  • Relevance to stakeholders and potential supporters

  • Clear connections to impact; measurability

  • Support from relevant academic communities

  • Support from the open science community

Consideration/discussion: What will drive further interest and funding?

C. Credibility, visibility, driving positive institutional change

Finally, we consider how our choices will increase the visibility and solidify the credibility of The Unjournal and open evaluations. We consider how our work may help drive positive institutional change. We aim to:

  • Interest and involve academics—and build the status of the project.

  • Commission evaluations that will be visibly useful and credible.

  • ‘Benchmark’ traditional publication outcomes; track our predictiveness and impact (see the sketch after this list).

  • Have strong leverage over research "outcomes and rewards."

  • Increase public visibility and raise public interest.

  • Bring in supporters and participants.

  • Achieve substantial output in a reasonable time frame and with reasonable expense.

  • Maintain goodwill and a justified reputation for being fair and impartial.
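As a sketch of what 'benchmarking' and tracking predictiveness might look like: compare our evaluators' ratings with the traditional publication outcomes of the same papers. The numbers below are invented for illustration; a real check would draw on our evaluation database.

```python
# Hypothetical check: do evaluator ratings track eventual journal outcomes?
# All numbers below are made up for illustration.
from scipy.stats import spearmanr

ratings = [82, 45, 67, 90, 58, 73]   # Unjournal overall ratings (0-100)
journal_tier = [5, 2, 3, 5, 2, 4]    # later journal placement (5 = top tier)

rho, p_value = spearmanr(ratings, journal_tier)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```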

But some of these aims may involve trade-offs

We are aware of possible pitfalls of some elements of our vision.

We are pursuing a second "high-impact policy and applied research" track for evaluation, considering work that is not targeted at academic audiences. This may have direct impact and please SFF funders, but, if not done carefully, it may distract us from changing academic systems and may cost us status in academia.

A focus on topics perceived as niche (e.g., the economics and game theory of AI governance and AI safety) may bring a similar tradeoff.

On the other hand, perhaps a focus on behavioral and experimental economics would generate lots of academic interest and participants; this could help us benchmark our evaluations, etc.; but this may also be less directly impactful.

Giving managers autonomy and pushing forward quickly may bring the risk of perceived favoritism; a rule-based systematic approach to choosing papers to evaluate might be slower and less interesting for managers. However, it might be seen as fairer (and it might enable better measurement of our impact).

We hope we have identified the important considerations above, but we may be missing key points. We continue to engage in discussion and seek feedback, to hone and improve our processes and approaches.

Data: what are we evaluating/considering?

We present and analyze the specifics surrounding our current evaluation data in this interactive notebook/dashboard.

Below: an earlier template for considering and discussing the relevance of research. This was (and is) provided both for our own consideration and for sharing (in part?) with evaluators, to give them context. Think of these as bespoke evaluation notes for a particular paper or project.

Proposed template

Title

  • One-click-link to paper

  • Link to any private hosted comments on the paper/project

Summary; why is this research relevant and worth engaging with?

As mentioned under 'High-level considerations', consider factors including importance to global priorities, relevance to the field, commitment and contribution to open science, the authors’ engagement, and the transparency of data and reasoning. You may consider the ITN framework explicitly, but not too rigidly.

Why does it need (more) review, and what are some key issues and claims to vet?

What are (some of) the authors’ main claims that are worth carefully evaluating? What aspects of the evidence, argumentation, methods, interpretation, etc., are you unsure about? What particular data, code, proofs, etc., would you like to see vetted? If it has already been peer-reviewed in some way, why do you think more review is needed?

What sort of reviewers should be sought, and what should they be asked?

What types of expertise and background would be most appropriate for the evaluation? Who would be interested? Please try to make specific suggestions.

How well has the author engaged with the process?

Do they need particular convincing? Do they need help making their engagement with The Unjournal successful?
