What research to target?
(for pilot and beyond)
Our initial focus is quantitative work that informs global priorities (see linked discussion), especially in economics, policy, and social science. We want to see better research leading to better concrete outcomes that enable or accelerate positive change.
To reach these goals, we need to select "the right research" for evaluation. We want to choose papers and projects that are highly relevant, methodologically promising, and that will benefit substantially from our evaluation work. We need to optimize how we select research so that our efforts remain mission-focused and useful. We also want to make our process transparent and fair. To do this, we are building a coherent set of criteria and goals, and a specific approach to guide this process. These considerations have several dimensions, which we explore below.
When considering a piece of research to decide whether to commission it to be evaluated, we can start by looking at its general relevance as well as the value of evaluating and rating it.
Our prioritization of a paper for evaluation should not be seen as an assessment of its quality, nor of its 'vulnerability'. Furthermore, 'the prioritization is not the evaluation'; it is less specific and less intensive.
1. Why is it relevant and worth engaging with?
We consider (and prioritize) the importance of the research to global priorities; its relevance to crucial decisions; the attention it is getting and the influence it is having; its direct relevance to the real world; and its potential value for advancing other impactful work. We de-prioritize work that has already been credibly (and publicly) evaluated. We also consider the research's fit with our scope (social science, etc.) and the likelihood that we can commission experts to meaningfully evaluate it. As noted below, some of our 'instrumental goals' (sustainability, building credibility, driving change, ...) will also play a role in our choices.
Some features we value, and that may raise the probability we consider a paper or project: commitment and contribution to open science; the authors' engagement with our process; and the logic, communication, and transparent reasoning of the work. However, if a prominent research paper is within our scope and seems to have strong potential for impact, we will prioritize it highly, whether or not it has these features.
2. Why does it need (more) evaluation, and what are some key issues and claims to vet?
We ask the people who suggest particular research, and experts in the field:
- What are (some of) the authors’ key/important claims that are worth evaluating?
- What aspects of the evidence, argumentation, methods, and interpretation are you unsure about?
- What particular data, code, proofs, and arguments would you like to see vetted? If it has already been peer-reviewed in some way, why do you think more review is needed?
Put broadly, we need to consider how this research allows us to achieve our own goals in line with our Global Priorities Theory of Change flowchart, targeting "ultimate outcomes." The research we select and evaluate should drive positive change in a meaningful way.
One way we might see this process: “better research & more informative evaluation” → “better decision-making” → “better outcomes” for humanity and for non-human animals (i.e., the survival and flourishing of life and human civilization and values).
We are still considering: Do we have other ultimate goals that are not in that chart, e.g., to improve research and knowledge-building for its own sake?
As we weigh candidate research to prioritize for evaluation, we need to balance directly having a positive impact against building our ability to have an impact in the future.
What is the direct impact potential of the research?
This is a massive question many have tried to address (see sketches and links). We respond to uncertainty around this question in several ways, including:
- Scoping what other sorts of work are representative inputs to GP-relevant work.
- Getting a selection of seminal GP publications, looking back at what they cite, and categorizing those citations by journal/field/keywords/etc. (see the sketch below).
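To make that backward-citation exercise concrete, here is a minimal sketch in Python. The paper names, journals, and field labels are hypothetical placeholders, as is the input format; in practice, the records could be pulled from a bibliographic source such as OpenAlex or Crossref.

```python
from collections import Counter

# Hypothetical citation records for a handful of seminal GP publications.
# Each entry lists works the publication cites, tagged by journal and field.
seminal_papers = {
    "Paper A": [
        {"journal": "QJE", "field": "development economics"},
        {"journal": "Science", "field": "epidemiology"},
    ],
    "Paper B": [
        {"journal": "AER", "field": "behavioral economics"},
        {"journal": "QJE", "field": "development economics"},
    ],
}

# Tally which journals and fields feed into GP-relevant work.
journal_counts = Counter()
field_counts = Counter()
for citations in seminal_papers.values():
    for ref in citations:
        journal_counts[ref["journal"]] += 1
        field_counts[ref["field"]] += 1

print(journal_counts.most_common())  # e.g., [('QJE', 2), ('Science', 1), ('AER', 1)]
print(field_counts.most_common())
```

The resulting tallies would give a rough map of where GP-relevant work draws its inputs, and hence where our evaluation effort might be best directed.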
Where is the current journal system failing GP-relevant work the most... in ways we can address?
1. “Evaluability” of research: Where does the UJ approach yield the most insight or value of information?
2. Existing expertise: Where do we have field expertise on the UJ team? This will help us commission stronger evaluations.
3. “Feedback loops”: Could this research influence concrete intervention choices? Does it predict near-term outcomes? If so, observing these choices and outcomes and getting feedback on the research and our evaluation can yield strong benefits.
Consideration/discussion: How much should we include research with indirect impact potential (theoretical, methodological, etc.)?
Moreover, we need to consider how the research evaluation might support the sustainability of The Unjournal and the broader project of open evaluation. We may want to strike a balance across the cause areas of greatest interest to various audiences, considering...
- Relevance to stakeholders and potential supporters
- Clear connections to impact; measurability
- Support from relevant academic communities
- Support from the open science community
Consideration/discussion: What will drive further interest and funding?
Finally, we look to ways in which particular approaches can increase visibility and solidify the credibility of The Unjournal and open evaluations. We consider the extent to which our choices may help drive positive institutional change.
- Interest and involve academics, and build the status of the project.
- Commission evaluations that will be visibly useful and credible.
- ‘Benchmark’ traditional publication outcomes, tracking our predictive accuracy and impact (see the sketch after this list).
- Have strong leverage over research "outcomes and rewards."
- Increase public visibility and raise public interest.
- Bring in supporters and participants.
- Achieve substantial output in a reasonable time frame and with reasonable expense.
- Maintain goodwill and the justified reputation for being fair and impartial.
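As a rough illustration of the benchmarking goal above, here is a minimal sketch, assuming we assign each evaluated paper an overall rating and later record a crude 'journal tier' for where it is eventually published. All numbers are hypothetical placeholders.

```python
from statistics import correlation  # Python 3.10+

# Hypothetical Unjournal overall ratings (0-100) for five evaluated papers.
uj_ratings = [82, 55, 91, 47, 68]

# A crude index of each paper's eventual traditional publication outcome
# (e.g., 5 = top-five journal, 1 = unpublished working paper).
journal_tier = [4, 2, 5, 2, 3]

# Pearson correlation as one simple measure of how well our ratings
# anticipate traditional publication outcomes.
print(f"Rating/outcome correlation: {correlation(uj_ratings, journal_tier):.2f}")
```

A high correlation would suggest our evaluations track the signals traditional journals reward; systematic divergence would itself be informative about where the two systems disagree.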
We are aware of possible pitfalls of some elements of our vision.
We are pursuing a second, "non-academic, high-impact policy work" track for evaluation. This may have direct impact and please SFF funders; but if not done carefully, it may distract us from changing academic systems and may cost us status in academia.
A focus on topics perceived as niche (e.g., the economics and game theory of AI governance and AI safety) may bring a similar tradeoff.
On the other hand, a focus on behavioral and experimental economics would likely generate substantial academic interest and participation, and could help us benchmark our evaluations; but it may also be less directly impactful.
Giving managers autonomy and pushing forward quickly may bring the risk of perceived favoritism; a rule-based systematic approach to choosing papers to evaluate might be slower and less interesting for managers. However, it might be seen as fairer (and it might enable better measurement of our impact).
Overall, however, we believe we are aware of the key issues around selecting research for evaluation in The Unjournal, and we continue to hone and improve our approval process.
Below, we present a (previous) template, both for our own use and for sharing (in part?) with evaluators, to give them some guidance. Think of this as bespoke evaluation notes for a "research overview, prioritization, and suggestions" document.
One-click link: link to any privately hosted comments on the paper/project.
As mentioned above under "High-level considerations," consider factors including importance to global priorities, relevance to the field, commitment and contribution to open science, the authors' engagement, and the transparency of data and reasoning. You may consider the ITN framework explicitly, but not too rigidly.
What are (some of) the authors’ main important claims that are worth carefully evaluating? What aspects of the evidence, argumentation, methods, interpretation, etc., are you unsure about? What particular data, code, proof, etc., would you like to see vetted? If it has already been peer-reviewed in some way, why do you think more review is needed?
What types of expertise and background would be most appropriate for the evaluation? Who would be interested? Please try to make specific suggestions.
Do they need particular convincing? Do they need help making their engagement with The Unjournal successful?