"Pivotal questions" initiative
The Unjournal commissions public evaluations of impactful research in quantitative social sciences fields. We are seeking ‘pivotal questions’ to guide our choice of research papers to commission for evaluation. We are reaching out to organizations that aim to use evidence to do the most good, and asking: Which open questions most affect your policies and funding recommendations? For which questions would research yield the highest ‘value of information’?
Our main approach has been to search for papers and then commission experts to publicly evaluate them. (For more about our process, see here.) Our field specialist teams search and monitor prominent research archives (like NBER) and consider agendas from impactful organizations, while keeping an eye on forums and social media. That is, we have largely looked for research that seems relevant to impactful questions and crucial considerations. We're now exploring turning this on its head: identifying pivotal questions first, then evaluating a cluster of research that informs them. This could offer a more efficient and observable path to impact. (See our ‘logic model’ flowchart for our theory of change for context.)
The Unjournal will ask impact-focused, research-driven organizations such as GiveWell, Open Philanthropy, and Charity Entrepreneurship to identify specific questions that impact their funding, policy, and research-direction choices. For example, if an organization is considering whether to fund a psychotherapeutic intervention in a low- or middle-income country (LMIC), they might ask “How much does a brief course of non-specialist psychotherapy increase happiness, compared to the same amount spent on direct cash transfers?” We’re looking for the questions with the highest value-of-information (VOI) for the organization’s work over the next few years. We have some requirements: the questions should relate to The Unjournal’s coverage areas and engage rigorous research in economics, social science, policy, or impact quantification. Ideally, organizations will identify at least one piece of publicly-available research that relates to their question. But we are doing this mainly to help these organizations, so we will try to keep it simple and low-effort for them.
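To make "value of information" concrete, here is a minimal sketch of one standard way to quantify it: the expected value of perfect information (EVPI), meaning how much better a funder could expect to do if the open question were answered before the decision. The two options, the two possible states of the world, the probabilities, and the payoffs below are all hypothetical numbers chosen for illustration. They are not drawn from any organization, and the source does not specify how VOI should be computed.

```python
def expected_value(payoffs, probs):
    """Probability-weighted average payoff of one action across states."""
    return sum(p * v for p, v in zip(probs, payoffs))


def evpi(payoff_table, probs):
    """Expected value of perfect information.

    payoff_table maps each action to a list of payoffs, one per state.
    probs holds the decision-maker's current belief in each state.
    """
    # Best achievable expected payoff when acting on current beliefs only.
    best_without_info = max(
        expected_value(payoffs, probs) for payoffs in payoff_table.values()
    )
    # If the true state were revealed first, the best action could be
    # chosen separately in each state.
    best_with_info = sum(
        probs[s] * max(payoffs[s] for payoffs in payoff_table.values())
        for s in range(len(probs))
    )
    return best_with_info - best_without_info


# Hypothetical beliefs: the psychotherapy effect is "large" or "small".
beliefs = [0.4, 0.6]
# Hypothetical payoffs (arbitrary well-being units per dollar) in each state.
options = {
    "fund_psychotherapy": [10.0, 2.0],
    "fund_cash_transfers": [5.0, 5.0],
}

voi = evpi(options, beliefs)
print(f"EVPI: {voi:.2f} units per dollar")
```

Under these made-up numbers the EVPI is about 1.8 units per dollar. A question whose answer could not change the preferred option would have an EVPI of zero, which is one way to think about why some questions are more "pivotal" than others.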
The Unjournal team will then discuss the suggested questions, leveraging our field specialists’ expertise. We’ll rank these questions, prioritizing at least one for each organization. We’ll work with the organization to specify the priority question precisely and in a useful way. We want to be sure that 1. evaluators will interpret these questions as intended, and 2. the answers that come out are likely to be actually helpful. We’ll make these lists of questions public and solicit general feedback — on the relevance of the questions, on their framing, on key sub-questions, and on pointers to relevant research.
Where practicable, we will operationalize the target questions as a claim on a prediction market (for example, Metaculus) to be resolved by the evaluations and synthesis below.
Where feasible, post these on public prediction markets (such as Metaculus)
If the question is well operationalized, and we have a clear approach to 'resolving it' after the evaluations and synthesis, we will post it on a reputation-based market like Metaculus. Metaculus is offering 'minitaculus' platforms such as this one on Sudan to enable these more flexible questions.
We will ask (and help) the organizations and interested parties to specify their own beliefs about these questions, aka their 'priors'. We may adapt the Metaculus interface for this.
Once we’ve converged on the target question, we’ll do a variation of our usual evaluation process.
For each question we will prioritize roughly two to five relevant research papers. These papers may be suggested by the organization that suggested the question, sourced by The Unjournal, or discovered through community feedback.
As we normally do, we’ll have ‘evaluation managers’ recruit expert evaluators. However, we’ll ask the evaluators to focus on the target question, and to consider the target organization’s priorities.
We’ll also enable evaluators to discuss and deliberate with one another. This is inspired by the repliCATS project, and by some evidence suggesting that the (mechanistically aggregated) estimates of experts after deliberation perform better than their independent estimates (also mechanistically aggregated). We may also facilitate collaborative evaluations and ‘live reviews’, following the examples of ASAPBio, PREreview, and others.
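"Mechanistic aggregation" here means combining individual estimates with a fixed rule rather than through negotiation. The sketch below shows two common rules for probability estimates: the simple mean and the geometric mean of odds. It is only an illustration. The source does not say which rule The Unjournal or repliCATS uses, and the evaluator estimates are hypothetical.

```python
import math
from statistics import mean


def mean_probability(probs):
    """Aggregate by taking the arithmetic mean of the probabilities."""
    return mean(probs)


def geometric_mean_of_odds(probs):
    """Aggregate by averaging log-odds, then converting back to a probability.

    Assumes every probability is strictly between 0 and 1.
    """
    log_odds = [math.log(p / (1 - p)) for p in probs]
    avg_log_odds = mean(log_odds)
    return 1 / (1 + math.exp(-avg_log_odds))


# Hypothetical estimates that a target claim is true, from three evaluators,
# before and after they discuss the research with one another.
independent = [0.30, 0.55, 0.80]
after_deliberation = [0.45, 0.55, 0.65]

for label, estimates in [
    ("independent", independent),
    ("after deliberation", after_deliberation),
]:
    print(
        f"{label}: mean={mean_probability(estimates):.3f}, "
        f"geo-mean-odds={geometric_mean_of_odds(estimates):.3f}"
    )
```

Applying the same rule to both rounds keeps the comparison fair: any difference in accuracy can then be attributed to the deliberation rather than to the aggregation method.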
We will contact both the research authors (as per our standard process) and the target organizations for their responses to the evaluations, and for follow-up questions. We’ll foster a productive discussion between them (while preserving anonymity as requested, and being careful not to overtax people’s time and generosity).
We will ask evaluation managers to write a report summarizing the research investigated.
These reports should synthesize “What do the research, evaluations, and responses say about the question/claim?” They should provide an overall metric relating to the truth value of the target question (or similar for the parameter of interest). If and when we integrate prediction markets, they should decisively resolve the market claim.
Next, we will share these synthesis reports with authors and organizations for feedback.
We’ll put up each evaluation on our Unjournal.pubpub.org page, bringing them into academic search tools, databases, bibliometrics, etc. We’ll also curate them, linking them to the relevant target question and to the synthesis report.
We will produce, share, and promote further summaries of these packages. This could include forum and blog posts summarizing the results and insights, as well as interactive and visually appealing web pages. We might also produce less technical content, perhaps submitting work to outlets like Asterisk, Vox, or worksinprogress.co.
At least initially, we’re planning to ask for questions that could be definitively answered and/or measured quantitatively, and we will help organizations and other suggesters refine their questions to make this the case. These should approximately resemble questions that could be posted on forecasting platforms such as Manifold Markets or Metaculus. These should also somewhat resemble the 'claim identification' we currently request from evaluators.
We give detailed guidance with examples below:
Why do we want these pivotal questions to be 'operationalizable'?
We’re still refining this idea, and looking for your suggestions about what is unclear, what could go wrong, what might make this work better, what has been tried before, and where the biggest wins are likely to be. We’d appreciate your feedback! (Feel free to email contact@unjournal.org to make suggestions or arrange a discussion.)
If you work for an impact-focused research organization and you are interested in participating in our pilot, please reach out to us at contact@unjournal.org to flag your interest and/or complete this form. We would like to see:
A brief description of what your organization does (your ‘about us’ page is fine)
A specific, operationalized, high-value claim or research question you would like to be evaluated, that is within our scope (~quantitative social science, economics, policy, and impact measurement)
A brief explanation of why this question is particularly high value for your organization or your work, and how you have tried to answer it
If possible, a link to at least one research paper that relates to this question
Optionally, your current beliefs about this question (your ‘priors’)
Please also let us know how you would like to engage with us on refining this question and addressing it. Do you want to follow up with a 1-1 meeting? How much time are you willing to put in? Who, if anyone, should we reach out to at your organization?
Remember that we plan to make all of this analysis and evaluation public.
If you don’t represent an organization, we still welcome your suggestions, and will try to give feedback.
Please remember that we currently focus on quantitative ~social sciences fields, including economics, policy, and impact modeling (see here for more detail on our coverage). Questions surrounding (for example) technical AI safety, microbiology, or measuring animal sentience are less likely to be in our domain.
If you want to talk about this first, or if you have any questions, please send an email or schedule a meeting with David Reinstein, our co-founder and director.