The Unjournal is making research better by evaluating what really matters. We aim to make rigorous research more impactful and impactful research more rigorous.
Today's research evaluation process is out of date: it discourages innovation and encourages rent-seeking. We provide open, rigorous evaluation, focused on what's practically important to researchers, policy-makers, and the world. We make it easier for researchers to get feedback and credible ratings of their work.
We currently focus on quantitative work that informs global priorities, especially in economics, policy, and social science. Click on the cards below to find out more about our mission, organizational structure, and ways to collaborate, or see our FAQ for answers to your questions.
You can also press ⌘k to search or query our site.
About The Unjournal
The Unjournal in a nutshell (and more)
Get involved
Apply to join our team, join our evaluator pool, or submit your work
Our team
Management, advisors, research field specialists, contractors
Our plan of action
The Unjournal pilot and beyond
Latest updates
Our progress: sign up for a (fortnightly) email digest
UJ team resources
Onboarding and key team resources (mainly internal)
Unjournal evaluations+
Evaluations, ratings, summaries, responses (PubPub)
Map of our workflow
Flowchart and description of our intake and evaluation processes
Guidelines for evaluators
What we ask (and pay) evaluators to do
Nov. 2023: We are currently prioritizing bringing in more field specialists to build our teams in a few areas, particularly in:
Catastrophic risks, AI governance and safety
Animal welfare: markets, attitudes
As well as:
Quantitative political science (voting, lobbying, attitudes)
Social impact of AI/emerging technologies
Macro/growth, finance, public finance
Long-term trends and demographics
In addition to the "work roles," we are looking to engage researchers, research users, meta-scientists, and people with experience in open science, open access, and management of initiatives similar to The Unjournal.
We are continually looking to enrich our general team and board, including our management committee, advisory board, and field specialists. These roles come with some compensation and incentives.
(Please see links and consider submitting an expression of interest).
We are not a journal!
The Unjournal seeks to make rigorous research more impactful and impactful research more rigorous. We are a team of researchers, practitioners, and open science advocates led by David Reinstein.
The Unjournal encourages better research by making it easier for researchers to get feedback and credible ratings. We coordinate and fund public journal-independent evaluation of hosted research. We publish evaluations, ratings, manager summaries, author responses, and links to evaluated research on our PubPub page.
As the name suggests, we are not a journal!
We are working independently of traditional academic journals to build an open platform and a sustainable system for feedback, ratings, and assessment. We are currently focusing on quantitative work that informs global priorities, especially in economics, policy, and social science.
How to get involved?
We are looking for research papers to evaluate, as well as evaluators. If you want to suggest research, your own or someone else's, you can let us know using this form. If you want to be an evaluator, apply here. You can also express your interest in being a member of the management team, advisory board, or reviewer pool. For more information, check our guide on how to get involved.
Why The Unjournal?
Peer review is great, but conventional academic publication processes are wasteful, slow, and rent-extracting. They discourage innovation and prompt researchers to focus more on "gaming the system" than on the quality of their research. We will provide an immediate alternative and, at the same time, offer a bridge to a more efficient, informative, useful, and transparent research evaluation system.
Does The Unjournal charge any fees?
No. We are a nonprofit organization (hosted by OCF) and we do not charge any fees for submitting and evaluating your research. We compensate evaluators for their time and even award prizes for strong research work, in contrast to most traditional journals. We do so thanks to funding from the Long-Term Future Fund and Survival and Flourishing Fund.
At some point in the future, we might consider sliding-scale fees for people or organizations submitting their work for Unjournal evaluation, or for other services. If we do this, it would simply be a way to cover the compensation we pay evaluators and to cover our actual costs. Again, we are a nonprofit and we will stay that way.
Research submission/identification and selection: We identify, solicit, and select relevant research work to be hosted on any open platform in any format. Authors are encouraged to present their work in the ways they find most comprehensive and understandable. We support the use of dynamic documents and other formats that foster replicability and open science. (See: the benefits of dynamic docs).
Paid evaluators (AKA "reviewers"): We compensate evaluators (essentially, reviewers) for providing thorough feedback on this work. (Read more: Why do we pay?)
Eliciting quantifiable and comparable metrics: We aim to establish and generate credible measures of research quality and usefulness. We intend to benchmark these against traditional measures (such as journal tiers) and assess the reliability, consistency, and predictive power of these measures. (Read more: Why quantitative metrics?)
Public evaluation: Reviews are typically public, including potential author responses. This facilitates dialogue.
Linking, not publishing: Our process is not "exclusive." Authors can submit their work to a journal (or other evaluation service) at any time. This approach also allows us to benchmark our evaluations against traditional publication outcomes.
Financial prizes: We award financial prizes, paired with public presentations, to works judged to be the strongest.
Transparency: We aim for maximum transparency in our processes and judgments.
Academics and funders have complained about this stuff for years and continue to do so every day on social media... and we suspect our critiques of the traditional review and publication process will resonate with readers.
So why haven't academia and the research community been able to move to something new? There is a difficult collective action problem. Individual researchers and universities find it risky to move unilaterally. But we believe we have a good chance of finally changing this model and moving to a better equilibrium because we will:
Take risks: Many members of The Unjournal management are not traditional academics; we can stick our necks out. We are also bringing on board established senior academics who are less professionally vulnerable.
Bring in new interests, external funding, and incentives: There are a range of well-funded and powerful organizations—such as the Sloan Foundation and Open Philanthropy—with a strong inherent interest in high-impact research being reliable, robust, and reasoning-transparent. This support can fundamentally shift existing incentive structures.
Allow less risky "bridging steps": As noted above, The Unjournal allows researchers to submit their work to traditional journals. In fact, this will provide a benchmark to help build our quantitative ratings and demonstrate their value.
Communicate with researchers and stakeholders to make our processes easy, clear, and useful to them.
Make our output useful: It may take years for university departments and grant funders to incorporate journal-independent evaluations as part of their metrics and reward systems. The Unjournal can be somewhat patient: our evaluation, rating, feedback, and communication will provide a valuable service to authors, policymakers, and other researchers in the meantime.
Leverage new technology: A new set of open-access tools (such as those funded by Sloan Scholarly Communications) makes what we are trying to do easier and more useful every day.
Reward early adopters with prizes and recognition: We can replace "fear of standing out" with "fear of missing out." In particular, authors and research institutions that commit to publicly engaging with evaluations and critiques of their work should be commended and rewarded. And we intend to do this.
This GitBook serves as a platform to organize our ideas and resources and track our progress towards The Unjournal's dual objectives:
Making "peer evaluation and rating" of open projects into a standard high-status outcome in academia and research, specifically within economics and social sciences. This stands in contrast to the conventional binary choice of accepting or rejecting papers to be published as PDFs and other static formats.
Building a cohesive and efficient system for publishing, accruing credibility, and eliciting feedback for research aligned with effective altruism and global priorities. Our ultimate aim is to make rigorous research more impactful, and impactful research more rigorous.
See Content overview
Please do weigh in; all suggestions and comments will be credited. See also "Unjournal: public-facing FAQ" (in progress); remember to notify contact@unjournal.org if you make any comments.
As of December 2023, the prizes below have been chosen and will be soon announced. We are also scheduling an event linked to this prize. However, we are preparing for even larger author and evaluator prizes for our next phase. Submit your research to The Unjournal or serve as an evaluator to be eligible for future prizes (details to be announced).
Submit your work to be eligible for our “Unjournal: Impactful Research Prize” and a range of other benefits including the opportunity for credible public evaluation and feedback.
First-prize winners will be awarded $, and the runners-up will receive $1,000.
Note: these are the minimum amounts; we will increase these if funding permits.
Prize winners will have the opportunity (but not the obligation) to present their work at an online seminar and prize ceremony co-hosted by The Unjournal, Rethink Priorities, and EAecon.
To be eligible for the prize, submit a link to your work for public evaluation here.
Please choose “new submission” and “Submit a URL instead.”
The latter link requires an ORCID ID; if you prefer, you can email your submission to us instead.
The Unjournal, with funding from the Long-Term Future Fund and the Survival and Flourishing Fund, organizes and funds public, journal-independent feedback and evaluation. We focus on research that is highly relevant to global priorities, especially in economics, social science, and impact evaluation, and aim to expand this widely. We encourage better research by making it easier for researchers to get feedback and credible ratings on their work.
We aim to publicly evaluate 15 papers (or projects) within our pilot year. This award will honor researchers doing robust, credible, transparent work with a global impact. We especially encourage the submission of research in "open" formats such as hosted dynamic documents (Quarto, R-markdown, Jupyter notebooks, etc.).
The research will be chosen by our management team for public evaluation by 2–3 carefully selected, paid reviewers based on an initial assessment of a paper's methodological strength, openness, clarity, relevance to global priorities, and the usefulness of further evaluation and public discussion. We sketch out these criteria here.
All evaluations, including quantitative ratings, will be made public by default; however, we will consider "embargos" on this for researchers with sensitive career concerns (the linked form asks about this). Note that submitting your work to The Unjournal does not imply "publishing" it: you can submit it to any journal before, during, or after this process.
If we choose not to send your work out to reviewers, we will try to at least offer some brief private feedback.
All work evaluated by The Unjournal will be eligible for the prize. Engagement with The Unjournal, including responding to evaluator comments, will be a factor in determining the prize winners. We also have a slight preference for giving at least one of the awards to an early-career researcher, but this need not be determinative.
Our management team and advisory board will vote on the prize winners in light of the evaluations, with possible consultation of further external expertise.
Deadline: Extended until 5 December (to ensure eligibility).
Note: In a subsection below, Recap: submissions, we outline the basic requirements for submissions to The Unjournal.
The prize winners for The Unjournal's Impactful Research Prize were selected through a multi-step, collaborative process involving both the management team and the advisory board. The selection was guided by several criteria, including the quality and credibility of the research, its potential for real-world impact, and the authors' engagement with The Unjournal's evaluation process.
Initial Evaluation: All papers that were evaluated by The Unjournal were eligible for the prize. The discussion, evaluations, and ratings provided by external evaluators played a significant role in the initial shortlisting.
Management and Advisory Board Input: Members of the management committee and advisory board were encouraged to write brief statements about papers they found particularly prize-worthy.
Meeting and Consensus: A "prize committee" meeting was held with four volunteers from the management committee to discuss the shortlisted papers and reach a consensus. The committee considered both the papers and the content of the evaluations. Members of the committee allocated a total of 100 points among the 10 candidate papers. We used this to narrow down a shortlist of five papers.
Point Voting: The above shortlist and the notes from the accompanying discussion were shared with all management committee and advisory board members. Everyone in this larger group was invited to allocate up to 100 points among the shortlisted papers (and asked to allocate fewer points if they were less familiar with the papers and evaluations).
Special Considerations: We decided that at least one of the winners had to be a paper submitted by the authors or one where the authors substantially engaged with The Unjournal's processes. However, this constraint did not prove binding. Early-career researchers were given a slight advantage in our consideration.
Final Selection: The first and second prizes were given to the papers with the first- and second-most points, respectively.
This comprehensive approach aimed to ensure that the prize winners were selected in a manner that was rigorous, fair, and transparent, reflecting the values and goals of The Unjournal.
Text to accompany the Impactful Research Prize discussion
Note: This section largely repeats content found elsewhere in this GitBook.
Jan. 2024: We have lightly updated this page to reflect our current systems.
We describe the nature of the work we are looking to evaluate, along with examples, elsewhere in this GitBook. (Update 2024: this is now better characterized in more recent sections.)
If you are interested in submitting your work for public evaluation, we are looking for research which is relevant to global priorities—especially quantitative social sciences—and impact evaluations. Work that would benefit from further feedback and evaluation is also of interest.
Your work will be evaluated using our evaluation guidelines and metrics. You can read these before submitting.
Important Note: We are not a journal. By having your work evaluated, you will not be giving up the opportunity to have your work published in a journal. We simply operate a system that allows you to have your work independently evaluated.
If you think your work fits our criteria and would like it to be publicly evaluated, please submit your work through .
If you would like to submit more than one of your papers, you will need to complete a new form for each paper you submit.
By default, we would like Unjournal evaluations to be made public. We think public evaluations are generally good for authors. However, in special circumstances, and particularly for very early-career researchers, we may make exceptions.
If there is an early-career researcher on the author team, we will allow authors to "embargo" the publication of the evaluation until a later date. This date is contingent, but not indefinite. The embargo lasts until after a PhD/postdoc’s upcoming job search or until it has been published in a mainstream journal, unless:
the author(s) give(s) earlier permission for release; or
a fixed upper limit of 14 months is reached.
If you would like to request an exception to a public evaluation, you will have the opportunity to explain your reasoning in the submission form.
The Unjournal presents an additional opportunity for evaluation of your work with an emphasis on impact.
Substantive feedback will help you improve your work—especially useful for young scholars.
Ratings can be seen as markers of credibility for your work that could help your career advancement at least at the margin, and hopefully help a great deal in the future. You also gain the opportunity to publicly respond to critiques and correct misunderstandings.
You will gain visibility and a connection to the EA/Global Priorities communities and the Open Science movement.
You can take advantage of this opportunity to gain a reputation as an ‘early adopter and innovator’ in open science.
You can win prizes: You may win a “best project prize,” which could be financial as well as reputational.
Entering into our process will make you more likely to be hired as a paid reviewer or editorial manager.
We will encourage media coverage.
If we consider your work for public evaluation, we may ask for some of the items below, although most are optional. We will aim to make this a very light touch for authors.
A link to a non-paywalled, hosted version of your work (in any format—PDFs are not necessary) that can be given a Digital Object Identifier (DOI). Again, we will not be "publishing" this work, just evaluating it.
A link to data and code, if possible. We will work to help you to make it accessible.
Assignment of two evaluators who will be paid to assess your work. We will likely keep their identities confidential, although this is flexible depending on the reviewer. Where it seems particularly helpful, we will facilitate a confidential channel to enable a dialogue with the authors. One person on our managing team will handle this process.
Have evaluators publicly post their evaluations (i.e., 'reviews') of your work on our platform. As noted above, we will ask them to provide feedback, thoughts, suggestions, and some quantitative ratings for the paper.
By completing the submission form, you are providing your permission for us to post the evaluations publicly unless you request an embargo.
You will have a two-week window to respond through our platform before anything is posted publicly. Your responses can also be posted publicly.
We are again considering applications for the 'evaluation metrics/meta-science' role. We will also consider all applicants for our open positions, and for roles that may come up in the future.
The potential roles discussed below combine research-linked work with operations and administrative responsibilities. Overall, this may include some combination of:
Assisting and guiding the process of identifying strong and potentially impactful work in key areas, explaining its relevance, its strengths, and areas warranting particular evaluation and scrutiny
Interacting with authors, recruiting, and overseeing evaluators
Synthesizing and disseminating the results of evaluations and ratings
Aggregating and benchmarking these results
Helping build and improve our tools, incentives, and processes
Curating outputs relevant to other researchers and policymakers
Doing "meta-science" work
See also our field specialist team pool and evaluator pool. Most of these roles involve compensation/honorariums.
(Nov. 2023: Note, we cannot guarantee that we will be hiring for this role, because of changes in our approach.)
These are principally not research roles, but familiarity with research and research environments will be helpful, and there is room for research involvement depending on the candidate’s interest, background, and skills/aptitudes.
There is currently one such role:
(As of November 2023, still seeking freelancers)
Further note: We previously considered a “Management support and administrative professional” role. We are not planning to hire for this role currently. Those who indicated interest will be considered for other roles.
As of November 2023, we are soliciting applications for freelancers with skills in particular areas.
The Unjournal is looking to work with a proficient writer who is adept at communicating with academics and researchers (particularly in economics, social science, and policy), journalists, policymakers, and philanthropists. As we are in our early stages, this is a generalist role. We need someone to help us explain what The Unjournal does and why, make our processes easy to understand, and ensure our outputs (evaluations and research synthesis) are accessible and useful to non-specialists. We seek someone who values honesty and accuracy in communication; someone who has a talent for simplifying complex ideas and presenting them in a clear and engaging way.
The work is likely to include:
Promotion and general explanation
Spread the word about The Unjournal, our approach, our processes, and our progress in press releases and short pieces, as well as high-value emails and explanations for a range of audiences
Make the case for The Unjournal to potentially skeptical audiences in academia/research, policy, philanthropy, effective altruism, and beyond
Keeping track of our progress and keeping everyone in the loop
Help produce and manage our external (and some internal) long-form communications
Help produce and refine explanations, arguments, and responses
Help provide reports to relevant stakeholders and communities
Making our rules and processes clear to the people we work with
Explain our procedures and policies for research submission, evaluation, and synthesis; make our systems easy to understand
Help us build flexible communications templates for working with research evaluators, authors, and others
Other communications, presentations, and dissemination
Write and organize content for grants applications, partnership requests, advertising, hiring, and more
Potentially: compose non-technical write-ups of Unjournal evaluation synthesis content (in line with interest and ability)
Most relevant skills, aptitudes, interests, experience, and background knowledge:
Understanding of The Unjournal project
Strong written communications skills across a relevant range of contexts, styles, tones, and platforms (journalistic, technical, academic, informal, etc.)
Familiarity with academia and research processes and institutions
Familiarity with current conversations and research on global priorities within government and policy circles, effective altruism, and relevant academic fields
Willingness to learn and use IT, project management, data management, web design, and text-parsing tools (such as those mentioned below), with the aid of GPT/AI chat
Further desirable skills and experience:
Academic/research background in areas related to The Unjournal’s work
Operations, administrative, and project management experience
Experience working in a small nonprofit institution
Experience with promotion and PR campaigns and working with journalists and bloggers
Proposed terms:
Project-based contract "freelance" work
$30–$55/hour USD (TBD, depending on experience and capabilities). Hours for each project include some onboarding and upskilling time.
Our current budget can cover roughly 200 hours of this project work. We hope to increase and extend this (depending on our future funding and expenses).
This role is contract-based and supports remote and international applicants. We can contract people living in most countries, but we cannot serve as an immigration sponsor.
For more information on why authors may want to engage and what we may ask authors to do, please see the sections for authors above.
A "curated guide" to this GitBook; updated June 2023
You can now ask questions of this GitBook using a chatbot: click the search bar or press cmd-k and choose "ask Gitbook."
Frequently Asked Questions (FAQ) For authors, evaluators, etc.
Explanations & outreach Writeups of the main points for a few different audiences
Why Unjournal? Some key benefits of the initiative and our approaches, with links to deeper commentary
Our policies: evaluation & workflow How we choose papers/projects to evaluate, how we assign evaluators, and so on
Parallel/partner initiatives and resources Groups we work with; comparing approaches
What is global-priorities-relevant research? What research are we talking about? What will we cover?
These are of more interest to people within our team; we are sharing these in the spirit of transparency.
Plan of action A "best feasible plan" for going forward
Grants and proposals Successful proposals (ACX, SFF), other applications, initiatives
UJ Team: resources, onboarding Key resources and links for managers, advisory board members, staff, team members and others involved with The Unjournal project
Note: we may move some of this "internal interest content" over to our Clickup knowledge base.
The Unjournal call for participants and research
See "The Unjournal in a nutshell" for an overview of The Unjournal.
In brief (TLDR): If you are interested in being on The Unjournal's management committee, advisory board, or evaluator pool, please fill out this form (about 3–5 min).
If you have research you would like us to assess, please fill out this form. You can also submit your own work here, or by contacting us.
Please note that while data submitted through the above forms may be shared internally within our Management Team, it will not be publicly disclosed. Data protection statement linked here.
I am David Reinstein, founder and co-director of The Unjournal. We have an open call for committee members, board members, reviewers, and suggestions for relevant work for The Unjournal to evaluate.
The Unjournal team is building a system for credible, public, journal-independent feedback and evaluation of research.
We maintain an open call for participants for four different roles:
Management Committee members (involving honorariums for time spent)
Advisory Board members (no time commitment)
Field Specialists (who will often also be on the Advisory Board)
A pool of Evaluators (who will be paid for their time and their work; we also draw evaluators from outside this pool)
The roles are explained in more detail here. You can express your interest (and enter our database) here.
We will reach out to evaluators (a.k.a. "reviewers") on a case-by-case basis, appropriate for each paper or project being assessed. This is dependent on expertise, the researcher's interest, and a lack of conflict of interest.
Time commitment: Case-by-case basis. For each evaluation, here are some guidelines for the amount of time to spend.
Compensation: We pay a minimum of $200 (updated Aug. 2024) for a prompt and complete evaluation, and $400 for experienced evaluators. We offer additional prizes and incentives, and are committed to an average compensation of at least $450 per evaluator. See here for more details.
Who we are looking for: We are putting together a list of people interested in being an evaluator and doing paid referee work for The Unjournal. We generally prioritize the pool of evaluators who signed up for our database before reaching out more widely.
Interested? Please fill out this form (about 3–5 min, same form for all roles or involvement).
We are looking for high-quality, globally pivotal research projects to evaluate, particularly those embodying open science practices and innovative formats. We are putting out a call for relevant research. Please suggest research here. (We offer bounties and prizes for useful suggestions.) For details of what we are looking for, and some potential examples, see this post and accompanying links.
You can also put forward your own work.
We provide a separate form for research suggestions here. We may follow up with you individually.
We invite you to fill in this form to leave your contact information, as well as outlining which parts of the project you may be interested in.
Note: This is under continual refinement; see our policies for more details.
I was in academia for about 20 years (PhD Economics, UC Berkeley; Lecturer, University of Essex; Senior Lecturer, University of Exeter). I saw how the journal system was broken.
Academics constantly complain about it (but don't do anything to improve it).
Most conversations are not about research, but about 'who got into what journal' and 'tricks for getting your paper into journals'.
Open science and replicability are great, and dynamic documents make research a lot more transparent and readable. But these goals and methods are very hard to apply within the traditional journal system and its 'PDF prisons'.
Now I'm working outside academia and can stick my neck out. I have the opportunity to help fix the system. I work with research organizations and large philanthropists involved with effective altruism and global priorities. They care about the results of research in areas that are relevant to global priorities. They want research to be reliable, robust, reasoning-transparent, and well-communicated. Bringing them into the equation can change the game.
I (David Reinstein) am an economist who left UK academia after 15 years to pursue a range of projects. One of these is The Unjournal:
The Unjournal (with funding from the Long-Term Future Fund and the Survival and Flourishing Fund) organizes and funds public, journal-independent feedback and evaluation, paying reviewers for their work. We focus on research that is highly relevant to global priorities, especially in economics, social science, and impact evaluation. We encourage better research by making it easier for researchers to get feedback and credible ratings on their work.
We are looking for your involvement...
We want researchers who are interested in doing evaluation work for The Unjournal. We compensate evaluators for each evaluation, and we award monetary prizes for the strongest work. Right now we are particularly looking for economists and people with quantitative and policy-evaluation skills. We describe what we are asking evaluators to do in our guidelines for evaluators: essentially a regular peer review with some different emphases, plus a set of quantitative ratings and predictions. Your evaluation content would be made public (and receive a DOI, etc.), but you can choose whether or not to remain anonymous.
To sign up to be part of the pool of evaluators, or to get involved in The Unjournal project in other ways, please fill out this form or contact theunjournal@gmail.com.
We welcome suggestions for particularly impactful research that would benefit from (further) public evaluation. We choose research for public evaluation based on an initial assessment of methodological strength, openness, clarity, relevance to global priorities, and the usefulness of further evaluation and public discussion. We sketch out these criteria, and give some potential examples, elsewhere in this GitBook.
If you have research—your own or others'—that you would like us to assess, please fill out the suggestion form. You can also submit your own work, or contact us directly. Authors of evaluated papers will be eligible for our Impactful Research Prize.
We are looking for both feedback on and involvement in The Unjournal project. Feel free to reach out.
Disambiguation: The Unjournal focuses on commissioning expert evaluations, guided by an ‘evaluation manager’ and compensating people for their work. (See the outline of our main process). We plan to continue to focus on that mode. Below we sketch an additional parallel but separate approach.
The Unjournal is seeking academics, researchers, and students to submit structured evaluations of the most impactful research. Strong evaluations will be posted or linked on our platform, offering readers a perspective on the implications, strengths, and limitations of the research. These evaluations can be submitted using our structured forms, one for academic-targeted research and one for applied and policy research; evaluators can publish their name or maintain anonymity; we also welcome collaborative evaluation work. We will facilitate, promote, and encourage these evaluations in several ways, described below.
We are particularly looking for people with research training, experience, and expertise in quantitative social science and statistics, including cost-benefit modeling and impact evaluation. This could include professors, other academic faculty, postdocs, researchers outside of academia, quantitative consultants and modelers, PhD students, and students aiming towards PhD-level work (pre-docs, research MSc students, etc.). But anyone is welcome to give this a try — when in doubt, please go for it.
We are also happy to support collaborations and group evaluations. There is a good track record for this; see, for example, ASAPBio's collaborative review initiatives and other efforts in this vein. We may also host live events and/or facilitate asynchronous collaboration on evaluations.
Instructors/PhD, MRes, Predoc programs: We are also keen to work with students and professors to integrate ‘independent evaluation assignments’ (aka ‘learn to do peer reviews’) into research training.
Your work will support The Unjournal's core mission — improving impactful research through journal-independent public evaluation. In addition, you'll help research users (policymakers, funders, NGOs, fellow researchers) by providing high-quality, detailed evaluations that rate and discuss the strengths, limitations, and implications of research.
Doing an independent evaluation can also help you. We aim to provide feedback to help you become a better researcher and reviewer. We’ll also give prizes for the strongest evaluations. Lastly, writing evaluations will help you build a portfolio with The Unjournal, making it more likely we will commission you for paid evaluation work in the future.
We focus on rigorous, globally impactful research in quantitative social science and policy-relevant research. (See "What is global-priorities-relevant research?" for details.) We're especially eager to receive independent evaluations of:
Research we publicly prioritize: see the list of research we've prioritized or evaluated.
Research we previously evaluated (see our PubPub page)
Work that other people and organizations suggest as having high potential for impact/value of information
The Unjournal's structured evaluation forms: We encourage evaluators to submit evaluations using either our form for academic-targeted research or our form for applied and policy research (both described below).
Bounties: We will offer prizes for the ‘most valuable independent evaluations’.
All evaluation submissions will be eligible for these prizes and “grandfathered in” to any prizes announced later. We will announce and promote the prize winners (unless they opt for anonymity).
Evaluator pool: People who submit evaluations can elect to join our evaluator pool. We will consider and (time-permitting) internally rate these evaluations. People who do the strongest evaluations in our focal areas are likely to be commissioned as paid evaluators for The Unjournal.
We are reaching out to PhD programs and pre-PhD research-focused programs. Some curricula already involve “mock referee report” assignments. We hope professors will encourage their students to do these through our platform. In return, we’ll offer the incentives and promotion mentioned above, as well as resources, guidance, and some further feedback.
Crowdsourced feedback can add value in itself; encouraging this can enable some public evaluation and discussion of work that The Unjournal doesn't have the bandwidth to cover.
Improving our evaluator pool and evaluation standards in general.
Students and ECRs can practice and (if possible) get feedback on independent evaluations.
They can demonstrate this ability publicly, enabling us to recruit and commission the strongest evaluators.
Examples will help us build guidelines, resources, and insights into ‘what makes an evaluation useful’.
This provides us opportunities to engage with academia, especially in Ph.D programs and research-focused instruction.
19 Feb 2024. We are not currently hiring, but expect to do so in the future.
To indicate your potential interest in roles at The Unjournal, such as those described below, please fill out this form and link (or upload) your CV or webpage.
If you already filled out this form for a role that has changed titles, don’t worry. You will still be considered for relevant and related roles in the future.
If you add your name to this form, we may contact you to offer you the opportunity to do paid project work and paid work tasks.
Furthermore, if you are interested in conducting paid research evaluation for The Unjournal, or in joining our advisory board, please complete the linked form.
Feel free to contact contact@unjournal.org with any questions.
The Unjournal, a not-for-profit collective under the umbrella and fiscal sponsorship of the Open Collective Foundation (OCF), is an equal-opportunity employer and contractor. We are committed to creating an inclusive environment for all employees, volunteers, and contractors. We do not discriminate on the basis of race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetic information, disability, age, or veteran status.
See our data protection statement.
In addition to the jobs and paid projects listed here, we are expanding our management team, advisory board, field specialist team pool, and evaluator pool. Most of these roles involve compensation/honorariums.
Nov. 2023 update: We have paused this process to focus on our other open positions. We hope to come back to hiring researchers to implement these projects soon.
We are planning to hire 3–7 researchers for a one-off paid project.
There are two opportunities: Contracted Research (CR) and Independent Projects (IP).
Project Outline
What specific research themes in economics, policy, and social science are most important for global priorities?
What projects and papers are most in need of further in-depth public evaluation, attention, and scrutiny?
Where does "Unjournal-style evaluation" have the potential to be one of the most impactful uses of time and money? By impactful, we mean in terms of some global conception of value (e.g., the well-being of living things, the survival of human values, etc.).
This is an initiative that aims to identify, summarize, and conduct an in-depth evaluation of the most impactful themes in economics, policy, and social science to answer the above questions. Through a systematic review of selected papers and potential follow-up with authors and evaluators, this project will enhance the visibility, understanding, and scrutiny of high-value research, fostering both rigorous and impactful scholarship.
Contracted Research (CR) This is the main opportunity: a unique chance to contribute to the identification and in-depth evaluation of impactful research themes in economics, policy, and social science. We're looking for researchers and research users who can commit to a one-off 15–20 hours of work. CR candidates will:
Summarize a research area or theme, its status, and why it may be relevant to global priorities (~4 hours).
We are looking for fairly narrow themes. Examples might include:
The impact of mental health therapy on well-being in low-income countries.
The impact of cage-free egg regulation on animal welfare.
Public attitudes towards AI safety regulation.
Identify a selection of papers in this area that might be high-value for UJ evaluation (~3 hours).
Choose at least four of these from among NBER/"top-10 working paper" series (or from work submitted to the UJ – we can share – or from work where the author has expressed interest to you).
For a single paper, or a small set of these papers (or projects) (~6 hours):
Read the paper fairly carefully and summarize it, explaining why it is particularly relevant.
Discuss one or more aspects of the paper that need further scrutiny or evaluation.
Identify 3 possible evaluators, and explain why they might be particularly relevant to evaluate this work. (Give a few sentences we could use in an email to these evaluators).
Possible follow-up task: email and correspond with the authors and evaluators (~3 hours).
We are likely to follow up on your evaluation suggestions. We also may incorporate your writing into our web page and public posts; you can choose whether you want to be publicly acknowledged or remain anonymous.
Independent Projects (IP)
We are also inviting applications to do similar work as an “Independent Project” (IP), a parallel opportunity designed for those eager to engage but not interested in working under a contract, or not meeting some of the specific criteria for the Contracted Research role. This involves similar work to above.
If you are accepted to do an IP, we will offer some mentoring and feedback. We will also offer prize rewards/bounties for particularly strong IP work. We will also consider working with professors and academic supervisors on these IP projects, as part of university assignments and dissertations.
You can apply to the CR and IP positions together; we will automatically consider you for each.
Get Involved!
If you are interested in discussing any of the above in person, please email us to arrange a conversation.
You can also suggest research yourself and then do an independent evaluation of it.
We're looking for careful methodological/technical evaluations that focus on research credibility, impact, and usefulness. We want evaluators to dig into the weeds, particularly in areas where they have aptitude and expertise. See our guidelines for evaluators.
Our form for academic-targeted research: use this if you are evaluating research aimed at an academic journal or a similar outlet.
Our form for applied and policy research: use this if you are evaluating research that is probably not aimed at an academic journal. This may include somewhat less technical work, such as reports from policy organizations and think tanks, or impact assessments and cost-benefit analyses.
Other public evaluation platforms: We are also open to engaging with evaluations done on existing public evaluation platforms. Evaluators: If you prefer to use another platform, please let us know about your evaluation using one of the forms above. If you like, you can leave most of our fields blank and provide a link to your evaluation on the other public platform.
Academic (~PhD) assignments and projects: We are also looking to build ties with research-intensive university programs; we can help you structure academic assignments and provide external reinforcement and feedback. Professors, instructors, and PhD students: please contact us.
We will encourage all these independent evaluations to be publicly hosted, and we will share links to them. We will further promote the strongest independent evaluations.
However, when we host or link these, we will keep them clearly separated and signposted as distinct from the commissioned evaluations; independent evaluations will not be considered official, and their ratings won't be included in our aggregated data and dashboard.
As a start, after the first eight independent evaluations are submitted (or by Jan. 1, 2025, whichever comes later), we will award a prize of $500 to the most valuable evaluation.
Further details tbd.
We're also moving towards a two-tiered base rate: we will offer a higher rate to people who can demonstrate previous strong review/evaluation work. These independent evaluations will count towards this 'portfolio'.
Our published evaluations provide examples of strong work.
We will curate guidelines and learning materials from relevant fields and from applied work and impact evaluation.
The Unjournal commissions public evaluations of impactful research in quantitative social science fields. We are an alternative and a supplement to traditional academic peer-reviewed journals: separating evaluation from journals unlocks a range of benefits. We ask expert evaluators to write detailed, constructive, critical reports. We also solicit a set of structured ratings focused on research credibility, methodology, careful and calibrated presentation of evidence, reasoning transparency, replicability, relevance to global priorities, and usefulness for practitioners (including funders, project directors, and policymakers who rely on this research). While we have mainly targeted impactful research from academia, our applied and policy stream covers impactful work that uses formal quantitative methods but is not aimed at academic journals. So far, we've commissioned about 50 evaluations of 24 papers, and published these evaluation packages on PubPub, linked to academic search engines and bibliometrics.
We will compensate you for your time at a rate reflecting your experience and skills ($25–$65/hour). This work also has the potential to serve as a “work sample” for future roles at The Unjournal, as it is highly representative of what our and are commissioned to do.
If you are interested in involvement in either the CR or IP side of this project, please let us know.
22 Aug 2024: we will be moving our latest updates to our main home page 'news'.
Research evaluation is changing: New approaches go beyond the traditional journal model, promoting transparency, replicability, open science, open access, and global impact. You can be a part of this.
Join us on March 25 for an interactive workshop, featuring presentations from Macie Daley (Center for Open Science), David Reinstein (The Unjournal), Gary Charness (UC Santa Barbara), and The Unjournal’s Impactful Research Prize and Evaluator Prize winners. Breakout discussions, Q&A, and interactive feedback sessions will consider innovations in open research evaluation, registered revisions, research impact, and open science methods and career opportunities.
The event will be held fully online on Zoom, on March 25, from 9:00–11:30 AM (EST) and 9:30 PM–midnight (EST), to accommodate a range of time zones. UTC: 25 March, 1:00–3:30 PM, and 26 March, 1:30–4:00 AM. The event is timetabled: feel free to participate in any part you wish.
See the event page here for all details, and to register.
Impactful Research Prize Winners
With the completed set of evaluations of "Do Celebrity Endorsements Matter? A Twitter Experiment Promoting Vaccination in Indonesia" and "The Governance of Non-Profits and Their Social Impact: Evidence from a Randomized Program in Healthcare in DRC,” our pilot is complete:
10 research papers evaluated
21 evaluations
5 author responses
You can see this output most concisely in our PubPub collection here (evaluations are listed as "supplements," at least for the time being).
For a continuously updated overview of our process, including our evaluation metrics, see our "data journalism" notebook hosted here.
Remember, we assign individual DOIs to all of these outputs (evaluation, responses, manager syntheses) and aim to get the evaluation data into all bibliometrics and scholarly databases. So far, Google Scholar has picked up one of our outputs. (The Google Scholar algorithm is a bit opaque—your tips are welcome.)
We will make decisions and award our pilot Impactful Research Prize and evaluator prizes soon (aiming for the end of September). The winners will be determined by a consensus of our management team and advisory board (potentially consulting external expertise). The choices will largely be driven by the ratings and predictions given by Unjournal evaluators. After we make the choices, we will make our decision process public and transparent.
Following this, we are considering holding an online workshop (that will include a ceremony for the awarding of prizes). Authors and (non-anonymous) evaluators will be invited to discuss their work and take questions. We may also hold an open discussion and Q&A on The Unjournal and our approach. We aim to partner with other organizations in academia and in the impactful-research and open-science spaces. If this goes well, we may make it the start of a regular thing.
"Impactful research online seminar": If you or your organization would be interested in being part of such an event, please do reach out; we are looking for further partners. We will announce the details of this event once these are finalized.
Our pilot yielded a rich set of data and learning by doing. We plan to make use of this, including:
synthesizing and reporting on evaluators' and authors' comments on our process; adapting these to make it better;
analyzing the evaluation metrics for patterns, potential biases, and reliability measures;
"aggregating expert judgment" from these metrics;
tracking future outcomes (traditional publications, citations, replications, etc.) to benchmark the metrics against; and
drawing insights from the evaluation content, and then communicating these (to policymakers, etc.).
We continue to develop processes and policies around "which research to prioritize." For example, we are discussing whether we should set targets for different fields, for related outcome "cause categories," and for research sources. We intend to open up this discussion to the public to bring in a range of perspectives, experience, and expertise. We are working towards a grounded framework and a systematic process to make these decisions. See our expanding notes, discussion, and links on "what is global-priorities relevant research?"
We are still inviting applications for the paid standalone project helping us accomplish these frameworks and processes. Our next steps:
Building our frameworks and principles for prioritizing research to be evaluated, a coherent approach to implementation, and a process for weighing and reassessing these choices. We will incorporate previous approaches and a range of feedback. For a window into our thinking so far, see our "high-level considerations" and our practical prioritization concerns and goals.
Building research-scoping teams of field specialists. These will consider agendas in different fields, subfields, and methods (psychology, RCT-linked development economics, etc.) and for different topics and outcomes (global health, attitudes towards animal welfare, social consequences of AI, etc.) We begin to lay out possible teams and discussions here (the linked discussion spaces are private for now, but we aim to make things public whenever it's feasible). These "field teams" will
discuss and report on the state of research in their areas, including where and when relevant research is posted publicly, and in what state;
consider the potential for Unjournal evaluation of this work, as well as when and how we should evaluate it, considering potential variations from our basic approach; and
determine how to prioritize work in this area for evaluation, reporting general guidelines and principles, and informing the aforementioned frameworks.
Most concretely, the field teams will divide up the space of research work to be scoped and prioritized among the members of the teams.
Our previous call for field specialists is still active. We received a lot of great applications and strong interest, and we plan to send out invitations soon. But the door is still open to express interest!
New members of our team: Welcome Rosie Bettle (Founder's Pledge) to our advisory board, as a field specialist.
As part of our scale-up (and in conjunction with supporting PubPub on their redesigned platform), we're hoping to improve our evaluation procedure and metrics. We want to make these clearer to evaluators, more reliable and consistent, and more useful and informative to policymakers and other researchers (including meta-analysts).
We don't want to reinvent the wheel (unless we can make it a bit more round). We will be informed by previous work, such as:
existing research into the research evaluation process, and on expert judgment elicitation and aggregation;
practices from projects like RepliCATS/IDEAS, PREreview, BITSS Open Policy Analysis, the "Four validities" in research design, etc.; and
metrics used (e.g., "risk of bias") in systematic reviews and meta-analyses as well as databases such as 3ie's Development Evidence Portal.
Of course, our context and goals are somewhat distinct from the initiatives above.
We also aim to consult potential users of our evaluations as to which metrics they would find most helpful.
(A semi-aside: The choice of metrics and emphases could also empower efforts to encourage researchers to report policy-relevant parameters more consistently.)
We aim to bring a range of researchers and practitioners into these questions, as well as engaging in public discussion. Please reach out.
Yes, I was on a podcast, but I still put my trousers on one arm at a time, just like everyone else! Thanks to Will Ngiam for inviting me (David Reinstein) on "ReproducibiliTea" to talk about "Revolutionizing Scientific Publishing" (or maybe "evolutionizing" ... if that's a word?). I think I did a decent job of making the case for The Unjournal, in some detail. Also, listen to find out what to do if you are trapped in a dystopian skating rink! (And find out what this has to do with "advising young academics.")
I hope to do more of this sort of promotion: I'm happy to go on podcasts and other forums and answer questions about The Unjournal, respond to doubts you may have, consider your suggestions and discuss alternative initiatives.
Some (other) ways to follow The Unjournal's progress
Check out our PubPub page to read evaluations and author responses.
Follow @GivingTools (David Reinstein) on Twitter or Mastodon, or the hashtag #unjournal (when I remember to use it).
Visit Action and progress for an overview.
MailChimp link: Sign up below to get these progress updates in your inbox about once per fortnight, along with opportunities to give your feedback.
Alternatively, fill out this quick survey to get this newsletter and tell us some things about yourself and your interests. The data protection statement is linked here.
Progress notes: We will keep track of important developments here before we incorporate them into the main GitBook. Members of the UJ team can add further updates here or in this linked Gdoc; we will incorporate changes.
See also Previous updates
Hope these updates are helpful. Let me know if you have suggestions.
Several expositions for different audiences, fleshing out ideas and plans
See/subscribe to our YouTube channel
See slide deck (Link: bit.ly/unjourrnalpresent; offers comment access)
Earlier discussion document, aimed at EA/global priorities, academic, and open-science audiences [link]
2021: A shorter outline posted on onscienceandacademia.org
The Unjournal was founded by David Reinstein, who maintains this wiki/GitBook and other resources.
(See our note on terminology.)
David Reinstein, Founder and Co-director
, Interdisciplinary Researcher at ; Co-director
, Social Scientist and Associate Professor in the Guelph Institute of Development Studies and Department of Political Science at the University of Guelph, Canada
, Economics PhD student at the University of California, Merced
, Research Author at the Department of Psychology, (India)
, Senior Research Fellow, Institute of Environment & Sustainability, Lee Kuan Yew School of Public Policy, National University of Singapore
, Research Scientist (fellow) at (South Africa)
, Research Author at the Department of Economics at (India)
Building a "best feasible plan".
What is this Unjournal? See the overview above.
See the vision and broad plan presented (and embedded below), updated August 2023.
Status: Mostly completed/decided for pilot phase
Which projects enter the review system (relevance, minimal quality, stakeholders, any red lines or "musts")
How projects are to be submitted
How reviewers are to be assigned and compensated
Status: Mostly completed/decided for pilot phase; will review after initial trial
To be done on the chosen open platform (Kotahi/Sciety) unless otherwise infeasible (10 Dec 2022 update)
Share, advertise, promote this; have efficient meetings and presentations
Establish links to all open-access bibliometric initiatives (to the extent feasible)
Harness and encourage additional tools for quality assessment, considering cross-links to prediction markets/Metaculus, to coin-based 'ResearchHub', etc.
Status: Mostly completed/decided for pilot phase; will review after the initial trial
Status: We are still working with Google Docs and building an external survey interface. We plan to integrate this with PubPub over the coming months (August/Sept. 2023)
The Unjournal is delighted to announce the winners of our inaugural Impactful Research Prize. We are awarding our first prize to Takahiro Kubo (NIES Japan and Oxford University) and co-authors for their research titled "". The paper stood out for its intriguing question, the potential for policy impact, and methodological strength. We particularly appreciated the authors’ open, active, and detailed engagement with our evaluation process.
The second prize goes to Johannes Haushofer (NUS Singapore and Stockholm University) and co-authors for their work "". Our evaluators rated this paper among the highest across a range of metrics. It was highly commended for its rigor, the importance of the topic, and the insightful discussion of cost-effectiveness.
We are recognizing exceptional evaluators for credible, insightful evaluations. Congratulations to Phil Trammell (Global Priorities Institute at the University of Oxford), Hannah Metzler (Complexity Science Hub Vienna), Alex Bates (independent researcher), and Robert Kubinec (NYU Abu Dhabi).
We would like to congratulate all of the winners on their contributions to open science and their commitment to rigorous research. We also thank the other authors who submitted their work but were not selected at this time; we received many excellent submissions, and we are committed to supporting authors beyond this research prize.
Please see the full press release, as well as award details, below and :
22 Feb 2024: The Unjournal is scaling up, and looking to build our team. Please consider the roles below and express your interest or contact us at contact@unjournal.org.
Activities of those on the management committee may involve a combination of the following (although you can choose your focus):
Contributing to the decision-making process regarding research focus, reviewer assignment, and prize distribution.
Collaborating with other committee members on the establishment of rules and guidelines, such as determining the metrics for research evaluation and defining the mode of assessment publication.
Helping plan The Unjournal’s future path.
Helping monitor and prioritize research for The Unjournal to evaluate (i.e., acting as a ; see further discussion below). Acting as an for research in your area.
Time commitment: A minimum of 15–20 hours per year.
Compensation: We have funding for a $57.50 per hour honorarium for the first 20 hours, with possible compensation . will be further compensated (at roughly $300–$450 per paper).
Who we are looking for: All applicants are welcome. We are especially interested in those involved in global priorities research (and related fields), policy research and practice, open science and meta-science, bibliometrics and scholarly publishing, and any other academic research. We want individuals with a solid interest in The Unjournal project and its goals, and the ability to meet the minimal time commitment. Applying is extremely quick, and those not chosen will be considered for other roles and work going forward.
Beyond direct roles within The Unjournal, we're building a larger, more passive advisory board to be part of our network, to offer occasional feedback and guidance, and to act as an evaluation manager when needed (see our ).
There is essentially no minimum time commitment for advisory board members—only opportunities to engage. We sketch some of the expectations in the fold below.
FSs will focus on a particular area of research, policy, or impactful outcome. They will keep track of new or under-considered research with potential for impact and explain and assess the extent to which The Unjournal can add value by commissioning its evaluation. They will "curate" this research and may also serve as evaluation managers for this work.
Some advisory board members will also be FSs, although some may not (e.g., because they don't have a relevant research focus).
Time commitment: There is no specific time obligation—only opportunities to engage. We may also consult you occasionally on your areas of expertise. Perhaps 1–4 hours a month is a reasonable starting expectation for people already involved in doing or using research, plus potential additional paid assignments.
Who we are looking for: For the FS roles, we are seeking active researchers, practitioners, and stakeholders with a strong publication record and/or involvement in research or research-linked policy and prioritization processes. For the advisory board, we also seek people with connections to academic, governmental, or relevant non-profit institutions, and/or involvement in open science, publication, and research evaluation processes, as well as people who can offer relevant advice, experience, or guidance, or help communicate our goals, processes, and progress.
, Infectious Disease Researcher, London School of Hygiene and Tropical Medicine
, Associate Professor of Marketing, London Business School
, Applied Researcher (Global Health & Development) at Founder's Pledge
, Professor of Economics, UC Santa Barbara
, Post-Doctoral Researcher in the Department of Quantitative Methods and Economic Theory at the University of Chieti (Italy)
, Metascience Program Lead, Federation of American Scientists
, Managing Editor at : writing and research on global health, development, and nutrition
, Professor of Statistics and Political Science at Columbia University (New York)
, Associate Professor, University of Melbourne (Australia): expert judgment, biosciences, applied probability, uncertainty quantification
, Late-Stage PhD Student in Information Systems at the University of Cologne, Germany
, PhD Student, Applied Economics, University of Minnesota
Postdoctoral researcher, Institute for Interactive Systems and Data Science at Graz University of Technology (Austria)
, Associate Researcher, INRAE, Member, Toulouse School of Economics (France)
, Data Scientist, Economist Consultant; PhD University of British Columbia (Economics)
The table below shows all the members of our team (including field specialists) taking on a research-monitoring role (see for a description of this role).
, Research Specialist: Data science, metascience, aggregation of expert judgment
, Operations generalist
, Generalist assistance
, Communications (academic research/policy)
, Communications and copy-editing
, Communications and consulting
, technical software support
Red Bermejo, Mikee Mercado, Jenny Siers – consulting (through ) on strategy, marketing, and task management tools
We are a member of . They are working with us to update PubPub and incorporate new features (editorial management, evaluation tools, etc.) that will be particularly useful to The Unjournal and other members.
Abel Brodeur, Founder/chair of the
The
, Assistant Professor in the Department of Economics at the University of Toronto
Other academic and policy economists, such as , , , , , and
Cooper Smout, head of
, Center for Open Science
, Faculty Director, Berkeley Initiative for Transparency in the Social Sciences (BITSS)
Daniel Saderi,
, who helped me put this proposal together through asking a range of challenging questions and offering his feedback
, Experimental Psychologist at the Human-Technology Interaction group at Eindhoven University of Technology (Netherlands), has also completed research with the Open Science Collaboration and the Peer Reviewers’ Openness Initiative
Paolo Crosetto (Experimental Economics, French National Research Institute for Agriculture, Food and Environment)
Sergey Frolov (Physicist), Prof. J.-S. Caux, Physicist and head of
Alex Barnes, Business Systems Analyst,
Nathan Young ; considering connecting The Unjournal to Metaculus predictions
Edo Arad (mathematician and EA research advocate)
See also (in ACX grant proposal).
Updated:
See for proposed specifics.
Define the broad scope of our research interest and key overriding principles. Light-touch, to also be attractive to aligned academics
Build "editorial-board-like" teams with subject or area expertise
See for a first pass.
See our .
See our .
We are currently prioritizing bringing in more field specialists to build our teams in a few areas, particularly including:
Our document also provides some guidance on the nature of the work and the time involved.
Compensation: We have put together a preliminary/trial compensation formula (); we aim to fairly compensate people for time spent on work done to support The Unjournal, and to provide incentives for suggesting and helping to prioritize research for evaluation. In addition, will be compensated at roughly $300–$450 per project.
Interested? Please fill out (about 3–5 min, using the same form for all roles).
We will compensate you for the time you spend on this process (details tbd), particularly to the extent that the time you spend does not contribute to your other work or research. (See trial .)
See:
If you are interested in discussing any of the above in person, please email us () to arrange a conversation.
We invite you to fill in (the same as that linked above) to leave your contact information and outline which parts of the project interest you.
Note: These descriptions are under continual refinement; see our for more details.
Host article (or dynamic research project or 'registered report') on OSF or another place allowing time stamping & DOIs (see for a start)
Link this to (or similar tool or site) to solicit feedback and evaluation without requiring exclusive publication rights (again, see )
Also: Commit to publishing academic reviews, or sharing them in our internal group, for further evaluation, reassessment, or benchmarking of the ‘PREreview’-type reviews above (perhaps taking the ).
An important part of making this a success will be to spread the word, to get positive attention for this project, to get important players on board, network externalities, and change the equilibrium. We are also looking for specific feedback and suggestions from "mainstream academics" in Economics, Psychology, and policy/program evaluation, as well as from the Open Science and EA communities.
TLDR: Unjournal promotes research replicability/robustness
Unjournal evaluations aim to support the "Reproducibility/Robustness-Checking" (RRC) agenda. We are directly engaging with the Institute for Replication (I4R) and the repliCATS project (RC), and building connections to Replication Lab/TRELiSS and Metaculus.
We will support this agenda by:
Promoting data and code sharing: We request pre-print authors to share their code and data, and reward them for their transparency.
Promoting 'Dynamic Documents' and 'Living Research Projects': Breaking out of "PDF prisons" to achieve increased transparency.
Encouraging detailed evaluations: Unjournal evaluators are asked to:
highlight the key/most relevant research claims, results, and tests;
propose possible robustness checks and tests (RRC work); and
make predictions for these tests.
Implementing computational replication and robustness checking: We aim to work with I4R and other organizations to facilitate and evaluate computational replication and robustness checking.
Advocating for open evaluation: We prioritize making the evaluation process transparent and accessible for all.
While the replication crisis in psychology is well known, economics is not immune. Some very prominent and influential work has blatant errors, depends on dubious econometric choices or faulty data, is not robust to simple checks, or uses likely-fraudulent data. Roughly 40% of experimental economics work fails to replicate. Prominent commenters have argued that the traditional journal peer-review system does a poor job of spotting major errors and identifying robust work.
My involvement with the SCORE replication market project shed light on a key challenge (see Twitter posts): The effectiveness of replication depends on the claims chosen for reproduction and how they are approached. I observed that it was common for the chosen claim to miss the essence of the paper, or to focus on a statistical result that, while likely to reproduce, didn't truly convey the author's message.
Simultaneously, I noticed that many papers had methodological flaws (for instance, lack of causal identification or the presence of important confounding factors in experiments). But I thought that these studies, if repeated, would likely yield similar results. These insights emerged from only a quick review of hundreds of papers and claims. This indicates that a more thorough reading and analysis could potentially identify the most impactful claims and elucidate the necessary RRC work.
Indeed, detailed, high-quality referee reports for economics journals frequently contain such suggestions. However, these valuable insights are often overlooked and rarely shared publicly. The Unjournal aims to change this paradigm by focusing on three main strategies:
Identifying vital claims for replication:
We plan to have Unjournal evaluators help highlight key "claims to replicate," along with proposing replication goals and methodologies. We will flag papers that particularly need replication in specific areas.
Public evaluation and author responses will provide additional insight, giving future replicators more than just the original published paper to work with.
Encouraging author-assisted replication:
The Unjournal's platform and metrics, promoting dynamic documents and transparency, simplify the process of reproduction and replication.
By emphasizing replicability and transparency at the working-paper stage (Unjournal evaluations’ current focus), we make authors more amenable to facilitate replication work in later stages, such as post-traditional publication.
Predicting replicability and recognizing success:
We aim to ask Unjournal evaluators to make predictions about replicability. When the evaluated claims are successfully replicated, we can offer recognition. The same holds for repliCATS aggregated/IDEA group evaluations: to know whether we are credibly assessing replicability, we need to compare these assessments to at least some "replication outcomes."
The potential to compare these predictions to actual replication outcomes allows us to assess the credibility of our replicability evaluations. It may also motivate individuals to become Unjournal evaluators, attracted by the possibility of influencing replication efforts.
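To illustrate how such a comparison could work in principle, here is a minimal sketch (with made-up numbers, not actual Unjournal data) that scores probabilistic replication predictions against eventual outcomes using a Brier score:

```python
# Hypothetical example: evaluators' predicted probabilities that each claim
# will replicate, paired with eventual replication outcomes (1 = replicated).
predictions = [0.8, 0.3, 0.6]
outcomes = [1, 0, 1]

# Brier score: mean squared difference between prediction and outcome.
# 0 is perfect; always guessing 0.5 scores 0.25.
brier = sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(predictions)
print(f"Brier score: {brier:.3f}")
```

Lower scores indicate better-calibrated predictions; other scoring rules or aggregation approaches could of course be used instead.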
By concentrating on NBER papers, we increase the likelihood of overlap with journals targeted by the Institute for Replication, thus enhancing the utility of our evaluations in aiding replication efforts.
We are not the only ones working and advocating in this space. For a small tip of the iceberg...
'Dynamic Documents' are projects or papers developed using tools such as R Markdown or Jupyter notebooks (two of the most prominent options).
The salient features and benefits of this approach include:
Integrated data analysis and reporting means the data analysis (as well as any math or simulations) is done and reported in the same space where the results and discussion are presented, with the underlying 'code blocks' hidden from view.
Transparent reporting means you can track exactly what is being reported and how it was constructed:
Making the process a lot less error-prone
Helping readers understand it better (see )
Helping replicators and future researchers build on it
Other advantages of these formats (over PDFs for example) include:
Convenient ‘folding blocks’
Margin comments
and links
Integrating interactive tools
Some quick examples from my own work in progress (but other people have done it much better)
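As a generic illustration (not one of the examples referred to above), a minimal dynamic-document source might look like the sketch below, written in Quarto with a hidden Python chunk; the file name, data, and numbers are placeholders:

````markdown
---
title: "Minimal dynamic document (hypothetical example)"
format: html
jupyter: python3
---

The analysis below runs when the document is rendered; with `echo: false`,
readers see the result but not the code unless they open the source.

```{python}
#| echo: false
# Placeholder analysis: the data file and variable names are illustrative only.
import pandas as pd

df = pd.read_csv("data/example_trial.csv")
effect = (df.loc[df.treated == 1, "outcome"].mean()
          - df.loc[df.treated == 0, "outcome"].mean())
print(f"Estimated treatment effect: {effect:.2f}")
```
````

Because the reported number is computed at render time, the text and the analysis cannot drift apart, and a reader (or replicator) can unfold the chunk to see exactly how the result was produced.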
Claim: Rating and feedback is better than an ‘all-or-nothing’ accept/reject process. Although people like to say “peer review is not binary”, the consequences are.
“Publication in a top journal” is used as a signal and a measuring tool for two major purposes. First, policymakers, journalists, and other researchers look at where a paper is published to assess whether the research is credible and reputable. Second, universities and other institutions use these publication outcomes to guide hiring, tenure, promotion, grants, and other ‘rewards for researchers.’
Did you know?: More often than not, of the "supply of spaces in journals” and the “demand to publish in these journals”. Who is the consumer? Certainly not the perhaps-mythical creature known as the ‘reader’.
In the field of economics, years typically pass between the 'first working paper' that is publicly circulated and the final publication. During that time, the paper may be substantially improved, but it may not be known to nor accepted by practitioners. Meanwhile, it provides little or no career value to the authors.
As a result, we see three major downsides:
Time spent gaming the system:
Researchers and academics spend a tremendous amount of time 'gaming' this process, at the expense of actually doing research.
Randomness in outcomes, unnecessary uncertainty and stress
Wasted feedback, including reviewer's time
There is a lot of pressure, and even bullying, to achieve these “publication outcomes” at the expense of careful methodology.
The current system can sideline deserving work due to unpredictable outcomes. There's no guarantee that the cream will rise to the top, making research careers much more stressful—even driving out more risk-averse researchers—and sometimes encouraging approaches that are detrimental to good science.
However, researchers often focus narrowly on getting the paper published as quickly as possible, in as high-prestige a journal as possible. Unless the review is part of a 'Revise and Resubmit' that the author wants to fulfill, they may not actually put the comments into practice or address them in any way.
Of course, the reviews may be misinformed, mistaken, or may misunderstand aspects of the research. However, if the paper is rejected (even if the reviewer was positive about the paper), the author has no opportunity or incentive to respond to the reviewer. Thus the misinformed reviewer may remain in the dark.
The other side of the coin: a lot of effort is spent trying to curry favor with reviewers, who are often seen as overly fussy, and this effort does not always push in the direction of good science.
The features of The Unjournal, and what the project offers beyond the traditional academic publication methods.
See sections below:
: Shows how The Unjournal's process reduces the traditionally high costs and 'games' associated with standard publication mechanisms.
: The Unjournal promotes research replicability/robustness in line with the RRC agenda.
: The Unjournal aims to enhance research reliability, accessibility, and usefulness through a robust evaluation system, fostering a productive bridge between mainstream and EA-focused researchers.
: Addresses possible information hazards in open research.
: The Unjournal's open evaluation model expedites and enhances research reviews by providing transparent, incentivized feedback and valuable, public metrics.
: Discusses our method of obtaining separate evaluations on various aspects of a research project—methodological, theoretical, and applied—from diverse expert groups, which leads to more comprehensive and insightful feedback.
: Explains the terms 'dynamic documents' and 'living projects' in relation to our model, and how they facilitate continuous growth in research projects.
: Why open dynamic documents (such as ) are better for research than 'PDF prisons', the conventional static PDF format that dominates research.
: Details these 'living projects' and how, under our approach, they can continuously evolve, receive evaluations, and undergo improvements within the same environment.
"Progress notes": We will keep track of important developments here before we incorporate them into the ." Members of the UJ team can add further updates here or in ; we will incorporate changes.
The SFF grant is now 'in our account' (all is public and made transparent on our ). This makes it possible for us to
move forward in filling staff and contractor positions (see below); and
increase evaluator compensation and incentives/rewards (see below).
We are circulating a sharing our news and plans.
Our "Pilot Phase," involving ten papers and roughly 20 evaluations, is almost complete. We just released the evaluation package for .” We are now waiting on one last evaluation, followed by author responses and then "publishing" the final two packages at . (Remember: we publish the evaluations, responses and synthesis; we link the research being evaluated.)
We will make decisions and award our (and possible seminars) and evaluator prizes soon after. The winners will be determined by a consensus of our management team and advisory board (potentially consulting external expertise). The choices will be largely driven by the ratings and predictions given by Unjournal evaluators. After we make the choices, we will make our decision process public and transparent.
We see a lot of value in this task and expect to actually use and credit this work.
Of course, we can't commission the evaluation of every piece of research under the sun (at least not until we get the next grant :) ). Thus, within each area, we need to find the right people to monitor and select the strongest work with the greatest potential for impact, and where Unjournal evaluations can add the most value.
This is a big task and there is a lot of ground to cover. To divide and conquer, we’re partitioning this space (looking at natural divisions between fields, outcomes/causes, and research sources) amongst our management team as well as among what we now call...
focus on a particular area of research, policy, or impactful outcome;
keep track of new or under-considered research with potential for impact;
explain and assess the extent to which The Unjournal can add value by commissioning this research to be evaluated; and
“curate” these research objects: adding them to our database, considering what sorts of evaluators might be needed, and what the evaluators might want to focus on; and
Field specialists will usually also be members of our Advisory Board, and we are encouraging expressions of interest for both together. (However, these don’t need to be linked in every case.)
We are also considering how to set priorities for our evaluators. Should they prioritize:
Giving feedback to authors?
Helping policymakers assess and use the work?
Providing a 'career-relevant benchmark' to improve research processes?
We've chosen (and are in the process of contracting) a strong quantitative meta-scientist and open science advocate for the project: “Aggregation of expert opinion, forecasting, incentives, meta-science.” (Announcement coming soon.)
Update from David Reinstein, Founder and Co-Director
Over the next 18 months, we aim to:
Build awareness: (Relevant) people and organizations should know what The Unjournal is.
Build credibility: The Unjournal must consistently produce insightful, well-informed, and meaningful evaluations and perform effective curation and aggregation of these. The quality of our work should be substantiated and recognized.
Expand our scale and scope: We aim to grow significantly while maintaining the highest standards of quality and credibility. Our loose target is to evaluate around 70 papers and projects over the next 18 months while also producing other valuable outputs and metrics.
No official announcements yet. However, we expect to be hiring (on a part-time contract basis) soon. This may include roles for:
Researchers/meta-scientists: to help find and characterize research to be evaluated, identify and communicate with expert evaluators, and synthesize our "evaluation output"
Communications specialists
Administrative and Operations personnel
Tech support/software developers
We are committed to enhancing our platforms as well as our evaluation and communication templates. We're also exploring strategies to nurture more beneficial evaluations and predictions, potentially in tandem with replication initiatives. A small win: our Mailchimp signup should now be working, and this update should be automatically integrated.
Dworkin's work centers on "improving scientific research, funding, institutions, and incentive structures through experimentation."
Treich's current research agenda largely focuses on the intersection of animal welfare and economics.
To make rigorous research more impactful, and impactful research more rigorous. To foster substantial, credible public evaluation and rating of impactful research, driving change in research in academia and beyond, and informing and influencing policy and philanthropic decisions.
Innovations: We are considering other initiatives and refinements (1) to our evaluation ratings, metrics, and predictions, and how these are aggregated, (2) to foster open science and robustness-replication, and (3) to provide inputs to evidence-based policy decision-making under uncertainty. Stay tuned, and please join the conversation.
Two more Unjournal Evaluation sets are out. Both papers consider randomized controlled trials (RCTs) involving cognitive behavioral therapy (CBT) for low-income households in two African countries (Kenya and Ghana). These papers come to very different conclusions as to the efficacy of this intervention.
More evaluations coming out soon on themes including global health and development, the environment, governance, and social media.
To round out our initial pilot: We're particularly looking to evaluate papers and projects relevant to animal welfare and animal agriculture. Please reach out if you have suggestions.
You can now 'chat' with this page, ask questions, and get answers with links to other parts of the page. To try it out, go to "Search" and choose "Lens."
See our latest post on the EA Forum
More evaluations soon
Two more evaluations 'will be posted soon' (waiting for final author responses).
Developing and discussing tools for aggregating and presenting the evaluators' quantitative judgments
Building our platforms, and considering ways to better format and integrate evaluations
with the original research (e.g., through Hypothes.is collaborative annotation)
into the bibliometric record (through DOIs, etc.)
and with each other.
We're considering collaborations with other compatible initiatives, including...
replication/reproducibility/robustness-checking initiatives,
prediction and replication markets,
and projects involving the elicitation and 'aggregation of expert and stakeholder beliefs' (about both replication and outcomes themselves).
Evaluators: We have a strong pool of evaluators.
Research to evaluate/prizes: We continue to be interested in submitted and suggested work. One area we would like to engage with more: quantitative social science and economics work relevant to animal welfare.
Hope these updates are helpful. Let me know if you have suggestions.
Our theory of change is shown above as a series of possible paths; we indicate what is arguably the most "direct" path in yellow. All of these begin with our setting up, funding, communicating, and incentivizing participation in a strong, open, efficient research evaluation system (in green, at the top). These processes all lead to impactful research being more in-depth, more reliable, more accessible, and more useful, better informing decision-makers and leading to better decisions and outcomes (in green, at the bottom).
Highlighting some of the key paths:
(Yellow) Faster and better feedback on impactful research improves this work and better informs policymakers and philanthropists (yellow path).
(Blue) Our processes and incentives will foster ties between mainstream/prominent/academic/policy researchers and global-priorities or EA-focused researchers. This will improve the rigor, credibility, exposure, and influence of previously "EA niche" work while helping mainstream researchers better understand and incorporate ideas, principles, and methods from the EA and rationalist research communities (such as counterfactual impact, cause-neutrality, reasoning transparency, and so on.)
This process will also nudge mainstream academics towards focusing on impact and global priorities, and towards making their research and outputs more accessible and useable.
(Pink) The Unjournal’s more efficient, open, and flexible processes will become attractive to academics and stakeholders. As we become better at "predicting publication outcomes," we will become a replacement for traditional processes, improving research overall—some of which will be highly impactful research.
Rigorous quantitative and empirical research in economics, business, public policy, and social science has the potential to improve our decision-making and enable a flourishing future. This can be seen in the research frameworks proposed by 80,000 Hours, Open Philanthropy, and The Global Priorities Institute (see ). This research is routinely used by effective altruists working on global priorities or existential risk mitigation. It informs both philanthropic decisions (e.g., those influenced by , whose inputs are largely based on academic research) and . Unfortunately, the academic publication process is notoriously slow; for example, in economics, it often takes years between the first presentation of a research paper and the eventual publication in a peer-reviewed journal. Recent reforms have sped up parts of the process by encouraging researchers to put working papers and preprints online.
However, working papers and preprints often receive at most only a cursory check before publication, and it is up to the reader to judge quality for themselves. Decision-makers and other researchers rely on peer review to judge the work’s credibility. This part remains slow and inefficient. Furthermore, it provides very noisy signals: A paper is typically judged by the "prestige of the journal it lands in" (perhaps after an intricate odyssey across journals), but it is hard to know why it ended up there. Publication success is seen to depend on personal connections, cleverness, strategic submission strategies, good presentation skills, and relevance to the discipline’s methods and theory. These factors are largely irrelevant to whether and how philanthropists and policymakers should consider and act on a paper’s claimed findings. Reviews are kept secret; the public never learns why a paper was deemed worthy of a journal, nor what its strengths and weaknesses were.
We believe that disseminating research sooner—along with measures of its credibility—is better.
We also believe that publicly evaluating its quality before (and in addition to) journal publication will add substantial additional value to the research output, providing:
a quality assessment (by experts in the field) that decision-makers and other researchers can read alongside the preprint, helping these users weigh its strengths and weaknesses and interpret its implications; and
faster feedback to authors focused on improving the rigor and impact of the work.
Various initiatives in the life sciences have already begun reviewing preprints. While economics took the lead in sharing working papers, public evaluation of economics, business, and social science research is rare. The Unjournal is the first initiative to publicly evaluate rapidly-disseminated work from these fields. Our specific priority: research relevant to global priorities.
The Unjournal’s open feedback should also be valuable to the researchers themselves and their research community, catalyzing progress. As the Unjournal Evaluation becomes a valuable outcome in itself, researchers can spend less time "gaming the journal system." Shared public evaluation will provide an important window to other researchers, helping them better understand the relevant cutting-edge concerns. The Unjournal will permit research to be submitted in a wider variety of useful formats (e.g., dynamic documents and notebooks rather than "frozen pdfs"), enabling more useful, replicable content and less time spent formatting papers for particular journals. We will also allow researchers to improve their work in situ and gain updated evaluations, rather than having to spin off new papers. This will make the literature more clear and less cluttered.
We acknowledge the potential for "information hazards" when research methods, tools, and results become more accessible. This is of particular concern in the context of direct physical and biological science research, particularly in (although there is a case that specific ). ML/AI research may also fall into this category. Despite these potential risks, we believe that the fields we plan to cover—detailed above—do not primarily present such concerns.
In cases where our model might be extended to high-risk research—such as new methodologies contributing to terrorism, biological warfare, or uncontrolled AI—the issue of accessibility becomes more complex. We recognize that increasing accessibility in these areas might potentially pose risks.
While we don't expect these concerns to be raised frequently about The Unjournal's activities, we remain committed to supporting thoughtful discussions and risk assessments around these issues.
I (Reinstein) have been in academia for about 20 years. Around the departmental coffee pot and during research conference luncheons, you might expect us to talk about theories, methods, and results. But roughly half of what we talk about is “who got into which journal and how unfair it is”; “which journal should we be submitting our papers to?”; how long are their “turnaround times?”; “how highly rated are these journals?”; and so on. We even exchange on how to
A lot of ‘feedback’ is wasted, including the . Some reviewers write ten-page reports critiquing the paper in great detail, even when they reject the paper. These reports are sometimes very informative and useful for the author and would also be very helpful for the wider public and research community to understand the nature of the debate and issues.
John List (Twitter): "We are resubmitting a revision of our study to a journal and the letter to the editor and reporters is 101 pages, single-spaced. Does it have to be this way?"
of the process and timings at top journals in economics. report an average of over 24 months between initial submission and final acceptance (and nearly three years until publication).
We continue to develop processes and policy around which research to prioritize. For example, we are considering whether we should set targets for different fields, for related outcome "cause categories," and for research sources. This discussion continues among our team and with stakeholders. We intend to open up the discussion further, making it public and bringing in a range of voices. The objective is to develop a framework and a systematic process to make these decisions. See our expanding notes and discussion on
In the meantime, we are moving forward with our post-pilot “pipeline” of research evaluation. Our management team is considering recent prominent and influential working papers from the National Bureau of Economic Research (NBER) and beyond, and we continue to solicit submissions, suggestions, and feedback. We are also reaching out to users of this research (such as NGOs, charity evaluators, and applied research think tanks), asking them to identify research they particularly rely on and are curious about. If you want to join this conversation, we welcome your input.
We are also considering hiring a small number of researchers to each do a one-off (~16 hours) project in “research scoping for evaluation management.” The project is sketched at ; essentially, summarizing a research theme and its relevance, identifying potentially high-value papers in this area, choosing one paper, and curating it for potential Unjournal evaluation.
If you are interested in applying to do this paid project, please let us know .
potentially serve as an evaluation manager for this same work.
Interested in a field specialist role or other involvement in this process? Please fill out this form (about 3–5 minutes).
We discuss this topic , considering how each choice relates to our .
We want to attract the strongest researchers to evaluate work for The Unjournal, and we want to encourage them to do careful, in-depth, useful work. We have raised the base compensation for (on-time, complete) evaluations to $400, and we are setting aside $150 per evaluation for incentives, rewards, and prizes.
Please consider signing up for our evaluator pool (fill out this form).
As part of The Unjournal’s general approach, we keep track of (and keep in contact with) other initiatives in open science, open access, robustness and transparency, and encouraging impactful research. We want to be coordinated. We want to partner with other initiatives and tools where there is overlap, and clearly explain where (and why) we differentiate from other efforts. gives a preliminary breakdown of similar and partially-overlapping initiatives, and tries to catalog the similarities and differences to give a picture of who is doing what, and in what fields.
, Professor of Economics, UC Santa Barbara
, Associate Researcher, INRAE, Member, Toulouse School of Economics (animal welfare agenda)
, Associate Professor, expert judgment, biosciences, applied probability, uncertainty quantification
, Program Lead, Impetus Institute for Meta-science
, Data Scientist, Economist Consultant; PhD University of British Columbia (Economics)
We're working with PubPub to improve our process and interfaces. We plan to take on a to help us work with them closely as they build their platform to be more attractive and useful for The Unjournal and other users.
Our next hiring focus: . We are looking for a strong writer who is comfortable communicating with academics and researchers (particularly in economics, social science, and policy), journalists, policymakers, and philanthropists. Project-based.
We are also expanding our Management Committee and Advisory Board; see .
With the new grant, we now have the opportunity to move forward and really make a difference. I think The Unjournal, along with related initiatives in other fields, should become the place policymakers, grant-makers, and researchers go to consider whether research is reliable and useful. It should be a serious option for researchers looking to get their work evaluated. But how can we start to have a real impact?
I sketch these goals , along with our theory of change, specific steps and approaches we are considering, and some "wish-list wins." Please feel free to add your comments and questions.
While we wait for the new grant funding to come in, we are not sitting on our haunches. Our "pilot phase" is nearing completion. Two more sets of evaluations have been posted on our .
With three more evaluations already in progress, this will yield a total of 10 evaluated papers. Once these are completed, we will decide, announce, and award the recipients of the Impactful Research Prize and the prizes for evaluators, and organize online presentations/discussions (maybe linked to an "award ceremony"?).
of these roles. And to indicate your potential interest and link your CV/webpage.
You can also/alternately register your interest in doing (paid) research evaluation work for The Unjournal, and/or in being part of our advisory board, .
We also plan to expand our ; please reach out if you are interested or can recommend suitable candidates.
We are delighted to welcome Jordan Dworkin (FAS) and Nicolas Treich (INRA/TSE) to our , and Anirudh Tagat (Monk Prayogshala) to our !
Tagat investigates economic decision-making in the Indian context, measuring the social and economic impact of the internet and technology, and a range of other topics in applied economics and behavioral science. He is also in the .
The Unjournal was awarded a grant through the 'S-Process' of the Survival and Flourishing Fund. More details and plans to come. This grant will help enable The Unjournal to expand, innovate, and professionalize. We aim to build the awareness, credibility, scale, and scope of The Unjournal, and the communication, benchmarking, and useful outputs of our work. We want to have a substantial impact, building towards our mission and goals...
Opportunities: We plan to expand our management and advisory board, increase incentives for evaluators and authors, and build our pool of evaluators and participating authors and institutions. Our previous call-to-action (see ) is still relevant if you want to sign up to be part of our evaluation (referee) pool, submit your work for evaluation, etc. (We are likely to put out a further call soon, but all responses will be integrated.)
We have published a total of 12 evaluations and ratings of five papers and projects, as well as three author responses. Four can be found on our PubPub page (most concise list ), and one on our Sciety page (we aim to mirror all content on both pages). All the PubPub content has a DOI, and we are working to get these indexed on Google Scholar and beyond.
The two most recently released evaluations (of Haushofer et al, 2020; and Barker et al, 2022) both surround "" [link: EA Forum post]
See the evaluation summaries and ratings, with linked evaluations and
We are now up to twelve total evaluations of five papers. Most of these are on our PubPub page (we are currently aiming to have all of the work hosted both at PubPub and on Sciety, and gaining DOIs and entering the bibliometric ecosystem). The latest two are on an interesting theme, :
These are part of Unjournal's .
Our new platform (PubPub), enabling DOIs and CrossRef (bibliometrics)
"self-correcting science"
We are pursuing collaborations with replication and robustness initiatives such as the Institute for Replication and the repliCATS project
We are now 'fiscally sponsored' by the Open Collective Foundation; see our page . (Note: this is an administrative arrangement, not a source of funding.)
Our is up...
With our ("Long Term Cost-Effectiveness of Resilient Foods"... Denkenberger et al. Evaluations from Scott Janzwood, Anca Hanea, and Alex Bates, and an author response.
Working on getting six further papers (projects) evaluated, most of which are part of our NBER
We are seeking grant funding for our continued operation and expansion (see below). We're appealing to funders interested in Open Science and in impactful research.
We are now under the 'fiscal sponsorship' of the Open Collective Foundation (this does not entail funding, only a legal and administrative home). We are postponing the deadline for judging the Impactful Research Prize and the prizes for evaluators. Submission of papers and the processing of these have been somewhat slower than expected.
EA Forum: "recent post and AMA (answering questions about the Unjournal's progress, plans, and relation to effective-altruism-relevant research
March 9-10: David Reinstein will present at the , session on "Translating Open Science Best Practices to Non-academic Settings". See . David will discuss The Unjournal for part of this session.
with interest and knowledge of key impact-relevant areas (see ; e.g., global health and development),
Recall: we pay at least $250 per evaluation, typically more in net (around $350), and we are looking to increase this compensation further. Please fill out this form (about 3–5 min) if you are interested.
Some of the paths will take longer than others; in particular, it will be hard to get academia to change because of entrenched systems and a collective action problem. We discuss how we hope to overcome this. In particular, we can provide leadership and take risks that academics won’t take themselves:
Status: 7 Feb 2023
Volunteer pool of 80+ reviewers (see Airtable), responding to How to get involved and other outreach
For our initial 3 focal pilot papers, we have a total of 8 completed evaluations (2 of these are completed subject to some final checks).
For the remaining 7 pilot papers, we have roughly 6 agreed evaluators so far (we aim for 2-3 per paper)
You can now ask questions of this GitBook using a chatbot! Go to the search bar and choose 'ask gitbook'.
We organize and fund public, journal-independent feedback, rating, and evaluation of academic work. We focus on work that is highly relevant to global priorities, especially in economics, social science, and impact evaluation. We encourage better research by making it easier for researchers to get credible feedback. See here for more details.
No. The Unjournal does not charge any fees. In fact, unlike most traditional journals, we compensate evaluators for their time, and award prizes for strong work.
We are a nonprofit organization. We do not charge fees for access to our evaluations, and work to make them as open as possible. In future, we may consider sliding-scale fees for people submitting their work for Unjournal evaluation. If so, this would simply be to cover our costs and compensate evaluators. We are a nonprofit and will stay that way.
No. We do not publish research. We just commission public evaluation and rating of relevant research that is already publicly available. Having your work evaluated in The Unjournal does not prevent you from submitting it to any publication.
We have grants from philanthropists and organizations who are interested in our priority research areas. We hope that our work will provide enough value to justify further direct funding. We may also seek funding from governments and universities supporting the open-access agenda.
Sure! Please contact us at theunjournal@gmail.com.
Should research projects be improved and updated 'in the same place', rather than with 'extension papers'?
Small changes and fixes: The current system makes it difficult to make minor updates – even obvious corrections – to published papers. This makes these papers less useful and less readable. If you find an error in your own published work, there is also little incentive to note it and ask for a correction, even if this were possible.
In contrast, a 'living project' could be corrected and updated in situ. If future and continued evaluations matter, authors will have an incentive to do so.
Lack of incentives for updates and extensions: If academic researchers see major ways to improve and build on their past work, these can be hard to get published and get credit for. The academic system rewards novelty and innovation, and top journals are reluctant to publish 'the second paper' on a topic. As this would count as 'a second publication' (for tenure etc.), authors may be accused of double-dipping, and journals and editors may punish them for this.
Clutter and confusion in the literature: Because of the above, researchers often try to spin an improvement to a previous paper as something new and different. They sometimes publish a range of papers getting at similar questions with similar methods, spread across different journals. This makes it hard for other researchers and readers to know which paper they should read.
In contrast, a 'living project' can keep these in one place. The author can lay out different chapters and sections in ways that make the full work most useful.
But we recognize there may also be downsides to 'all extensions and updates in a single place'...
Some discussion follows. Note that the Unjournal enables this but does not require it.
By “Dynamic Documents” I mean papers/projects built with Quarto, R Markdown, or Jupyter notebooks (the most prominent tools) that do and report the data analysis (as well as math/simulations) in the same space where the results and discussion are presented (with ‘code blocks’ hidden).
I consider some of the benefits of this format, particularly for EA-aligned organizations like Open Philanthropy: Benefits of Dynamic Documents
“Continually update a project” rather than start a “new extension paper” when you see what you could have done better.
The main idea is that each version is given a specific time stamp, and that is the object that is reviewed and cited. This is more or less already the case when we cite working papers/drafts/mimeos/preprints.
See #living-kaizend-research-projects, further discussing the potential benefits.
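As a rough sketch of the idea (hypothetical names and DOIs, not an actual Unjournal data structure), each time-stamped version would be the fixed object that gets evaluated and cited:

```python
# Illustrative only: a 'living project' with time-stamped, citable versions.
living_project = {
    "title": "Example living research project",
    "versions": [
        {"version": "v1.0", "timestamp": "2023-01-15", "doi": "10.0000/example.v1",
         "evaluations": ["evaluation-1", "evaluation-2"]},
        {"version": "v1.1", "timestamp": "2023-09-02", "doi": "10.0000/example.v2",
         "evaluations": []},  # updated in situ; can receive fresh evaluations
    ],
}

latest = living_project["versions"][-1]
print(f"Cite as: {living_project['title']} ({latest['version']}, DOI {latest['doi']})")
```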
Journal-independent review allows work to be rated separately in different areas: theoretical rigor and innovation, empirical methods, policy relevance, and so on, with separate ratings in each category by experts in that area. As a researcher in the current system, I cannot both submit my paper and get public evaluation from (for example) JET and the Journal of Development Economics for a paper engaging both areas.
The Unjournal, and journal-independent evaluation, can enable this through
commissioning a range of evaluators with expertise in distinct areas, and making this expertise known in the public evaluations;
asking specifically for multiple dimensions of quantitative (and descriptive) feedback and ratings (see especially #metrics-overall-assessment-categories under our Guidelines for evaluators); and
allowing authors to gain evaluation in particular areas in addition to the implicit value of publication in specific traditional field journals.
Traditional peer review is a closed process, with reviewers' and editors' comments and recommendations hidden from the public.
In contrast, Unjournal evaluations (along with authors' responses and evaluation manager summaries) are made public and easily accessible. We give each of these a separate DOI and work to make sure each enters the literature and bibliometric databases. We aim further to curate these, making it easy to see the evaluators' comments in the context of the research project (e.g., with sidebar/hover annotation).
Open evaluation is more useful:
to other researchers and students (especially those early in their careers). Seeing the dialogue helps them digest the research itself and understand its relationship to the wider field. It helps them understand the strengths and weaknesses of the methods and approaches used, and how much agreement there is over these choices. It gives an inside perspective on how evaluation works.
to people using the research, providing further perspectives on its value, strengths and weaknesses, implications, and applications.
Publicly posting evaluations and responses may also lead to higher quality and more reliability. Evaluators can choose whether or not they wish to remain anonymous; there are pros and cons to each choice, but in either case, the fact that all the content is public may encourage evaluators to more fully and transparently express their reasoning and justifications. (And where they fail to do so, readers of the evaluation can take this into account.)
The fact that we are asking for evaluations and ratings of all the projects in our system—and not using "accept/reject"—should also drive more careful and comprehensive evaluation and feedback. At a traditional top-ranked journal, a reviewer may limit themselves to a few vague comments implying that the paper is "not interesting or strong enough to merit publication." This would not make sense within the context of The Unjournal.
We do not "accept or reject" papers; we are evaluating research, not "publishing" it. But then, how do other researchers and students know whether the research is worth reading? How can policymakers know whether to trust it? How can it help a researcher advance their career? How can grantmakers and organizations know whether to fund more of this research?
As an alternative to the traditional measure of worth—asking, "what tier did a paper get published in?"—The Unjournal provides metrics: We ask evaluators to provide a specific set of ratings and predictions about aspects of the research, as well as aggregate measures. We make these public. We aim to synthesize and analyze these ratings in useful ways, as well as make this quantitative data accessible to meta-science researchers, meta-analysts, and tool builders.
Feel free to check out our ratings metrics and prediction metrics (these are our pilot metrics; we aim to refine them).
These metrics are separated into different categories designed to help researchers, readers, and users understand things like:
How much can one believe the results stated by the authors (and why)?
How relevant are these results for particular real-world choices and considerations?
Is the paper written in a way that is clear and readable?
How much does it advance our current knowledge?
We also request overall ratings and predictions of the credibility, importance, and usefulness of the work, which help benchmark these evaluations against each other and against the current "journal tier" system.
Even here, The Unjournal's metrics are precise in a way that "journal publication tiers" are not. There is no agreed-upon metric of exactly how journals rank (e.g., within economics' "top-5" or "top field journals"). More importantly, there is no clear measure of the relative quality and trustworthiness of a paper within a particular journal.
In addition, there are issues of lobbying, career concerns, and timing, discussed elsewhere, which make the "tiers" system less reliable. An outsider doesn't know, for example:
Was a paper published in a top journal because of a special relationship and connections? Was an editor trying to push a particular agenda?
Was it published in a lower-ranked journal because the author needed to get some points quickly to fill their CV for an upcoming tenure decision?
In contrast, The Unjournal requires evaluators to give specific, precise, quantified ratings and predictions (along with an explicit metric of the evaluator's uncertainty over these appraisals).
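As a rough illustration of what 'quantified ratings with explicit uncertainty' could look like (this is a sketch, not The Unjournal's actual schema or aggregation method), consider:

```python
# Illustrative sketch only: not The Unjournal's actual data format or aggregation rule.
from statistics import mean

# Assumed structure: each evaluator gives a 0-100 midpoint rating on some
# dimension, plus a 90% credible interval expressing their uncertainty.
ratings = [
    {"evaluator": "Anonymous 1", "midpoint": 72, "ci_90": (60, 82)},
    {"evaluator": "Anonymous 2", "midpoint": 65, "ci_90": (50, 78)},
]

# A naive aggregation: simple averages of midpoints and interval bounds.
agg_mid = mean(r["midpoint"] for r in ratings)
agg_lo = mean(r["ci_90"][0] for r in ratings)
agg_hi = mean(r["ci_90"][1] for r in ratings)

print(f"Aggregate rating: {agg_mid:.1f} "
      f"(naively averaged 90% interval: {agg_lo:.0f}-{agg_hi:.0f})")
```

In practice, more principled aggregation methods (and comparisons across evaluators and papers) become possible precisely because the underlying judgments are quantified.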
Of course, our systems will not solve all problems associated with reviews and evaluations: power dynamics, human weaknesses, and limited resources will remain. But we hope our approach moves in the right direction.
See also Mapping evaluation workflow.
We want to reduce the time between when research is done (and a paper or other research format is released) and when other people (academics, policymakers, journalists, etc.) have a credible measure of "how much to believe the results" and "how useful this research is."
Here's how The Unjournal can do this.
Early evaluation: We will evaluate potentially impactful research soon after it is released (as a working paper, preprint, etc.). We will encourage authors to submit their work for our evaluation, and we will directly commission the evaluation of work from the highest-prestige authors.
We will pay evaluators with further incentives for timeliness (as well as carefulness, thoroughness, communication, and insight). Evidence suggests that these incentives for promptness and other qualities are likely to work.
Public evaluations and ratings: Rather than waiting years to see "what tier journal a paper lands in," the public can simply consult The Unjournal to find credible evaluations and ratings.
How does The Unjournal's evaluation process work?
We generally seek evaluators (aka reviewers or referees) with research interests in your area and with complementary expertise. You, the author, can suggest areas that you want to get feedback on.
The evaluators write detailed and helpful evaluations, and submit them either "signed" or anonymously. They provide quantitative ratings on several dimensions, such as methods, relevance, and communication. They predict what journal tier the research will be published in, and what tier it should be published in. Here are the .
These evaluations and ratings are typically made public (see ), but you will have the right to respond before or after these are posted.
To consider your research we only need a link to a publicly hosted version of your work, ideally with a DOI. We will not "publish" your paper. The fact that we are handling your paper will not limit you in any way. You can submit it to any journal before, during, or after the process.
You can request a conditional embargo by emailing us at , or via the submission form. Please explain what sort of embargo you are asking for, and why. By default, we would like Unjournal evaluations to be made public promptly. However, we may make exceptions in special circumstances, particularly for very early-career researchers.
If there is an early-career researcher on the authorship team, we may allow authors to "embargo" the publication of the evaluation until a later date. Evaluators (referees) will be informed of this. This date can be contingent, but it should not be indefinite. For example, we might grant an embargo that lasts until after a PhD/postdoc’s upcoming job market or until after publication in a mainstream journal, with a hard maximum of 14 months. (Of course, embargoes can be ended early at the request of the authors.)
In exceptional circumstances, we may consider granting a longer embargo.
We may ask for some of the below, but these are mainly optional. We aim to make the process very light touch for authors.
A link to a non-paywalled, hosted version of your work, ideally one that can be assigned a DOI.
Responses to a short set of questions about your work (via our submission form).
We may ask for a link to data and code, if possible. Note that our project is not principally about replication, and we are not insisting on this.
We also allow you to respond to evaluations, and we give your response its own DOI.
Substantive feedback will help you improve your research. Substantive and useful feedback is often very hard to get, especially for young scholars. It's hard to get anyone to read your paper – we can help!
The chance to publicly respond to criticism and correct misunderstandings.
A connection to the EA/Global Priorities communities and the Open Science movement. This may lead to grant opportunities, open up new ambitious projects, and attract strong PhD students to your research groups.
A reputation as an early adopter and innovator in open science.
Undervalued or updated work: Your paper may have been "under-published". Perhaps there is only a limited set of prestigious journals in your field. You may now see ways you could improve the research. The Unjournal can help; we will consider post-publication evaluation. (See the fold below.)
There are risks and rewards to any activity, of course. Here we consider some risks you may weigh against the benefits mentioned above.
Exclusivity
Public negative feedback
However, The Unjournal is not exclusive. Having your paper reviewed and evaluated in The Unjournal will not limit your options; you can still submit your work to traditional journals.
Our evaluations are public. While there has been some movement towards open review, this is still not standard. Typically when you submit your paper, reviews are private. With The Unjournal, you might get public negative evaluations.
We think this is an acceptable risk. Most academics expect that opinions will differ about a piece of work, and everyone has received negative reviews. Thus, getting public feedback — in The Unjournal or elsewhere — should not particularly harm you or your research project.
Of particular interest: are we looking at the most recent version of the paper/project, or is there a further revised version we should be considering instead?
After the evaluations have been completed, the authors are given two weeks to respond, and have their response posted along with our 'package'. (Authors can also respond after we have posted the evaluations, and we will put their response in the same 'package', with a DOI etc.)
Once we receive unsolicited work from an author or authors, we keep it in our database and have our team decide on prioritization. If your paper is prioritized for evaluation, The Unjournal will notify you.
At present, we do not have a system to fully share the status of author submissions with authors. We hope to have one in place in the near future.
You can still submit your work to any traditional journal.
The Unjournal aims to evaluate the most recent version of a paper. We reach out to authors to ensure we have the latest version at the start of the evaluation process.
If substantial updates are made to a paper during the evaluation process, authors are encouraged to share the updated version. We then inform our evaluators and ask if they wish to revise their comments.
If the evaluators can't or don't respond, we will note this and still link the newest version.
Authors are encouraged to respond to evaluations by highlighting major revisions made to their paper, especially those relevant to the evaluators' critiques. If authors are not ready to respond to evaluations, we can post a placeholder response indicating that responses and/or a revised version of the paper are forthcoming.
Re-evaluation: If authors and evaluators are willing to engage, The Unjournal is open to re-evaluating a revised version of a paper after publishing the evaluations of the initial version.
Authors' responses could bring several benefits...
Personally: a chance for the authors to correct misconceptions and explain their approach and planned next steps. If the authors spot any clear errors or omissions, we give evaluators a chance to adjust their reports in response. The authors' response can also help others (including the evaluators, as well as journal referees and grant funders) form a more accurate and positive view of the research.
For research users: a way to get an informed, balanced perspective on how to judge the work.
For other researchers, to better understand the methodological issues and approaches. This can serve to start a public dialogue and discussion to help build the field and research agenda. Ultimately, we aim to facilitate a back-and-forth between authors, evaluators, and others.
Examples: we have received several detailed and informative author responses.
A well-written author response would (ideally) have a clear narrative and group responses into themes.
Try to have a positive tone (no personal attacks etc.) but avoid too much formality, politeness, or flattery. Revise-and-resubmit responses at standard journals sometimes begin each paragraph with "thank you for this excellent suggestion". Feel free to skip this; we want to focus on the substance.
See the links below for the current policies of The Unjournal, accompanied by discussion and including templates for managers and editors.
People and organizations submit their own research or suggest research they believe may be high-impact. The Unjournal also directly monitors key sources of research and research agendas. Our team then systematically prioritizes this research for evaluation. See the link below for further details.
We choose an evaluation manager for each research paper or project. They commission and compensate expert evaluators to rate and discuss the research, following our evaluation template and guidelines. The original research authors are given a chance to publicly respond before we post these evaluations. See the link below for further details.
We make all of this evaluation work public on our PubPub community, along with an evaluation summary. We create DOIs for each element and submit this work to scholarly search engines. We also present a summary and analysis of our evaluation data.
We outline some further details in the link below.
See the link below for a full 'flowchart' map of our evaluation workflow
The biggest personal gains for authors:
Ratings are markers of credibility for your work that could help your career. Remember, we aim to benchmark our ratings against traditional journal tiers.
Increasing the visibility of your work, which may lead to additional citations. We publicize our evaluations and the original papers on our social media feed, and we may forward selected work to interested research users and organizations.
Prizes: You may win an Unjournal prize, which will be financial as well as reputational.
Innovative formats: Journals typically require you to submit a LaTeX or MS Word file, and to use their fussy formats and styles. You may want to use tools, such as dynamic documents and notebooks, that integrate your code and data, allow you to present dynamic content, and enhance reproducibility. The Unjournal welcomes such formats, and we can evaluate research in virtually any format.
Improvements: You may have substantially improved and extended the work since its publication. The current journal system can only handle this if you claim it is a 'brand new paper'. We aim to fix this.
Traditional journals have their own policies, and perhaps they might enforce these more strongly if they fear competition from The Unjournal.
Nonetheless, we are planning some exceptions for early-career researchers (see the embargo discussion above).
Unjournal evaluations should be seen as signals of quality. Like all such signals, they are noisy. But submitting to The Unjournal shows you are confident in your work, and not afraid of public feedback.
For fancy and formal ways of saying this and related concepts, see Rational Expectations and the Law of Iterated Expectations.
Within our direct evaluation track, The Unjournal directly chooses papers (from prominent archives, well-established researchers, etc.) to evaluate. We don't request authors' permission here.
As you can see in our workflow map, on this track, we engage with authors at (at least) two points:
Informing the authors that the evaluation will take place, requesting they engage, and giving them the opportunity to request an embargo or specific types of feedback.
We share evaluations with the authors and give them a chance to respond before we make the evaluations public (and again afterward, at any point). We add these responses to our evaluation package. The evaluation manager's (public) report and our further communications incorporate the paper, the evaluations, and the authors' responses.
Evaluations may raise substantive doubts and questions, make some specific suggestions, and ask about (e.g.) data, context, or assumptions. There's no need to respond to every evaluator point; respond only where you have something substantive to add: clarifying doubts, explaining the justification for your particular choices, and giving your thoughts on the suggestions (which will you incorporate, or not, and why?).
Research can be "submitted" by authors (here) or "suggested" by others. For a walk-through on suggesting research, see this video example.
There are two main paths for making suggestions: through our survey form or through Airtable.
Anyone can suggest research using the survey form at https://bit.ly/ujsuggestr. (Note, if you want to "submit your own research," go to bit.ly/ujsubmitr.) Please follow the steps below:
Begin by reviewing The Unjournal's guidelines on What research to target to get a sense of the research we cover and our priorities. Look for high-quality research that 1) falls within our focus areas and 2) would benefit from (further) evaluation.
When in doubt, we encourage you to suggest the research anyway.
Navigate to The Unjournal's Suggest Research Survey Form. Most of the fields here are optional. The fields ask for the following information:
Who you are: Let us know who is making the suggestion (you can also choose to stay anonymous).
If you leave your contact information, you will be eligible for financial "bounties" for strong suggestions.
If you are already a member of The Unjournal's team, additional fields will appear for you to link your suggestion to your profile in the Unjournal's database.
Research Label: Provide a short, descriptive label for the research you are suggesting. This helps The Unjournal quickly identify the topic at a glance.
Research Importance: Explain why the research is important, its potential impact, and any specific areas that require thorough evaluation.
Research Link: Include a direct URL to the research paper. The Unjournal prefers research that is publicly hosted, such as in a working paper archive or on a personal website.
Peer Review Status: Inform about the peer review status of the research, whether it's unpublished, published without clear peer review, or published in a peer-reviewed journal.
"Rate the relevance": This represents your best-guess at how relevant this work is for The Unjournal to evaluate, as a percentile relative to other work we are considering.
Research Classification: Choose categories that best describe the research. This helps The Unjournal sort and prioritize suggestions.
Field of Interest: Select the outcome or field of interest that the research addresses, such as global health in low-income countries.
Complete all the required fields and submit your suggestion. The Unjournal team will review your submission and consider it for future evaluation. You can reach out to us at contact@unjournal.org with any questions or concerns.
People on our team may find it more useful to suggest research to The Unjournal directly via the Airtable. See this document for a guide to this. (Please request document permission to access this explanation.)
Aside on setting the prioritization ratings: In making your subjective prioritization rating, please consider "What percentile do you think this paper (or project) is relative to the others in our database, in terms of 'relevance for The Unjournal to evaluate'?" (Note this is a redefinition; we previously considered these as probabilities.) We roughly plan to commission the evaluation of about 1 in 5 papers in the database, the 'top 20%' according to these percentiles. Please don't consider the "publication status" or the "author's propensity to engage" in this rating. We will consider those as separate criteria.
Please don't enter only the papers you think are 'very relevant'; please enter all research that you have spent any substantial time considering (more than a couple of minutes). If we all do this, our percentile ratings should be approximately uniformly distributed, i.e., evenly spread over the 1-100% range.
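As a minimal, unofficial sketch of what "evenly spread" means in practice (the ratings below are entirely hypothetical), a suggester could bucket their own ratings and check that no quintile dominates:

```python
# Hypothetical percentile ratings a suggester has assigned to recent suggestions.
my_ratings = [88, 35, 62, 15, 97, 50, 73, 22, 41, 68]

# Count ratings per quintile: 1-20, 21-40, 41-60, 61-80, 81-100.
quintile_counts = [0] * 5
for r in my_ratings:
    quintile_counts[min((r - 1) // 20, 4)] += 1

print("Ratings per quintile:", quintile_counts)
# Roughly equal counts indicate an even spread; a pile-up in the top quintile
# suggests you are only entering work you already consider 'very relevant'.
```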
This discussion is a work-in-progress
We are targeting global priorities-relevant research...
With the potential for impact, and with the potential for Unjournal evaluations to have an impact (see our high-level considerations and our prioritization ratings discussions).
Our focus is quantitative work that informs global priorities (see linked discussion), especially in economics, policy, and social science, informing our Theory of Change.
We give a data presentation of the work we have already covered and the work we are prioritizing here, which will be continually updated.
But what does this mean in practice? What specific research fields, topics, and approaches are we likely to classify as 'relevant to evaluate'?
We give some lists and annotated examples below.
As of January 2024 The Unjournal focuses on ...
Research where the fundamental question being investigated involves human behavior and beliefs and the consequences of these. This may involve markets, production processes, economic constraints, social interactions, technology, the 'market of ideas', individual psychology, government processes, and more. However, the main research question should not revolve around issues outside of human behavior, such as physical science, biology, or computer science and engineering. These areas are out of our scope (at least for now).
Research that is fundamentally quantitative and uses quantitative or formal methods. It will generally involve or consider measurable inputs, choices, and outcomes; specific categorical or quantitative questions; analytical and mathematical reasoning; hypothesis testing and/or belief updating, etc.
Research that targets and addresses a single specific question or goal, or a small cluster of these. It should not mainly be a broad discussion and overview of other research or conceptual issues.
This generally involves the following academic fields:
Economics
Applied Statistics (and some other applied math)
Psychology
Political Science
Other quantitative social science fields (perhaps Sociology)
Applied "business school" fields: finance, accounting, operations, etc.
Applied "policy and impact evaluation" fields
Life science/medicine where it targets human behavior/social science
These discipline/field boundaries are not strict; they may adapt as we grow
These were chosen in light of two main factors:
Our founder and our team are most comfortable assessing and managing the consideration of research in these areas.
These fields seem to be particularly amenable to, and able to benefit from, our journal-independent evaluation approach. Other fields, such as biology, are already being 'served' by strong initiatives like Peer Community In.
To do: We will give and explain some examples here
The Unjournal's mission is to prioritize
research with the strongest potential for a positive impact on global welfare
where public evaluation of this research will have the greatest impact
Given this broad goal, we consider research into any cause, topic, or outcome, as long as the research involves fields, methods, and approaches within our domain (see above), and as long as the work meets our other requirements (e.g., research must be publicly shared without a paywall).
While we don't have rigid boundaries, we are nonetheless focusing on certain areas:
(As of Jan. 2024) we have mainly commissioned evaluations of work involving development economics and health-related outcomes and interventions in low-and middle-income countries.
As well as research involving
Environmental economics, conservation, harm to human health
The social impact of AI and emerging technologies
Economics, welfare, and governance
Catastrophic risks; predicting and responding to these risks
The economics of innovation; scientific progress and meta-science
The economics of health, happiness, and wellbeing
We are currently prioritizing further work involving
Psychology, behavioral science, and attitudes: the spread of misinformation; other-regarding preferences and behavior; moral circles
Animal welfare: markets, attitudes
Methodological work informing high-impact research (e.g., methods for impact evaluation)
We are also considering prioritizing work involving
AI governance and safety
Quantitative political science (voting, lobbying, attitudes)
Political risks (including authoritarian governments and war and conflict)
Institutional decisionmaking and policymaking
Long-term growth and trends; the long-term future of civilization; forecasting
To do: We will give and explain some examples here
This page is a work-in-progress
15 Dec 2023: Our main current process involves
Submitted and (internally/externally) suggested research
Prioritization ratings and discussion by Unjournal field specialists
Feedback from field specialist area teams
A final decision by the management team, guided by the above
See this doc (also embedded below) for more details of the proposed process.
As noted in Process: prioritizing research, we ask people who suggest research to provide a numerical 0-100 rating:
We also ask people within our team to act as 'assessors' to give second and third opinions on this. This 'prioritization rating' is one of the criteria we will use to determine whether to commission research to be evaluated (along with author engagement, publication status, our capacity and expertise, etc.) Again, see the previous page for the current process.
We are working on a set of notes on this, fleshing this out and giving specific examples. At the moment this is available to members of our team only (ask for access to "Guidelines for prioritization ratings (internal)"). We aim to share a version of this publicly once it converges, and once we can get rid of arbitrary sensitive examples.
I. This is not the evaluation itself. It is not an evaluation of the paper's merit per se:
Influential work, and prestigious work in influential areas, may be highly prioritized regardless of rigor and quality.
The prioritization rating might consider quality for work that seems potentially impactful, which does not seem particularly prestigious or influential. Here aspects like writing clarity, methodological rigor, etc., might put it 'over the bar'. However, even here these will tend to be based on rapid and shallow assessments, and should not be seen as meaningful evaluations of research merit.
II. These ratings will be considered along with the discussion by the field team and the management. Thus, it is helpful if you give a justification and explanation for your stated rating.
Define/consider the following ‘attributes’ of a piece of research:
Global decision-relevance/VOI: Is this research decision-relevant to high-value choices and considerations that are important for global priorities and global welfare?
Prestige/prominence: Is the research already prominent/valued (esp. in academia), highly cited, reported on, etc?
Influence: Is the work already influencing important real-world decisions and considerations?
Obviously, these are not binary factors; there is a continuum for each. But for the sake of illustration, consider the following flowcharts.
If the flowcharts do not render, please refresh your browser. You may have to refresh twice.
"Fully baked": Sometimes prominent researchers release work (e.g., on NBER) that is not particularly rigorous or involved, which may have been put together quickly. This might be research that links to a conference they are presenting at, to their teaching, or to specific funding or consulting. It may be survey/summary work, perhaps meant for less technical audiences. The Unjournal tends not to prioritize such work, or at least not consider it in the same "prestigious" basket (although there will be exceptions). In the flowchart above, we contrast this with their "fully-baked" work.
Decision-relevant, prestigious work: Suppose the research is both ‘globally decision-relevant’ and prominent. Here, if the research is in our domain, we probably want to have it publicly evaluated. This is basically the case regardless of its apparent methodological strength. This is particularly true if it has been recently made public (as a working paper), if it has not yet been published in a highly-respected peer-reviewed journal, and if there are non-straightforward methodological issues involved.
Prestigious work that seems less globally-relevant: We generally will not prioritize this work unless it adds to our mission in other ways (see, e.g., our ‘sustainability’ and ‘credibility’ goals here). In particular we will prioritize such research more if:
It is presented in innovative, transparent formats (e.g., dynamic documents/open notebooks, sharing code and data)
The research indirectly supports more globally-relevant research, e.g., through…
Providing methodological tools that are relevant to that ‘higher-value’ work
Drawing attention to neglected high-priority research fields (e.g., animal welfare)
(If the flowchart below does not render, please refresh your browser; you may have to refresh twice.)
Decision-relevant, influential (but less prestigious) work: E.g., suppose this research might be cited by a major philanthropic organization as guiding its decision-making, but the researchers may not have strong academic credentials or a track record. Again, if this research is in our domain, we probably want to have it publicly evaluated. However, depending on the rigor of the work and the way it is written, we may want to explicitly class this in our ‘non-academic/policy’ stream.
Decision-relevant, less prestigious, less-influential work: What about for less-prominent work with fewer academic accolades that is not yet having an influence, but nonetheless seems to be globally decision-relevant? Here, our evaluations seem less likely to have an influence unless the work seems potentially strong, implying that our evaluations, rating, and feedback could boost potentially valuable neglected work. Here, our prioritization rating might focus more on our initial impressions of things like …
Methodological strength (this is a big one!)
Rigorous logic and communication
Open science and robust approaches
Engagement with real-world policy considerations
Again: the prioritization process is not meant to be an evaluation of the work in itself. It’s OK to do this in a fairly shallow way.
In future, we may want to put together a loose set of methodological ‘suggestive guidelines’ for work in different fields and areas, without being too rigid or prescriptive. (To do: we can draw from some existing frameworks for this [ref].)
See the sections below:
An overview of what we are asking; payment and recognition details.
The Unjournal's evaluation guidelines, considering our priorities and criteria, the metrics we ask for, and how these are considered.
Other sections and subsections provide further resources, consider future initiatives, and discuss our rationales.
David Reinstein, 28 Mar 2024: I am proposing the following policies and approaches for our "Applied & Policy Stream". We will move forward with these for now on a trial basis, but they may be adjusted. Please offer comments and ask questions; you can flag these to contact@unjournal.org.
Much of the most impactful research is not aimed at academic audiences and may never be submitted to academic journals. It is written in formats that are very different from traditional academic outputs, and cannot be easily judged by academics using the same standards. Nonetheless, this work may use technical approaches developed in academia, making it important to gain expert feedback and evaluation.
The Unjournal can help here. However, to avoid confusion, we want to keep this clearly distinct from our main agenda, which focuses on impactful research aimed at academic audiences.
Thus, we are trialing an "Applied & Policy Stream", which will be clearly labeled as separate from our main stream. This may constitute roughly 10-15% of the work that we cover. Below, we refer to this as the "policy stream" for brevity.
Our considerations for prioritizing this work are generally the same as for our academic stream – is it in the fields that we are focused on, using approaches that enable meaningful evaluation and rating? Is it already having impact (e.g., influencing grant funding in globally-important areas)? Does it have the potential for impact, and if so, is it high-quality enough that we should consider boosting its signal?
We will particularly prioritize policy and applied work that uses technical methods that need evaluation by research experts, often academics.
This could include the strongest work published on the EA Forum, as well as a range of further applied research from EA/GP/LT linked organizations such as GPI, Rethink Priorities, Open Philanthropy, FLI, HLI, Faunalytics, etc., as well as EA-adjacent organizations and relevant government white papers.
Ratings/metrics: As in the academic stream, this work will be evaluated for its credibility, usefulness, communication/logic, etc. However, we are not seeking to have this work assessed by the standards of academia in a way that yields a comparison to traditional journal tiers. Evaluators: please ignore those parts of our interface; if you are unsure whether a metric is relevant, feel free to ask.
Evaluator selection, number, pay: Generally, we want to continue to select academic research experts or non-academic researchers with strong academic and methodological backgrounds to do these evaluations. We particularly want to bring expert scrutiny, especially from academia, to work that is not normally scrutinized by such experts.
The compensation may be flexible as well; in some cases the work may be more involved than for the academic stream and in some cases less involved. As a starting point we will begin by offering the same compensation as for the academic stream.
Careful flagging and signposting: To preserve the reputation of our academic-stream evaluations we need to make it clear, wherever people might see this work, that it is not being evaluated by the same standards as the academic stream and doesn't “count” towards those metrics.
(for pilot and beyond)
Our focus is quantitative work that informs global priorities, especially in economics, policy, and social science. We want to see better research leading to better outcomes in the real world (see our 'Theory of Change').
See our earlier discussion in the public call and EA Forum post.
To reach these goals, we need to select "the right research" for evaluation. We want to choose papers and projects that are highly relevant, methodologically promising, and that will benefit substantially from our evaluation work. We need to optimize how we select research so that our efforts remain mission-focused and useful. We also want to make our process transparent and fair. To do this, we are building a coherent set of criteria and goals, and a specific approach to guide this process. We explore several dimensions of these criteria below.
Management access only: general discussion of prioritization is in a Gdoc; private discussion of specific papers is in Airtable and linked documents. We incorporate some of this discussion below.
When considering a piece of research to decide whether to commission it to be evaluated, we can start by looking at its general relevance as well as the value of evaluating and rating it.
Our prioritization of a paper for evaluation should not be seen as an assessment of its quality, nor of its 'vulnerability'. Furthermore, this prioritization is specific and far less intensive than a full evaluation.
Why is it relevant and worth engaging with?
We consider (and prioritize) the importance of the research to global priorities; its relevance to crucial decisions; the attention it is getting; the influence it is having; its direct relevance to the real world; and the potential value of the research for advancing other impactful work. We de-prioritize work that has already been credibly and publicly evaluated. We also consider the fit of the research with our scope (social science, etc.), and the likelihood that we can commission experts to meaningfully evaluate it. As noted below, some 'instrumental goals' (sustainability, credibility, driving change, ...) also play a role in our choices.
Some features we value, that might raise the probability we consider a paper or project include the commitment and contribution to open science, the authors' engagement with our process, and the logic, communication, and transparent reasoning of the work. However, if a prominent research paper is within our scope and seems to have a strong potential for impact, we will prioritize it highly, whether or not it has these qualities.
2. Why does it need (more) evaluation, and what are some key issues and claims to vet?
We ask the people who suggest particular research, and experts in the field:
What are (some of) the authors’ key/important claims that are worth evaluating?
What aspects of the evidence, argumentation, methods, and interpretation, are you unsure about?
What particular data, code, proofs, and arguments would you like to see vetted? If it has already been peer-reviewed in some way, why do you think more review is needed?
As we weigh research to prioritize for evaluation, we need to balance directly having a positive impact against building our ability to have an impact in the future.
Importance
What is the direct impact potential of the research?
This is a massive question many have tried to address (see sketches and links below). We respond to uncertainty around this question in several ways, including:
Consulting a range of sources, not only EA-linked sources.
Scoping what other sorts of work are representative inputs to GP-relevant work.
Getting a selection of seminal GP publications; looking back to see what they are citing, and categorizing by journal/field/keywords/etc.
Neglectedness
Where is the current journal system failing GP-relevant work the most . . . in ways we can address?
Tractability
“Evaluability” of research: Where does the UJ approach yield the most insight or value of information?
Existing expertise: Where do we have field expertise on the UJ team? This will help us commission stronger evaluations.
"Feedback loops": Could this research influence concrete intervention choices? Does it predict near-term outcomes? If so, observing these choices and outcomes and getting feedback on the research and our evaluation can yield strong benefits.
Consideration/discussion: How much should we include research with indirect impact potential (theoretical, methodological, etc.)?
Moreover, we need to consider how the research evaluation might support the sustainability of The Unjournal and the broader project of open evaluation. We may need to strike a balance between work informing the priorities of various audiences, including:
Relevance to stakeholders and potential supporters
Clear connections to impact; measurability
Support from relevant academic communities
Support from open science
Consideration/discussion: What will drive further interest and funding?
Finally, we consider how our choices will increase the visibility and solidify the credibility of The Unjournal and open evaluations. We consider how our work may help drive positive institutional change. We aim to:
Interest and involve academics—and build the status of the project.
Commission evaluations that will be visibly useful and credible.
‘Benchmark traditional publication outcomes’, track our predictiveness and impact.
Have strong leverage over research "outcomes and rewards."
Increase public visibility and raise public interest.
Bring in supporters and participants.
Achieve substantial output in a reasonable time frame and with reasonable expense.
Maintain goodwill and a justified reputation for being fair and impartial.
We hope we have identified the important considerations (above), but we may be missing key points. We continue to engage in discussion and seek feedback to hone and improve our processes and approaches.
As we are paying evaluators and have limited funding, we cannot evaluate every paper and project. Papers enter our database through
submission by authors;
our own searches (e.g., searching syllabi, forums, working paper archives, and white papers); and
suggestions and recommendations from other researchers, practitioners, and members of the public. We have posted more detailed instructions for suggesting research.
Our management team rates the suitability of each paper according to the criteria discussed below.
We have followed a few procedures for finding and prioritizing papers and projects. In all cases, we require more than one member of our research-involved team (field specialist, managers, etc.) to support a paper before prioritizing it.
We are building a grounded systematic procedure with criteria and benchmarks. We also aim to give managers and field specialists some autonomy in prioritizing key papers and projects. As noted elsewhere, we are considering targets for particular research areas and sources.
See our basic process (as of Dec. 2023) for prioritizing work:
See also (internal discussion):
Airtable: columns for "crucial_research", "considering" view, "confidence," and "discussion"
Airtable: see "sources"
Through October 2022: For the papers or projects at the top of our list, we contacted the authors and asked if they wanted to engage, only pursuing evaluation if agreed.
July 2023: We expanded this process to some other sources, with some discretion.
What are (some of) the authors’ main claims that are worth carefully evaluating? What aspects of the evidence, argumentation, methods, interpretation, etc., is the team unsure about? What particular data, code, proof, etc., would they like to see vetted? If it has already been peer-reviewed in some way, why do they think more review is needed?
How well has the author engaged with the process? Do they need particular convincing? Do they need help making their engagement with The Unjournal successful?
This is because The Unjournal does not publish work; it only links to, rates, and evaluates it.
For more information about what we are asking evaluators (referees) to do, see:
We follow standard procedures, considering complementary expertise, interest, and cross-citations, as well as confirming no conflicts of interest. (See our internal guidelines for selecting evaluators.)
We aim to consult those who have expressed interest (e.g., by joining our evaluator pool) first.
We favor evaluators with a track record of careful, in-depth, and insightful evaluation — while giving ECRs a chance to build such a record.
It's equitable, especially for those not getting "service credit" for their refereeing work from their employer.
We need to use explicit incentives while The Unjournal grows.
We can use payment as an incentive for high-quality work, and to access a wider range of expertise, including people not interested in submitting their own work to The Unjournal.
To limit this concern:
You can choose to make your evaluation anonymous. You can make this decision from the outset (this is preferable) or later, after you've completed your review.
Your evaluation will be shared with the authors before it is posted, and they will be given two weeks to respond before we post. If they cite what they believe are any major misstatements in your evaluation, we will give you the chance to correct these.
It is well-known that referee reports and evaluations are subject to mistakes. We expect most people who read your evaluation will take this into account.
You can add an addendum or revision to your evaluation later on (see below).
We have two main ways that papers and research projects enter the Unjournal process:
For either track, authors are invited to be involved in several ways:
Authors are informed of the process and given an opportunity to identify particular concerns, request an embargo, etc.
Evaluators can be put in touch with authors (anonymously) for clarification questions.
Authors are given a two-week window to respond to the evaluations (this response is published as well) before the evaluations are made public. They can also respond after the evaluations are released.
If you are writing a signed evaluation, you can share it or link it on your own pages. Please wait to do this until after we have given the author a chance to respond and posted the package.
Otherwise, if you are remaining anonymous, please do not disclose your connection to this report.
Going forward:
As a general principle, we hope and intend always to see that you are fairly compensated for your time and effort.
For careers and improving research: Evaluations provide metrics of quality. In the medium term, these should provide increased and accelerated career value, improving the research process. We aim to build metrics that are credibly comparable to the current "tier" of journal a paper is published in. But we aim to do this better in several ways:
Feedback and suggestions for authors: We expect that evaluators will provide feedback that is relevant to the authors, to help them make the paper better.
We still want your evaluation and ratings. Some things to consider as an evaluator in this situation.
A paper/project is not simply a good to be judged on a single scale; how useful is it, and to whom or for what? We'd like you to discuss its value in relation to previous work, its implications, what it suggests for research and practice, etc.
Even if the paper is great...
Would you accept it in the “top journal in economics”? If not, why not?
Would you hire someone based on this paper?
Would you fund a major intervention (as a government policymaker, major philanthropist, etc.) based on this paper alone? If not, why not?
What are the most important and informative results of the paper?
Can you quantify your confidence in these 'crucial' results, and their replicability and generalizability to other settings? Can you state your probabilistic bounds (confidence or credible intervals) on the quantitative results (e.g., 80% bounds on QALYs, DALYs, or WELLBYs per $1,000)?
Would any other robustness checks or further work have the potential to increase your confidence (narrow your belief bounds) in this result? Which?
Do the authors make it easy to reproduce the statistical (or other) results of the paper from shared data? Could they do more in this respect?
Communication: Did you understand all of the paper? Was it easy to read? Are there any parts that could have been better explained?
Is it communicated in a way that would be useful to policymakers? To other researchers in this field, or in the general discipline?
Put broadly, we need to consider how this research allows us to achieve our own goals, in line with our theory of change. The research we select and evaluate should meaningfully drive positive change. One way we might see this process: "better research & more informative evaluation" → "better decision-making" → "better outcomes" for humanity and for non-human animals (i.e., the survival and flourishing of life and human civilization and values).
Below, we adapt the importance/tractability/neglectedness framework (popular in effective altruism circles) to assess the direct impact of our evaluations.
EA and more-or-less-adjacent sources and overviews.
Non-EA sources.
We present and analyze the specifics surrounding our current evaluation data in our public data presentation.
Below: an earlier template for considering and discussing the relevance of research. This was/is provided both for our own consideration and for sharing (in part?) with evaluators, to give them context. Think of these as bespoke evaluation notes for a particular paper or project.
As mentioned above, consider factors including importance to global priorities, relevance to the field, the commitment and contribution to open science, the authors' engagement, and the transparency of data and reasoning. You may consider these criteria explicitly, but not too rigidly.
As of November 2022, we have a direct evaluation track: we inform authors but do not request permission. For this track, we first focused on particularly relevant working papers.
In deciding which papers or projects to send out to paid evaluators, we have considered the following issues. We aim to share notes on these considerations for each paper or project with evaluators before they write their evaluations.
Consider: importance to global priorities, field relevance, open science, authors' engagement, and data and reasoning transparency. In gauging this relevance, the team may consider these criteria explicitly, but not too rigidly.
See the sections above for further discussion of prioritization, scope, and strategic and sustainability concerns.
For several reasons (for more discussion, see below):
Paying evaluators can reduce biases and conflicts of interest, which are arguably inherent to the traditional process where reviewers work for free.
Yes, we allow evaluators to choose whether they wish to remain anonymous or "sign" their evaluations.
We will put your evaluation on PubPub and give it a DOI. It cannot be redacted in the sense that this initial version will remain on the internet in some format. But you can add an addendum to the document later, which we will post and link, and the DOI can be adjusted to point to the revised version.
See the FAQ above as well as our guidelines for evaluators.
Authors submit their work; if we believe the work is relevant, we assign evaluators, and so on.
We also directly select research that seems potentially influential, impactful, and relevant for evaluation. In some cases, we request the authors' permission before sending out the papers for evaluation. In other cases (such as where senior authors release papers in the prestigious NBER and CEPR series), we contact the authors and request their engagement before proceeding, but we don't ask for permission.
We may later invite you to evaluate further research . . .
. . . and to help us judge prizes.
We may ask if you want to be involved in replication exercises (e.g., through the Institute for Replication).
The evaluations provide at least three types of value, helping advance several paths in our theory of change:
For readers and users: Unjournal evaluations assess the reliability and usefulness of the paper along several dimensions, and make this public, so other researchers and policymakers can rely on it appropriately.
Doing so more quickly, more reliably, more transparently, and without the unproductive overhead of dealing with journals.
Allowing flexible, journal-independent evaluation, thus improving the research process, benefiting research careers, and hopefully improving the research itself in impactful areas.
See "what
We're happy for you to use whichever process and structure you feel comfortable with when writing your evaluation content.
Remember: The Unjournal doesn't "publish" and doesn't "accept or reject." So don't give an 'Accept', 'Revise-and-Resubmit', or 'Reject'-type recommendation. We ask for quantitative metrics, written feedback, and expert discussion of the validity of the paper's main claims, methods, and assumptions.
Semi-relevant: Econometric Society: Guidelines for referees
Report: Improving Peer Review in Economics: Stocktaking and Proposal (Charness et al 2022)
Open Science
PLOS (Conventional but open access; simple and brief)
Peer Community In... Questionnaire (Open-science-aligned; perhaps less detail-oriented than we are aiming for)
Open Reviewers Reviewer Guide (Journal-independent “PREreview”; detailed; targets ECRs)
General, other fields
The Wiley Online Library (Conventional; general)
"Peer review in the life sciences (Fraser)" (extensive resources; only some of this is applicable to economics and social science)
Collaborative template: RRR assessment peer review
Introducing Structured PREreviews on PREreview.org
‘the 4 validities’ and seaboat
We are considering asking evaluators, with compensation, to assist and engage in the process of "robustness replication." This may lead to some interesting follow-on possibilities as we build our potential collaboration with the Institute for Replication and others in this space.
We might ask evaluators discussion questions like these:
What is the most important, interesting, or relevant substantive claim made by the authors (particularly considering global priorities and potential interventions and responses)?
What statistical test or evidence does this claim depend on, according to the authors?
How confident are you in the substantive claim made?
"Robustness checks": What specific statistical test(s) or piece(s) of evidence would make you substantially more confident in the substantive claim made?
If a robustness replication "passed" these checks, how confident would you be then in the substantive claim? (You can also express this as a continuous function of some statistic rather than as a binary; please explain your approach.)
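As a purely illustrative sketch of the last point (the functional form and all numbers are our assumptions, not a requirement of The Unjournal), an evaluator could express their confidence as a continuous function of a replication statistic along these lines:

```python
import math

def confidence_in_claim(replication_ratio: float) -> float:
    """Hypothetical mapping from a replication statistic (replicated effect size
    divided by the original estimate) to a subjective probability that the
    substantive claim holds. Each evaluator would choose and explain their own curve."""
    # Logistic curve: roughly 0.23 if the effect vanishes (ratio 0),
    # 0.55 at half the original effect, and about 0.87 for a full replication.
    return 0.2 + 0.7 / (1 + math.exp(-6 * (replication_ratio - 0.5)))

for ratio in (0.0, 0.5, 1.0):
    print(f"replication ratio {ratio:.1f} -> confidence {confidence_in_claim(ratio):.2f}")
```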
Background:
The Institute for Replication is planning to hire experts to do "robustness replications" of work published in top journals in economics and political science. Code and data sharing is now being enforced in many or all of these journals and other important outlets. We want to support their efforts and are exploring collaboration possibilities. We are also considering how best to guide potential future robustness replication work.
Unjournal evaluators have the option of remaining anonymous (see #anonymity-blinding-vs.-signed-reports). Where evaluators choose this, we will carefully protect their anonymity, aiming at a standard of protection as good as or better than that of traditional journals. We will give evaluators the option to take extra steps to safeguard this further. We offer anonymity in perpetuity to those who request it, or anonymity on other explicitly and mutually agreed terms.
If they choose to stay anonymous, there should be no way for authors to be able to ‘guess’ who has reviewed their work.
We will take steps to keep private any information that could connect the identity of an anonymous evaluator and their evaluation/the work they are evaluating.
We will take extra steps to make the possibility of accidental disclosure extremely small (this is never impossible of course, even in the case of conventional journal reviews). In particular, we will use pseudonyms or ID codes for these evaluators in any discussion or database that is shared among our management team that connects individual evaluators to research work.
If we ever share a list of Unjournal’s evaluators this will not include anyone who wished to remain anonymous (unless they explicitly ask us to be on such a list).
We will do our best to warn anonymous evaluators of ways that they might inadvertently be identifying themselves in the evaluation content they provide.
We will provide platforms to enable anonymous and secure discussion between anonymous evaluators and others (authors, editors, etc.) Where an anonymous evaluator is involved, we will encourage these platforms to be used as much as possible. In particular, see our (proposed) use of Cryptpad.
Aside: In future, we may consider allowing Evaluation Managers (formerly 'managing editors') to remain anonymous; these tools will also be useful for that.
31 Aug 2023: Our present approach is a "working solution" involving some ad-hoc and intuitive choices. We are re-evaluating the metrics we are asking for as well as the interface and framing. We are gathering some discussion in this linked Gdoc, incorporating feedback from our pilot evaluators and authors. We're also talking to people with expertise as well as considering past practice and other ongoing initiatives. We plan to consolidate that discussion and our consensus and/or conclusions into the present (Gitbook) site.
Ultimately, we're trying to replace the question "what tier of journal did a paper get into?" with "how highly was the paper rated?" We believe this is a more valuable metric. It can be more fine-grained. It should be less prone to gaming. It aims to reduce the randomness introduced by things like the availability of journal space in a particular field. See our discussion of Reshaping academic evaluation: beyond the binary...
To get to this point, we need to have academia and stakeholders see our evaluations as meaningful. We want the evaluations to begin to have some value that is measurable in the way “publication in the AER” is seen to have value.
While there are some ongoing efforts towards journal-independent evaluation, these are limited. Typically, they either have simple tick-boxes (like "this paper used correct statistical methods: yes/no") or they enable descriptive evaluation without an overall rating. As we are not a journal and we don't accept or reject research, we need another way of assigning value. We are working to determine the best way of doing this through quantitative ratings. We hope to be able to benchmark our evaluations against "traditional" publication outcomes. Thus, we think it is important to ask for both an overall quality rating and a journal ranking tier prediction.
In addition to the overall assessment, we think it will be valuable to have the papers rated according to several categories. This could be particularly helpful to practitioners who may care about some concerns more than others. It also can be useful to future researchers who might want to focus on reading papers with particular strengths. It could be useful in meta-analyses, as certain characteristics of papers could be weighed more heavily. We think the use of categories might also be useful to authors and evaluators themselves. It can help them get a sense of what we think research priorities should be, and thus help them consider an overall rating.
However, these ideas have been largely ad-hoc and based on the impressions of our management team (a particular set of mainly economists and psychologists). The process is still being developed. Any feedback you have is welcome. For example, are we overemphasizing certain aspects? Are we excluding some important categories?
We are also researching other frameworks, templates, and past practice; we hope to draw from validated, theoretically grounded projects such as RepliCATS.
In eliciting expert judgment, it is helpful to differentiate the level of confidence in predictions and recommendations. We want to know not only what you believe, but how strongly held your beliefs are. If you are less certain in one area, we should weigh the information you provide less heavily in updating our beliefs. This may also be particularly useful for practitioners. Obviously, there are challenges to any approach. Even experts in a quantitative field may struggle to convey their own uncertainty. They may also be inherently "poorly calibrated" (see discussions and tools for calibration training). Some people may often be "confidently wrong": they might state very narrow "credible intervals" when the truth (where measurable) routinely falls outside these boundaries. People with greater discrimination may sometimes be underconfident. One would want to consider and adjust for this. As a side benefit, this may be interesting for research purposes, particularly as The Unjournal grows. We see 'quantifying one's own uncertainty' as a good exercise for academics (and everyone) to engage in.
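To make the calibration idea concrete, here is a minimal sketch with entirely made-up numbers: a well-calibrated evaluator's stated 90% intervals should contain the (eventually measurable) truth about 90% of the time.

```python
# Illustrative only: all intervals and 'realized' values below are invented.
stated_90_intervals = [(0.2, 0.6), (10, 25), (0.0, 0.1), (50, 90), (1.5, 3.0)]
realized_values = [0.45, 31, 0.05, 72, 2.2]

hits = sum(lo <= value <= hi
           for (lo, hi), value in zip(stated_90_intervals, realized_values))
coverage = hits / len(stated_90_intervals)
print(f"Share of truths inside the stated 90% intervals: {coverage:.0%}")
# Coverage well below 90% over many questions suggests overconfidence
# (intervals too narrow); coverage near 100% may suggest underconfidence.
```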
We had included the note:
We give the previous weighting scheme in a fold below for reference, particularly for those reading evaluations done before October 2023.
As well as:
Suggested weighting: 0.
Elsewhere in that page we had noted:
As noted above, we give suggested weights (0–5) to suggest the importance of each category rating to your overall assessment, given The Unjournal's priorities.
The weightings were presented once again along with each description in the section "Category explanations: what you are rating".
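For illustration only (the category names, weights, and ratings below are placeholders, not the actual values used), such 0-5 weights could combine category ratings into a weighted overall figure roughly as follows under the previous scheme:

```python
# Hypothetical sketch of a weighted overall assessment under a 0-5 weighting scheme.
# All names and numbers below are placeholders for illustration.

weights = {"methods": 5, "advancing_knowledge": 4, "logic_communication": 3, "open_science": 2}
ratings = {"methods": 72, "advancing_knowledge": 80, "logic_communication": 65, "open_science": 90}

weighted_overall = sum(weights[c] * ratings[c] for c in weights) / sum(weights.values())
print(f"Weighted overall rating: {weighted_overall:.1f}")  # a 0-100 percentile-style figure
```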
[FROM PREVIOUS GUIDELINES:]
You may feel comfortable giving your "90% confidence interval," or you may prefer to give a "descriptive rating" of your confidence (from "extremely confident" to "not confident").
Quantify how certain you are about this rating, either giving a 90% confidence/credibility interval or using our scale described below.
[Previous...] Remember, we would like you to give a 90% CI or a confidence rating (1–5 dots), but not both.
And, for the 'journal tier' scale:
[Previous guidelines]: The description folded below focuses on the "Overall Assessment." Please try to use a similar scale when evaluating the category metrics.
See the section "More reliable, precise, and useful metrics", which explains the value of the metrics we are seeking from evaluators.
Unjournal Evaluator Guidelines and Metrics - Discussion space
In addition to soliciting research submissions by authors, we directly prioritize unsubmitted research for evaluation, with a specific process and set of rules, outlined below.
Choose a set of "top-tier working paper series" and medium-to-top-tier journals.
This program started with the NBER working paper series. We expanded this beyond NBER to research posted in other exclusive working paper archives and to work where all authors seem to be prominent, secure, and established. See #direct-evaluation-eligibility-rules-and-guidelines.
Identify relevant papers in this series, following our stated criteria (i.e., relevance, strength, need for further review). For NBER this tends to include
recently released work in the early stages of the journal peer-review process, particularly if it addresses a timely subject; as well as
work that has been around for many years, is widely cited and influential, yet has never been published in a peer-reviewed journal.
We do this systematically and transparently; authors shouldn't feel singled out nor left out.
Notify the work's authors that The Unjournal plans to commission evaluations. We're not asking for permission, but
making them aware of The Unjournal, the process, the benefits to authors, and the authors' opportunities to engage with the evaluation and publicly respond to the evaluation before it is made public;
asking them to let us know whether we have the most recent version of the paper, and whether updates are coming soon;
letting the authors complete our forms if they wish, giving further information about the paper or e.g. adding a "permalink" to updated versions;
asking if there are authors in sensitive career positions justifying a temporary "embargo"; and
asking the authors if there is specific feedback they would like to receive.
Reaching out to and commissioning evaluators, as in our regular process. Considerations:
Evaluators should be made aware that the authors have not directly requested this review, but have been informed it is happening.
As this will allow us to consider a larger set of papers more quickly, we can reach out to multiple evaluators more efficiently.
All NBER working papers are generally eligible, but watch for exceptions where authors seem vulnerable in their career. (And remember, we contact authors, so they can plead their case.)
We treat these on a case-by-case basis and use discretion. All CEPR members are reasonably secure and successful, but their co-authors might not be, especially if these co-authors are PhD students they are supervising.
In some areas and fields (e.g., psychology, animal product markets) the publication process is relatively rapid or it may fail to engage general expertise. In general, all papers that are already published in peer-reviewed journals are eligible for our direct track.
These are eligible (without author permission) if all authors
have tenured or ‘long term’ positions at well-known, respected universities or other research institutions, or
have tenure-track positions at top universities (e.g., top-20 globally by some credible rankings), or
are clearly not pursuing an academic career (e.g., the "partner at the aid agency running the trial").
On the other hand, if one or more authors is a PhD student close to graduation or an untenured academic outside a "top global program," then we will ask for permission and potentially offer an embargo.
A possible exception to this exception: If the PhD student or untenured academic is otherwise clearly extremely high-performing by conventional metrics; e.g., an REStud "tourist" or someone with multiple published papers in top-5 journals. In such cases the paper might be considered eligible for direct evaluation.
This page describes The Unjournal's evaluation guidelines, considering our priorities and criteria, the metrics we ask for, and how these are considered.
30 July 2024: The guidelines below apply to the evaluation form currently hosted on PubPub. We're adjusting this form somewhat; the new form is temporarily hosted in Coda. If you prefer, you are welcome to use the Coda form instead (just let us know).
If you have any doubts about which form to complete, or about what we are looking for, please ask the evaluation manager or email contact@unjournal.org.
You can download a pdf version of these guidelines (updated March 2024).
Please see for an overview of the evaluation process, as well as details on compensation, public recognition, and more.
Write an evaluation of the target paper, similar to a standard, high-quality referee report. Please identify the paper's main claims and carefully assess their validity, leveraging your own background and expertise.
Give quantitative metrics and predictions, as described below.
Answer a short questionnaire about your background and our processes.
In writing your evaluation and providing ratings, please consider the following.
In many ways, the written part of the evaluation should be similar to a report an academic would write for a traditional high-prestige journal (e.g., see some 'conventional guidelines'). Most fundamentally, we want you to use your expertise to critically assess the main claims made by the authors. Are the claims well-supported? Are the assumptions believable? Are the methods appropriate and well-executed? Explain why or why not.
However, we'd also like you to pay some consideration to our priorities:
Advancing our knowledge and practice
Justification, reasonableness, validity, and robustness of methods
Logic and communication
Open, communicative, replicable science
If you have questions about the authors’ work, you can ask them anonymously: we will facilitate this.
We want you to evaluate the most recent/relevant version of the paper/project that you can access. If you see a more recent version than the one we shared with you, please let us know.
We designed this process to balance three considerations with three target audiences. Please consider each of these:
Crafting evaluations and ratings that help researchers and policymakers judge when and how to rely on this research. For Research Users.
Ensuring these evaluations of the papers are comparable to current journal tier metrics, to enable them to be used to determine career advancement and research funding. For Departments, Research Managers, and Funders.
Providing constructive feedback to Authors.
For some questions, we ask for a percentile ranking from 0-100%. This represents "what proportion of papers in the reference group are worse than this paper, by this criterion". A score of 100% means this is essentially the best paper in the reference group. 0% is the worst paper. A score of 50% means this is the median paper; i.e., half of all papers in the reference group do this better, and half do this worse, and so on.
Here, the population of papers should be all serious research in the same area that you have encountered in the last three years.
For each metric, we ask you to provide a 'midpoint rating' and a 90% credible interval as a measure of your uncertainty. Our interface provides slider bars to express your chosen intervals:
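To make the percentile idea concrete, here is a minimal sketch (in Python, with made-up numbers) of how a percentile rank relates to a reference group of papers. This is purely illustrative; we are not asking evaluators to compute anything mechanically.

```python
# Illustration only: the percentile rank is the share of reference papers
# that the focal paper beats on a given criterion. Scores are hypothetical.

def percentile_rank(focal_score, reference_scores):
    worse = sum(score < focal_score for score in reference_scores)
    return 100 * worse / len(reference_scores)

reference = [42, 55, 61, 70, 73, 80, 88]   # scores of comparison papers
print(percentile_rank(75, reference))       # ~71%: better than 5 of 7
```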
The table below summarizes the percentile rankings.
Percentile ranking (0-100%)
Do the authors do a good job of (i) stating their main questions and claims, (ii) providing strong evidence and powerful approaches to inform these, and (iii) correctly characterizing the nature of their evidence?
Percentile ranking (0-100%)
Percentile ranking (0-100%)
(Applied stream: please focus on ‘improvements that are actually helpful’.)
Do the paper's insights inform our beliefs about important parameters and about the effectiveness of interventions?
Does the project add useful value to other impactful research?
Percentile ranking (0-100%)
Are the goals and questions of the paper clearly expressed? Are concepts clearly defined and referenced?
Are the conclusions consistent with the evidence (or formal proofs) presented? Do the authors accurately state the nature of their evidence, and the extent it supports their main claims?
Are the data and/or analysis presented relevant to the arguments made? Are the tables, graphs, and diagrams easy to understand in the context of the narrative (e.g., no major errors in labeling)?
Percentile ranking (0-100%)
This covers several considerations:
Would another researcher be able to perform the same analysis and get the same results? Are the methods explained clearly and in enough detail to enable easy and credible replication? For example, are all analyses and statistical tests explained, and is code provided?
Is the source of the data clear?
Is the data made as available as is reasonably possible? If so, is it clearly labeled and explained?
Consistency
Do the numbers in the paper and/or code output make sense? Are they internally consistent throughout the paper?
Useful building blocks
Do the authors provide tools, resources, data, and outputs that might enable or enhance future work and meta-analysis?
Does the paper consider real-world relevance and deal with policy and implementation questions? Are the setup, assumptions, and focus realistic?
Do the authors report results that are relevant to practitioners? Do they provide useful quantified estimates (costs, benefits, etc.) enabling practical impact quantification and prioritization?
Do they communicate (at least in the abstract or introduction) in ways policymakers and decision-makers can understand, without misleading or oversimplifying?
To help universities and policymakers make sense of our evaluations, we want to benchmark them against how research is currently judged. So, we would like you to assess the paper in terms of journal rankings. We ask for two assessments:
a normative judgment about 'how well the research should publish';
a prediction about where the research will be published.
Journal ranking tiers are on a 0-5 scale, as follows:
1/5: OK/Somewhat valuable journal
2/5: Marginal B-journal/Decent field journal
3/5: Top B-journal/Strong field journal
4/5: Marginal A-Journal/Top field journal
5/5: A-journal/Top journal
As before, we ask for a 90% credible interval.
Journal ranking tier (0.0-5.0)
Assess this paper on the journal ranking scale described above, considering only its merit and giving some weight to the category metrics we discussed above. Equivalently, imagine that:
the journal process was fair, unbiased, and free of noise, and that status, social connections, and lobbying to get the paper published didn't matter; and that
journals assessed research according to the category metrics we discussed above.
Journal ranking tier (0.0-5.0)
We want policymakers, researchers, funders, and managers to be able to use The Unjournal's evaluations to update their beliefs and make better decisions. To do this well, they need to weigh multiple evaluations against each other and other sources of information. Evaluators may feel confident about their rating for one category, but less confident in another area. How much weight should readers give to each? In this context, it is useful to quantify the uncertainty.
But it's hard to quantify statements like "very certain" or "somewhat uncertain" – different people may use the same phrases to mean different things. That's why we're asking you for a more precise measure: your credible intervals. These metrics are particularly useful for meta-science and meta-analysis.
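As an illustration of why these intervals are useful, here is a minimal sketch assuming, purely for illustration, that a reader pools evaluators' midpoints with inverse-variance weights and treats each 90% interval as roughly a normal interval. This is one possible reader-side approach, not The Unjournal's official aggregation rule; the function name and numbers are hypothetical.

```python
# Hypothetical reader-side aggregation: give more weight to evaluators
# who state tighter (more confident) 90% credible intervals.

def combine_ratings(ratings):
    """ratings: list of (midpoint, lower, upper) for one category,
    where (lower, upper) is an evaluator's 90% credible interval."""
    total_weight, weighted_sum = 0.0, 0.0
    for mid, lo, hi in ratings:
        sd = (hi - lo) / 3.29          # approx. SD if beliefs were ~normal
        weight = 1.0 / (sd ** 2)       # inverse-variance (precision) weight
        total_weight += weight
        weighted_sum += weight * mid
    return weighted_sum / total_weight

# Two evaluators rate the same category at 70 and 55; the first is far less certain.
print(combine_ratings([(70, 40, 95), (55, 50, 62)]))  # pooled value sits near 55
```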
We are now asking evaluators for “claim identification and assessment” where relevant. This is meant to help practitioners use this research to inform their funding, policymaking, and other decisions. It is not intended as a metric to judge the research quality per se. This is not required but we will reward this work.
Lastly, we ask evaluators about their background, and for feedback about the process.
Length/time spent: This is up to you. We welcome detail, elaboration, and technical discussion.
At least initially, we're planning to ask for questions that could be definitively answered and/or measured quantitatively, and we will help organizations and other suggesters refine their questions to make this the case. These should approximately resemble questions that could be posted on forecasting platforms such as Manifold Markets or Metaculus. These should also somewhat resemble the 'claim identification and assessment' we currently request from evaluators.
The 'Clairvoyance Test' is particularly relevant:
if you handed your question to a genuine clairvoyant, could they see into the future and definitively tell you [the answer]? Some questions like ‘Will the US decline as a world power?’...‘Will an AI exhibit a goal not supplied by its human creators?’ struggle to pass the Clairvoyance Test… How do you tell one type of AI goal from another, and how do you even define it?... In the case of whether the US might decline as a world power, you’d want to get at the theme with multiple well-formed questions such as ‘Will the US lose its #1 position in the IMF’s annual GDP rankings before 2050?’....
See also: Metaculus and Manifold.
Some questions are important, but difficult to make specific, focused, and operationalizable. For example (from ):
“What can economic models … tell us about recursive self improvement in advanced AI systems?”
“How likely would catastrophic long-term outcomes be if everyone in the future acts for their own self-interest alone?”
“How could AI transform domestic and mass politics?”
Other questions are easier to operationalize or break down into several specific sub-questions. For example (again from ):
Could advances in AI lead to very bad outcomes? Is it the most likely source of such risks?
I rated this a 3/10 in terms of how operationalized it was. The word “could” is vague. “Could” might suggest some reasonable probability of the outcome (1%, 0.1%, 10%), or it might be interpreted as “can I think of any scenario in which this holds?” “Very bad outcomes” also needs a specific measure.
However, we can reframe this to be more operationalized. E.g., here are some fairly well-operationalized questions:
What is the risk of a catastrophic loss (defined as the death of at least 10% of the human population over any five year period) occurring before the year 2100?
How does this vary depending on the total amount of money invested in computing power for building advanced AI capabilities over the same period?
What percentage of plant-based meat alternative (PBMA) units/meals sold displace a unit/meal of meat?
What percentage of people will be [vegetarian or vegan] in 20, 50, or 100 years?
However, note that many of the above questions are descriptive or predictive. We are also very interested in causal questions such as
What is the impact of an increase (decrease) in blood lead level by one “natural log unit” on children’s learning in the developing world (measured in standard deviation units)?
It's a norm in academia that people do reviewing work for free. So why is The Unjournal paying evaluators?
From a recent survey of economists:
We estimate that the average (median) respondent spends 12 (9) working days per year on refereeing. The top 10% of the distribution dedicates 25 working days or more, which is quite substantial considering refereeing is usually unpaid.
The peer-review process in economics is widely argued to be too slow. But there is evidence that payments may help improve this.
They note that few economics journals currently pay reviewers, and that these payments tend to be small (e.g., the JPE and AER paid $100 at the time). However, they also note, citing several papers:
The existing evidence summarized in Table 5 suggests that offering financial incentives could be an effective way of reducing turnaround time.
The work of reviewing is also not distributed equally. To the extent that agreeing to write a report is based on individual goodwill, the unpaid volunteer model could be seen to unfairly penalize more generous and sympathetic academics. Writing a certain number of referee reports per year is generally considered part of "academic service": academics put this on their CVs, and it may lead to being on the board of a journal, which is valued to an extent. However, this is much less attractive for researchers who are not tenured university professors. Paying for this work would do a better job of including them in the process.
'Payment for good evaluation work' may also lead to fairer and more useful evaluations.
In the current system academics may take on this work in large part to try to impress journal editors and get favorable treatment from them when they submit their own work. They may also write reviews in particular ways to impress these editors.
For less high-prestige journals, to get reviewers, editors often need to lean on their personal networks, including those they have power relationships with.
Reviewers are also known to strategically try to get authors to cite and praise the reviewer's own work. They may be especially critical toward authors they see as rivals.
To the extent that reviewers are doing this as a service they are being paid for, these other motivations will be comparatively somewhat less important. The incentives will be more in line with doing evaluations that are seen as valuable by the managers of the process, in order to get chosen for further paid work. (And, if evaluations are public, the managers can consider the public feedback on these reports as well.)
We are not ‘just another journal.’ We need to give incentives for people to put effort into a new system and help us break out of the old inferior equilibrium.
In some senses, we are asking for more than a typical journal. In particular, our evaluations will be made public and thus need to be better communicated.
We cannot rely on 'reviewers taking on work to get better treatment from editors in the future.' This does not apply to our model, as we don't have editors making any sort of final accept/reject decision.
Paying evaluators also brings in a wider set of evaluators, including non-academics. This is particularly relevant to our impact-focused goals.
Category (importance) | Sugg. Wgt.* | Rating (0-100) | 90% CI | Confidence (alternative to CI) |
---|---|---|---|---|
Category (importance) | Rating (0-100) | 90% CI | Confidence (alternative to CI) |
---|---|---|---|
The Calibrate Your Judgment app from Clearer Thinking is fairly helpful and fun for practicing and checking how good you are at expressing your uncertainty. It requires creating an account, but that doesn't take long. The 'Confidence Intervals' training seems particularly relevant for our purposes.
Aside: in the future, we hope to work directly with working paper series, associations, and research groups to get their approval and engagement with Unjournal evaluations. We hope that having a large share of papers in your series evaluated will serve as a measure of confidence in your research quality. If you are involved in such a group and are interested in this, please reach out to us.
See our guidelines for more details on each of these. Please don't structure your review according to these metrics; just pay some attention to them.
We discuss this, and how it relates to our impact and 'theory of change', elsewhere.
We ask for a set of nine quantitative metrics. For each metric, we ask for a score and a 90% credible interval. We describe these in detail below.
See below for more guidance on uncertainty, credible intervals, and the midpoint rating as the 'median of your belief distribution'.
Quantitative metric | Scale |
---|---|
Judge the quality of the research heuristically. Consider all aspects of quality, credibility, and importance.
Are the methods used well-justified and explained? Are they a reasonable approach to answering the question(s) in this context? Are the underlying assumptions reasonable?
Are the results and methods likely to be robust to reasonable changes in the underlying assumptions?
Avoiding bias and questionable research practices (QRP): Did the authors take steps to reduce bias from opportunistic reporting and QRP? For example, did they do a strong pre-registration and pre-analysis plan, incorporate multiple hypothesis testing corrections, and report flexible specifications?
To what extent does the project contribute to the field or to practice, particularly in ways that are relevant to global priorities and impactful interventions?
Is the "? Are assumptions made explicit? Are all logical steps clear and correct? Does the writing make the argument easy to follow?
Are the paper's chosen topic and approach relevant to the real world?
Are the assumptions and setup realistic and relevant to the real world?
Could the paper's topic and approach help inform global priorities?
Most work in our applied stream will not be targeting academic journals. Still, in some cases it might make sense to make this comparison; e.g., if particular aspects of the work might be rewritten and submitted to academic journals, or if the work uses certain techniques that might be directly compared to academic work. If you believe a comparison makes sense, please consider giving an assessment below, making reference to our guidelines and how you are interpreting them in this case.
0/5: "/little to no value". Unlikely to be cited by credible researchers
We give some example journal rankings based on SJR and ABS ratings.
We encourage you to give non-integer scores, e.g., 4.6 or 2.2.
Journal ranking tiers | Scale | 90% CI |
---|---|---|
PubPub note: as of 14 March 2024, the PubPub form is not allowing you to give non-integer responses. Until this is fixed, please round to the nearest integer. (Or use the Coda form.)
You are asked to give a 'midpoint' and a 90% credible interval. Consider this interval as a range that you believe is 90% likely to contain the true value. See the fold below for further guidance.
For more information on credible intervals, the linked resources may be helpful.
If you are 'well calibrated', your 90% credible intervals should contain the true value 90% of the time. To understand this better, assess your ability, and then practice to get better at estimating your confidence in results. Clearer Thinking's 'Calibrate Your Judgment' app will help you get practice at calibrating your judgments: select the 'confidence intervals' exercise, choosing 90% confidence. Even a 10- or 20-minute practice session can help, and it's pretty fun.
For the two questions below, we will make your answers public unless you specifically ask for them to be kept anonymous.
Answers to the questions
12 Feb 2024: We are moving to a hosted form/interface in PubPub. That form is still somewhat of a work in progress and may need some further guidance; we try to provide this below, but please contact us with any questions. Alternatively, you can also submit your response in a Google Doc and share it back with us. Click the link to make a new copy of the Google Doc directly.
Some guidelines recommend a 2–3 page referee report; other sources suggest this is relatively short, but confirm that brevity is desirable. In survey data, economists report spending (median and mean) about one day per report, with substantial shares reporting "half a day" and "two days." We expect that reviewers tend to spend more time on papers for high-status journals, and when reviewing work that is closely tied to their own agenda.
We have made some adjustments to this page and to our guidelines and processes; this is particularly relevant for considering earlier evaluations.
If you still have questions, please contact us, or see our FAQ.
Our data protection statement is linked here.
Here are some highly operationalizable questions developed by the :
And a few more posed and addressed by :
How much of global greenhouse gas emissions come from food?
What share of global CO₂ emissions come from aviation?
[Example ratings table: categories (e.g., #overall-assessment — holistic, most important!), with suggested weights and illustrative 90% credible intervals.]
| Quantitative metric | Scale |
|---|---|
| Overall assessment | 0–100% |
| Advancing our knowledge and practice | 0–100% |
| Methods: Justification, reasonableness, validity, robustness | 0–100% |
| Logic and communication | 0–100% |
| Open, collaborative, replicable science | 0–100% |
| Real world relevance | 0–100% |
| Relevance to global priorities | 0–100% |

| Journal ranking tiers | Scale | 90% CI |
|---|---|---|
| What journal ranking tier should this work be published in? | 0.0–5.0 | lower, upper |
| What journal ranking tier will this work be published in? | 0.0–5.0 | lower, upper |
Set up the basic platforms for posting and administering reviews and evaluations and offering curated links and categorizations of papers and projects.
7 Feb 2023
Set up Kotahi page HERE
Configured it for submission and management
Mainly configured for evaluation, but it needs bespoke configuration to be efficient and easy for evaluators, particularly for the quantitative ratings and predictions. Thus we are using Google Docs (or cryptpads) for the pilot. We will configure Kotahi further when we have additional funds.
Evaluations are curated in our Sciety.org group, which integrates these with the publicly-hosted research.
7 Feb 2023: We are working on
The best ways to get evaluations from "submissions on Kotahi" into Sciety,
... with the curated link to the publicly-hosted papers (or projects) on a range of platforms, including NBER
Ways to get DOIs for each evaluation and author response
Ways to integrate evaluation details as 'collaborative annotations' (with hypothes.is) into the hosted papers
(We currently use a hypothes.is workaround to have this feed into Sciety so these show up as ‘evaluated pre-prints’ in their public database, gaining a DOI.)
Notes, exploring the platform.
Why are we seeking these pivotal questions to be 'operationalizable'?
This is in line with our own focus on this type of research,^[The Unjournal focuses on evaluating (mainly empirical) research that clearly poses and answers specific impactful questions, rather than research that seeks to define a question, survey a broad landscape of other research, open routes to further inquiry, etc. However, we have evaluated some broader work where it seemed particularly high impact, original, and substantive. E.g., we’ve evaluated work in ‘applied economic theory’ such as Aghion et al. on the impact of artificial intelligence on economic growth, and applied methodology, e.g., "Replicability & Generalisability: A Guide to CEA discounts"].
I think this will help us focus on fully-baked questions, where the answer is likely to provide actual value to the target organization and others (and avoid the old ‘42’ trap).
It offers potential for benchmarking and validation (e.g., using prediction markets), specific routes to measure our impact (updated beliefs, updated decisions), and informing the 'claim identification (and assessment)' we’re asking from evaluators (see footnote above).
However, as this initiative progresses we may allow a wider range of questions, e.g., more open-ended, multi-outcome, non-empirical (perhaps ‘normative’), and best-practice questions.
See sections below.
On this page we link to and discuss answers to the questions: Which research is most impactful? Which research should be prioritized?
At The Unjournal, we are open to various approaches to the question "What is the most impactful research?" Perhaps looking at some of the research we have already evaluated, and the research we are prioritizing (public link coming soon), will give you some insight. However, it seems fair that we should give at least one candidate description or definition.
"The direct global impact of a work of research is determined by the value of the information that it provides in helping individuals, governments, funders, and policymakers make better decisions. While research may not definitively answer key questions it should leave us more informed (in a Bayesian sense, 'more concentrated belief distributions') about these. We will measure the value of these 'better choices' in terms of the extent these "
The above comes close to how some people on The Unjournal team think about research impact and prioritization, but we don't plan to adopt an official guiding definition.
Note the above definition is meant to exclude more basic research, which may also be high value, but which mainly serves as a building block for other research. In fact, The Unjournal does consider the value of research as an input into other research, particularly when it directly influences policy-relevant research; e.g., see "Replicability & Generalisability: A Guide to CEA discounts".
It also excludes the value of "learning the truth" as an intrinsic good; we have tended not to make this a priority.
For more guidance on how we apply this, see our #high-level-considerations-for-prioritizing-research.
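To make the 'value of information' idea above concrete, here is a toy sketch of a two-action funding decision. All numbers are invented for illustration; this is not a formula The Unjournal applies.

```python
# Toy "value of information" (VOI) calculation for a funder deciding whether
# to fund an intervention whose effect may turn out high or low.

p_high = 0.5           # prior probability the intervention works well
value_if_high = 100    # net benefit of funding in the 'high' state
value_if_low = -40     # net benefit of funding in the 'low' state
value_not_fund = 0     # benefit of the alternative (don't fund)

# Best decision without further research: fund iff expected value is positive.
ev_fund = p_high * value_if_high + (1 - p_high) * value_if_low   # = 30
ev_without_info = max(ev_fund, value_not_fund)                    # = 30

# With (hypothetically) perfect information, fund only in the 'high' state.
ev_with_info = p_high * value_if_high + (1 - p_high) * value_not_fund  # = 50

print(ev_with_info - ev_without_info)  # VOI of resolving the question: 20
```

Research that even partially resolves the question moves the decision-maker some way toward the 'with information' line, which is the sense in which better-informed choices carry measurable value.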
Syllabi and course outlines that address global prioritization
Those listed below are at least somewhat tied to Effective Altruism.
"Existing resources (economics focused)" page in "Economics for EA and vice versa" Gitbook
Stafforini's list of EA syllabi here
(To be included here)
We next consider organizations that take a broad focus on helping humans, animals, and the future of civilization. Some of these have explicitly set priorities and research agendas, although the level of specificity varies. Most of the organizations below have some connections to Effective Altruism; over time, we aim to also look beyond this EA focus.
"Research agenda draft for GPI economics"
Social Science Research Topics for Global Health and Wellbeing; posted on the EA Forum as "Social science research we'd like to see on global health and wellbeing"
Social Science Research Topics for Animal Welfare posted on the EA Forum as Social science research we'd like to see on animal welfare
“Technical and Philosophical Questions That Might Affect Our Grantmaking” is a fairly brief discussion and overview linking mostly to OP-funded research.
To be expanded, cataloged, and considered in more detail
Happier Lives Institute research agenda ("Research Priorities," 2021): A particularly well organized discussion. Each section has a list of relevant academic literature, some of which is recent and some of which is applied or empirical.
Animal Charity Evaluators: Their "Methodology" and "Research briefs" are particularly helpful, and connect to a range of academic and policy research
Effective Thesis Project "research agendas": This page is particularly detailed and contains a range of useful links to other agendas!
How effective altruism can help psychologists maximize their impact (Gainsburg et al, 2021)
"What’s Worth Knowing? Economists’ Opinions about Economics" (Andre and Falk, 2022): The survey, as reported in the paper, does not suggest a particular agenda, but it does suggest a direction . . . economists would generally like to see more work in certain applied areas.
Ten Years and Beyond: Economists Answer NSF's Call for Long-Term Research Agendas (Compendium, 2011): responses to the NSF's call to "describe grand challenge questions . . . that transcend near-term funding cycles and are likely to drive next-generation research in the social, behavioral, and economic sciences."
The flowchart below focuses on the evaluation part of our process.
(Section updated 1 August 2023)
Submission/selection (multiple routes)
Author (A) submits work (W), creates new submission (submits a URL and DOI), through our platform or informally.
Author (or someone on their behalf) can complete a submission form; this includes a potential "request for embargo" or other special treatment.
Managers and field specialists select work (or the project is submitted independently of authors) and the management team agrees to prioritize it.
For either of these cases (1 or 2), authors are asked for permission.
Alternate Direct Evaluation track: "Work enters prestige archive" (NBER, CEPR, and some other cases).
Managers inform and consult the authors, but permission is not required. (Particularly relevant: we confirm with the authors that we have the latest updated version of the research.)
Prioritization
Following author submission ...
Manager(s) (M) and Field Specialists (FS) prioritize work for review (see Project selection and evaluation).
Following direct evaluation selection...
"evaluation suggestions" (see examples here) explaining why it's relevant, what to evaluate, etc., to be shared later with evaluators.
If requested (in either case), M decides whether to grant embargo or other special treatment, notes this, and informs authors.
An Evaluation Manager (EM – typically part of our management team or advisory board) is assigned to the selected project.
EM invites evaluators (aka "reviewers") and shares the paper to be evaluated along with (optionally) a brief summary of why The Unjournal thinks it's relevant, and what we are asking.
Potential evaluators are given full access to (almost) all information submitted by the author and M, and notified of any embargo or special treatment granted.
EM may make special requests to the evaluator as part of a management policy (e.g., "signed/unsigned evaluation only," short deadlines, extra incentives as part of an agreed policy, etc.).
EM may (optionally) add "evaluation suggestions" to share with the evaluators.
Evaluator accepts or declines the invitation to review, and if the former, agrees on a deadline (or asks for an extension).
If the evaluator accepts, the EM shares full guidelines/evaluation template and specific suggestions with the evaluator.
Evaluator completes the evaluation.
Evaluator submits the evaluation, including numeric ratings and predictions, plus credible intervals ("CIs") for these.
Possible addition (future plan): Reviewer asks for minor revisions and corrections; see "How revisions might be folded in..." in the fold below.
EM collates all evaluations/reviews, shares these with Author(s).
The EM must be very careful not to share evaluators' identities at this point.
This includes taking care to avoid accidentally identifying information.
Even if evaluators chose to "sign their evaluation," their identity should not be disclosed to authors at this point. However, evaluators are told they can reach out to the authors directly if they wish to identify themselves.
Evaluations are shared with the authors as a separate doc, set of docs, file, or space. (Going forward, this will be automated.)
It is made clear to authors that their responses will be published (and given a DOI, when possible).
Author(s) read(s) evaluations, given two working weeks to submit responses.
If there is an embargo, there is more time to do this, of course.
EM creates evaluation summary and "EM comments."
EM or UJ team publishes each element on our PubPub space as a separate "pub" with a DOI for each (unless embargoed):
Summary and EM comments
With a prominent section for the "ratings data tables"
Each evaluation, with summarized ratings at the top
The author response
All of the above are linked in a particular way, with particular settings; see notes
Authors and evaluators are informed once elements are on PubPub; next steps include promotion, checking bibliometrics, etc.
("Ratings and predictions data" to enter an additional public database.)
Note that we intend to automate and integrate many parts of this process into an editorial-management-like system in PubPub.
In our current (8 Feb 2023 pilot) phase, we have the evaluators consider the paper "as is," frozen at a certain date, with no room for revisions. The authors can, of course, revise the paper on their own and even pursue an updated Unjournal review; we would like to include links to the "permanently updated version" in the Unjournal evaluation space.
After the pilot, we may consider making minor revisions part of the evaluation process. This may add substantial value to the papers and process, especially where evaluators identify straightforward and easily-implementable improvements.
We don't want to replicate the slow and inefficient processes of the traditional system. Essentially, we want evaluators to give a report and rating as the paper stands.
We also want to encourage papers as permanent-beta projects. The authors can improve it, if they like, and resubmit it for a new evaluation.
The steps we've taken and our plans; needs updating
This page and its sub-pages await updating
See also Plan of action
See also Updates (earlier)
18 Jun 2023: This needs updating
Initial evaluations; feedback on the process
Revise process; further set of evaluations
Disseminate and communicate (research, evaluations, processes); get further public feedback
Further funding; prepare for scaling-up
Management: updates and CTA in gdoc shared in emails
Identify a small set of papers or projects as representative first-cases; use to help test the system we are building in a concrete manner.
In doing the above, we are also collecting a longer list of key papers, projects, authors, topics, issues, etc.
Post on EA Forum (and other places) and present form (see view at bottom of this section) promoting our call for papers further, with bounty.
2. Search for most-cited papers (within our domain) among EA-aligned researchers and organizations.
3. Dig into existing lists, reviews, and syllabi, such as:
GPI research agenda (includes many posed questions)
Open Philanthropy "questions that might affect our grantmaking" (needs updating? few academic links)
Syllabi: Pablo's list; Economics focus list; David Rhys-Bernard's syllabus (link to my commented/highlighted version)
5. Appeal directly to authors and research groups
6. Cross-promote with How to get involved
Pete Slattery: "Do a complete test run using a single paper and team…" Thus, we aim to identify a small set of papers (or projects), maybe 2–3, that seem like good test and example cases, and offer a bounty for projects we choose as test cases.
Note that much of the work identified here has already been peer-reviewed and "published." While we envision that The Unjournal may assess papers that are already published in traditional journals, these are probably not the best case for the PoC. Thus, we de-prioritize these for now.
The Unjournal commissions public evaluations of impactful research in quantitative social sciences fields. We are seeking ‘pivotal questions’ to guide our choice of research papers to commission for evaluation. We are reaching out to organizations that aim to use evidence to do the most good, and asking: Which open questions most affect your policies and funding recommendations? For which questions would research yield the highest ‘value of information’?
Our main approach has been to search for papers and then commission experts to publicly evaluate them. (For more about our process, see here). Our field specialist teams search and monitor prominent research archives (like NBER), and consider agendas from impactful organizations, while keeping an eye on forums and social media. Our approach has largely been to look for research that seems relevant to impactful questions and crucial considerations. We're now exploring turning this on its head and identifying pivotal questions first and evaluating a cluster of research that informs these. This could offer a more efficient and observable path to impact. (See our ‘logic model’ flowchart for our theory of change for context.)
The Unjournal will ask impact-focused research-driven organizations such as GiveWell, Open Philanthropy, and Charity Entrepreneurship to identify specific quantifiable questions^[We may later expand this to somewhat more open-ended and general questions; see below.] that impact their funding, policy, and research-direction choices. For example, if an organization is considering whether to fund a psychotherapeutic intervention in a LMIC, they might ask “How much does a brief course of non-specialist psychotherapy increase happiness, compared to the same amount spent on direct cash transfers?” We’re looking for the questions with the highest value-of-information (VOI) for the organization’s work over the next few years. We have some requirements — the questions should relate to The Unjournal’s coverage areas and engage rigorous research in economics, social science, policy, or impact quantification. Ideally, organizations will identify at least one piece of publicly-available research that relates to their question. But we are doing this mainly to help these organizations, so we will try to keep it simple and low-effort for them.
The Unjournal team will then discuss the suggested questions, leveraging our field specialists’ expertise. We’ll rank these questions, prioritizing at least one for each organization. We’ll work with the organization to specify the priority question precisely and in a useful way. We want to be sure that 1. evaluators will interpret these questions as intended, and 2. the answers that come out are likely to be actually helpful. We’ll make these lists of questions public and solicit general feedback — on the relevance of the questions, on their framing, on key sub-questions, and on pointers to relevant research.
Where practicable, we will operationalize the target questions as a claim on a prediction market (for example, Metaculus) to be resolved by the evaluations and synthesis below.
Where feasible, post these on public prediction markets (such as Metaculus)
If the question is well operationalized, and we have a clear approach to 'resolving it' after the evaluations and synthesis, we will post it on a reputation-based market like Metaculus. Metaculus is offering 'minitaculus' platforms, such as this one on Sudan, to enable these more flexible questions.
We will ask (and help) the organizations and interested parties to specify their own beliefs about these questions, aka their 'priors'. We may adapt the Metaculus interface for this.
Once we’ve converged on the target question, we’ll do a variation of our usual evaluation process.
For each question we will prioritize roughly two to five research papers. These papers may be suggested by the organization that suggested the question, sourced by The Unjournal, or discovered through community feedback.
As we normally do, we'll have 'evaluation managers' recruit and commission expert evaluators. However, we'll ask the evaluators to focus on the target question, and to consider the target organization's priorities.
We'll also encourage the evaluators to discuss the research and their assessments with one another. This is inspired by the repliCATS project, and by some evidence suggesting that the (mechanistically aggregated) estimates experts give after deliberation tend to be more accurate than their independent estimates (also mechanistically aggregated). We may also facilitate collaborative evaluations and 'live reviews', following the examples of ASAPBio, PREreview, and others.
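As a hedged illustration of what 'mechanistic aggregation' could look like, here is a minimal sketch that pools several evaluators' post-discussion probability estimates by averaging on the log-odds scale. The averaging rule and numbers are our own assumptions for illustration, not a description of the repliCATS protocol or a committed Unjournal method.

```python
# Illustrative pooling of probability estimates on the log-odds scale.
import math

def aggregate_probabilities(probs):
    """Average several experts' probabilities in log-odds space."""
    logits = [math.log(p / (1 - p)) for p in probs]
    mean_logit = sum(logits) / len(logits)
    return 1 / (1 + math.exp(-mean_logit))

# Post-discussion estimates from three evaluators for a target claim:
print(aggregate_probabilities([0.55, 0.70, 0.62]))  # pooled estimate ~0.63
```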
We will contact both the research authors (as per our standard process) and the target organizations for their responses to the evaluations, and for follow-up questions. We'll foster a productive discussion between them (while preserving anonymity as requested, and being careful not to overtax people's time and generosity).
We will ask the evaluation managers to write a report as a summary of the research investigated.
These reports should synthesize “What do the research, evaluations, and responses say about the question/claim?” They should provide an overall metric relating to the truth value of the target question (or similar for the parameter of interest). If and when we integrate prediction markets, they should decisively resolve the market claim.
Next, we will share these synthesis reports with authors and organizations for feedback.
We'll put up each evaluation on our Unjournal.pubpub.org page, bringing them into academic search tools, databases, bibliometrics, etc. We'll also curate them, linking them to the relevant target question and to the synthesis report.
We will produce, share, and promote further summaries of these packages. This could include forum and blog posts summarizing the results and insights, as well as interactive and visually appealing web pages. We might also produce less technical content, perhaps submitting work to outlets like Asterisk, Vox, or worksinprogress.co.
At least initially, we’re planning to ask for questions that could be definitively answered and/or measured quantitatively, and we will help organizations and other suggesters refine their questions to make this the case. These should approximately resemble questions that could be posted on forecasting platforms such as Manifold Markets or Metaculus. These should also somewhat resemble the 'claim identification' we currently request from evaluators.
We give detailed guidance with examples below:
Why do we want these pivotal questions to be 'operationalizable'?
We’re still refining this idea, and looking for your suggestions about what is unclear, what could go wrong, what might make this work better, what has been tried before, and where the biggest wins are likely to be. We’d appreciate your feedback! (Feel free to email contact@unjournal.org to make suggestions or arrange a discussion.)
If you work for an impact-focused research organization and you are interested in participating in our pilot, please reach out to us at contact@unjournal.org to flag your interest and/or complete this form. We would like to see:
A brief description of what your organization does (your ‘about us’ page is fine)
A specific, operationalized, high-value claim or research question you would like to be evaluated, that is within our scope (~quantitative social science, economics, policy, and impact measurement)
A brief explanation of why this question is particularly high value for your organization or your work, and how you have tried to answer it
If possible, a link to at least one research paper that relates to this question
Optionally, your current beliefs about this question (your ‘priors’)
Please also let us know how you would like to engage with us on refining this question and addressing it. Do you want to follow up with a 1-1 meeting? How much time are you willing to put in? Who, if anyone, should we reach out to at your organization?
Remember that we plan to make all of this analysis and evaluation public.
If you don’t represent an organization, we still welcome your suggestions, and will try to give feedback.
Please remember that we currently focus on quantitative ~social sciences fields, including economics, policy, and impact modeling (see here for more detail on our coverage). Questions surrounding (for example) technical AI safety, microbiology, or measuring animal sentience are less likely to be in our domain.
If you want to talk about this first, or if you have any questions, please send an email or schedule a meeting with David Reinstein, our co-founder and director.
Passed on to LTFF and funding was awarded
Start date = ~21 February 2022
The "Unjournal" will organize and fund 'public journal-independent evaluation’ of EA-relevant/adjacent research, encouraging this research by making it easier for academics and EA-organization researchers to get feedback and credible ratings.
Peer review is great, but academic publication processes are wasteful, slow, rent-extracting, and they discourage innovation. From my earlier write-up:
Academic publishers extract rents and discourage progress. But there is a coordination problem in ‘escaping’ this. Funders like Open Philanthropy and EA-affiliated researchers are not stuck, we can facilitate an exit.
The traditional binary ‘publish or reject’ system wastes resources (wasted effort and gamesmanship) and adds unnecessary risk. I propose an alternative, the “Evaluated Project Repo”: a system of credible evaluations, ratings, and published reviews (linked to an open research archive/curation). This will also enable more readable, reliable, and replicable research formats, such as dynamic documents; and allow research projects to continue to improve without “paper bloat”. (I also propose some ‘escape bridges’ from the current system.)
Global priorities and EA research organizations are looking for ‘feedback and quality control’, dissemination, and external credibility. We would gain substantial benefits from supporting, and working with the Evaluated Project Repo (or with related peer-evaluation systems), rather than (only) submitting our work to traditional journals. We should also put some direct value on results of open science and open access, and the strong impact we may have in supporting this.
I am asking for funding to help replace this system, with EA 'taking the lead'. My goal is permanent and openly-hosted research projects, and efficient journal-independent peer review, evaluation, and communication. (I have been discussing and presenting this idea publicly for roughly one year, and gained a great deal of feedback. I return to this in the next section).
I propose the following 12-month Proof of Concept: Proposal for EA-aligned research 'unjournal' collaboration mechanism
Build a ‘founding committee’ of 5-8 experienced and enthusiastic EA-aligned/adjacent researchers at EA orgs, research academics, and practitioners (e.g., draw from speakers at recent EA Global meetings).
Update 1 Aug 2022, mainly DONE, todo: consult EAG speakers
I will publicly share my procedure for choosing this group (in the long run we will aim at a transparent and impartial process for choosing 'editors and managers', as well as at decentralized forms of evaluation and filtering).
2. Host a meeting (and shared collaboration space/document), to come to a consensus/set of principles on
A cluster of EA-relevant research areas we want to start with
A simple outreach strategy
How we determine which work is 'EA-interesting’
How we will choose ‘reviewers’ and avoid conflicts-of-interest
How we will evaluate, rate, rank, and give feedback on work
The platforms we will work with
How to promote and communicate the research work (to academics, policymakers, and the EA community)
Update 1 Aug 2022: 2 meetings so far; agreed on going-forward policies for most of the above
3. Post and present our consensus (on various fora especially in the EA, Open Science, and relevant academic communities, as well as pro-active interviews with key players). Solicit feedback. Have a brief ‘followup period’ (1 week) to consider adjusting the above consensus plan in light of the feedback.
Update 1 Aug 2022: Done somewhat; waiting to have 2+ papers assessed before we engage more
4. Set up the basic platforms, links
Update 1 Aug 2022: Going with Kotahi and Sciety as a start; partially setup
5. Reach out to researchers in relevant areas and organizations and ask them to 'submit' their work for 'feedback and potential positive evaluations and recognition', and for a chance at a prize.
The unjournal will *not be an exclusive outlet.* Researchers are free to also submit the same work to 'traditional journals' at any point.
Their work must be publicly hosted, with a DOI. Ideally the 'whole project' is maintained and updated, with all materials, in a single location. We can help enable them to host their work and enable DOI's through (e.g.) Zenodo; even hosted 'dynamic documents' can be DOI'd.
Update 1 Aug 2022: Did a 'bounty' and some searching of our own; plan a 'big public call' after pilot evaluations of 2+ papers
Researchers are encouraged to write and present work in 'reasoning transparent' (as well as 'open science' transparent) ways. They are encouraged to make connections with core EA ideas and frameworks, but without being too heavy-handed. Essentially, we are asking them to connect their research to 'the present and future welfare of humanity/sentient beings'.
Reviews will, by default, be made public and connected with the paper. However, our committee will discuss I. whether/when authors are allowed to withdraw/hide their work, and II. when reviews will be ‘signed’ vs anonymous. In my conversations with researchers, some have been reluctant to ‘put themselves out there for public criticism’, while others seem more OK with this. We aim to have roughly 25 research papers/projects reviewed/evaluated and 'communicated' (to EA audiences) in the first year.
Update July 2022: scaled back to 15 papers
My suggestions on the above, as a starting point...
Given my own background, I would lean towards ‘empirical social science’ (including Economics) and impact evaluation and measurement (especially for ‘effective charitable interventions’)
Administration should be light-touch, to also be attractive to aligned academics
We should build "editorial-board-like" teams with subject/area expertise
We should pay reviewers for their work (I propose $250 for 5 hours of quality reviewing work)
Create a set of rules for 'submission and management', 'which projects enter the review system' (relevance, minimal quality, stakeholders, any red lines or 'musts'), how projects are to be submitted (see above, but let's be flexible), how reviewers are to be assigned and compensated (or 'given equivalent credit')
Rules for reviews/assessments
Reviews to be done on the chosen open platform (likely Prereview) unless otherwise infeasible
Share, advertise, promote the reviewed work
Establish links to all open-access bibliometric initiatives to the extent feasible
Each research paper/project should be introduced in at least one EA Forum post
Laying these out: I have responses to some of these; others will require further consideration.
Will researchers find it useful to submit/share their work? From my experience (i) as an academic economist and (ii) working at Rethink Priorities, and my conversations with peers, I think people would find this very useful. I would have (and still would).
i. FEEDBACK IS GOLD: It is very difficult to get anyone to actually read your paper, and to get useful feedback on your work. The incentive is to publish, not to read; papers are dense and require specific knowledge; people may be reluctant to criticize peers; and economists tend to be free-riders. It is hard to engage seminar audiences on the more detailed aspects of the work, and then one gets feedback on the ‘presentation’, not the ‘paper’. We often use ‘submission to a journal’ as a way to get feedback, but this is slow, not the intended use of the journal process (I’ve been told), and often results in less-useful feedback. (A common perception is that the referee ‘decides what decision to make and then fleshes out a report to justify it’.)
ii. ACADEMICS NEED SOURCES OF TIMELY VALIDATION: The publication process is extremely slow and complicated in Economics (and other fields, in my experience), requiring years of submissions and responses to multiple journals. This imposes a lot of risk for an academic’s career, particularly pre-tenure. Having an additional credible source validating the strength of one’s work could help reduce this risk. If we do this right, I think hiring and tenure committees would consider it as an important source of quality information.
iii. EA ORGS/FUNDERS need both, but the traditional journal process is costly in time and hassle. I think researchers and research managers at RP would be very happy to get feedback through this, as well as an assessment of the quality of their work, and suggestions for alternative methods and approaches. We would also benefit from external signals of the quality of our work, in justifying this to funders such as Open Philanthropy. (OP themselves would value this greatly, I believe. They are developing their own systems for assessing the quality of their funded work, but I expect they would prefer an external source.) However, it is costly for us at RP to submit to academic journals: the process is slow, bureaucratic, and noisy, and traditional journals will typically not evaluate work with EA priorities and frameworks in mind. (Note that I suggest the unjournal make these priorities a factor while also assessing the work’s rigor in ways that address justifiable concerns in academic disciplines.)
I assume that similar concerns apply to other EA research organizations.
iv. OPEN SCIENCE AND DYNAMIC FORMATS
IMO the best and most transparent way to present data-driven work (as well as much quantitative work) is in a dynamic document, where narrative, code, and results are presented in concert. Readers can ‘unfold for further details’. The precise reasoning, data, and generation of each result can be traced. These can also be updated and improved with time. Many researchers, particularly those involved in Open Science, find this the most attractive way to work and present their work. However, ‘frozen pdf prison’ and ‘use our bureaucratic system’ approaches make this very difficult to use in traditional journals. As the ‘unjournal’ does not host papers, but merely assesses work with DOIs (which can be, e.g., a hosted web page, as frozen at a particular point in time of review), we can facilitate this.
Will researchers find it ‘safe’ to share their work?
A large group of economists and academics tend to be conservative, risk-averse, and leader-following. But there are important exceptions, and also substantial groups that seek to be particularly innovative and iconoclastic.
The key concerns we will need to address (at least for some researchers): i. Will my work be ‘trashed publicly in a way that hurts my reputation’? I think this is more of a concern for early-career researchers; more experienced researchers will have a thicker skin and realize that it’s common knowledge that some people disagree with their approaches. ii. Will this tag me as ‘weird or non-academic’? This might be addressed by our making connections to academic bodies and established researchers.
How to get quality reviews and avoid slacking/free-riding by reviewers? Ideas:
compensation and rewarding quality as an incentive,
recruiting reviewers who seem to have intrinsic motivations,
publishing some ‘signed’ reviews (but there are tradeoffs here as we want to avoid flattery)
longer run: an integrated system of ‘rating the reviews’, a la StackExchange (I know there are some innovations in process here we’d love to link with)
QUANTIFY and CALIBRATE
We will ask referees to give a set of quantitative ratings in addition to their detailed feedback and discussion. These should be stated in ways that are explicitly relative to other work they have seen, both within the Unjournal and in general. Referees might be encouraged to ‘calibrate’: first being given a set of (previously traditionally-published) papers to rank and rate. They should later be reminded of the distribution of the evaluations they have given.
Within our system, evaluations themselves could be stated ‘relative to the other evaluations given by the same referee.’
BENCHMARK: We will also encourage or require referees to provide a predicted/equivalent “traditional publication outcome”, and possibly incentivize these predictions. (And we could consider running public prediction markets on this in the longer run, as has been done in other contexts.) This should be systematized. It could be stated as “this project is of sufficient quality that it has a 25% probability of being published in a journal of the rated quality of Nature, and a 50% probability of being published in a journal such as the Journal of Public Economics or better … within the next 3 years.” (We can also elicit statements about the impact factor, etc.)
I expect most/many academics who submit their work will also submit it to traditional journals, at least in the first year or so of this project (though ultimately we hope this 'unjournal' system of journal-independent evaluation provides a signal of quality that will supersede The Musty Old Journal). This will thus provide us a way to validate the above predictions, as well as to independently establish a connection between our ratings and the ‘traditional’ outcomes.
PRIZE as a powerful signal/scarce commodity: The “prize for best submissions” (perhaps a graded monetary prize for the top 5 submissions in the first year) will provide a commitment device and a credible signal, enhancing the attractiveness and prestige of this.
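As a sketch of how such benchmark predictions could later be scored once publication outcomes are observed, a simple Brier score (lower is better) is one option; the events and numbers below are hypothetical, and this is not a committed part of the proposal.

```python
# Hypothetical scoring of predicted "traditional publication outcomes"
# against what actually happened, using the Brier score (lower = better).

def brier(predictions, outcomes):
    """predictions: stated probabilities; outcomes: 1 if the event occurred, else 0."""
    return sum((p - o) ** 2 for p, o in zip(predictions, outcomes)) / len(outcomes)

# e.g., "25% chance of a Nature-tier journal; 50% chance of J. Public Economics
# or better within 3 years", scored after observing the outcomes (0 and 1):
print(brier([0.25, 0.50], [0, 1]))  # (0.0625 + 0.25) / 2 = 0.15625
```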
We may try to harness and encourage additional tools for quality assessment, considering cross-links to prediction markets/Metaculus, the coin-based 'ResearchHub', etc.
Will the evaluations be valued by gatekeepers (universities, grantmakers, etc.) and policy-makers? This will ultimately depend on the credibility factors mentioned above. I expect they will have value to EA and open-science-oriented grantmakers fairly soon, especially if the publicly-posted reviews are of a high apparent quality.
I expect academia to take longer to come on board. In the medium run they are likely to value it as ‘a factor in career decisions’ (but not as much as a traditional journal publication); particularly if our Unjournal finds participation and partnership with credible established organizations and prominent researchers.
I am optimistic because of my impression that non-traditional-journal outcomes (arXiv and impact factors, conference papers, cross-journal outlets, distill.pub) are becoming the source of value in several important disciplines.
How will we choose referees? How will we avoid conflicts of interest (and the perception of this)?
This is an important issue. I believe there are ‘pretty good’ established protocols for this. I’d like to build specific, prescribed rules for doing this, and make them transparent. We may be able to leverage tools, e.g., those built on GPT-3 such as elicit.org.
COI: We should partition the space of potential researchers and reviewers, and/or establish ‘distance measures’ (which may themselves be reported along with the review). There should be specified rules, e.g., ‘no one from the same organization or an organization that is partnering with the author’s organization’. Ideally, EA-org researchers’ work should be reviewed by academic researchers, and to some extent vice versa.
How to support EA ideas, frameworks, and priorities while maintaining (actual and perceived) objectivity and academic rigor
(Needs discussion)
Why hasn’t this been done before? I believe it involves a collective action problem, as well as a coordination/lock-in problem that can be solved by bringing together the compatible interests of two groups. Academic researchers have expertise and credibility, but they are locked into traditional and inefficient systems. EA organizations/researchers have a direct interest in feedback and in fostering this research, have some funding, and are not locked into traditional systems.
Yonatan Cale restating my claim:
Every econ researcher (interested in publishing) pays a price for having the system set up badly. The price isn't high enough for any one researcher to have an incentive to fix the system for themselves, but as a group they would be very happy if someone fixed this systemic problem (and they would, in theory, be willing to "pay" for it, because the price of "fixing the system" is far lower than the sum of the prices each of them pays individually).
‘Sustainability’: who will pay for these reviews in the longer run?
Once this catches on: Universities will pay to support this; they will save massively on journal subscriptions. Governments supporting Open Science will fund this. Authors/research orgs will pay a reasonable submission fee to partly or fully cover the cost of the reviews. EA-aligned research funders will support this.
But we need to show a proof-of-concept and build credibility. The ACX grant funds can help make this happen.
I have been an academic economist for 15-20 years, and I have been deeply involved in the research and publication process, with particular interests in open science and dynamic documents (PhD, UC Berkeley; Lecturer, University of Essex; Senior Lecturer, University of Exeter). My research has mainly been in Economics, but has also involved other disciplines (especially Psychology).
I’m a Senior Economist at Rethink Priorities, where I’ve worked for the past year, engaging with a range of researchers and practitioners at RP and other EA groups.
My research has involved EA-relevant themes since the latter part of my PhD. I’ve been actively involved with the EA community since about 2016, when I received a series of ESRC ‘impact grants’ for the innovationsinfundraising.org and giveifyouwin.org projects, working with George Howlett and the CEA.
I have had long 1-1 conversations on this idea with a range of knowledgeable and relevant EAs, academics, open-science practitioners, and technical/software developers, including:
Participants in the GPI seminar luncheon
Daniela Saderi of PreReview
Cecilia Tilli, Foundation to Prevent Antibiotics Resistance and EA research advocate
Peter Slattery, Behaviourworks Australia
Gavin Taylor and Paola Masuzzo of IGDORE (biologists and advocates of open science)
William Sleegers (Psychologist and Data Scientist, Rethink Priorities)
Hamish Huggard (Data science, ‘literature maps’)
Feel free to give a simple number, a range, a more complicated answer, or a list of what could be done with how much.
Over a roughly one-year ‘pilot’ period, I propose the following. Note that most of the costs will not be incurred in the event of the ‘failure modes’ I consider. E.g., if we can’t find qualified and relevant reviewers and authors, these payments will not be made
$15k: Pay reviewers for their time: 50 reviews of 25 papers (2 each) at 250 USD per review (I presume this is 4-5 hours of concentrated work) --> 12,500 USD
$5k to find ways to 'buy off' 100 hours of my time (2-3 hours per week over some 50 weeks) to focus on managing the project, setting up rules/interfaces, choosing projects to review, assigning reviewers, etc. (I will do this either through paying my employer directly or by 'buying time' through delivery meals, Uber rides, etc.)
$5k to 'buy off' 100 hours of time from other 'co-editors' to help, and for a board to meet/review the initial operating principles
$5k: to hire about 100 hours of technical support for 1 year to help authors host and format their work, to tailor the 'experimental' space that PreReview has promised us, and potentially to work with the EA Forum and other interfaces
$2.5k: Hire clerical/writing/copy editing support as needed
$7.5k: rewards for ‘authors of the best papers/projects’ (e.g., 5 * 1000 USD … perhaps with a range of prizes) … and/or additional incentives for ‘best reviews’ (e.g., 5 * 250 USD)
Most of the measures of 'small success' are scalable; the funds I am asking for (referee payments, some of my time, etc.) will not be spent, or will be returned to you, if we do not receive quality submissions and commitments to review and assist in the management.
My own forecasts (I've done some calibration training, but these are somewhat off-the-cuff): 80% that we will find relevant authors and referees, and that this will be a useful resource for improving and assessing the credibility of EA-relevant research
60% that we will get the academic world substantially involved in such a way that it becomes reasonably well known, and quality academic researchers are asking to ‘submit their work’ to this without our soliciting their work.
50% that this becomes among the top/major ways that EA-aligned research organizations seek feedback on their work (and the work that they fund — see OpenPhil), and a partial alternative to academic publication
10-25% that this becomes a substantial alternative (or is at the core of such a sea-change) to traditional publication in important academic fields and sub-fields within the next 1-3 years. (This estimate is low in part because I am fairly confident a system along these lines will replace the traditional journal, but less confident that it will be so soon, and still less confident my particular work on this will be at the center of it.)
Yes
Yes
7 Feb 2023: We have an organized founding/management committee, as well as an advisory board (see ). We are focusing on pushing research through the evaluation pipeline, communicating this output, and making it useful. We have a working division of labor, e.g., among "managing editors," for specific papers. We are likely to expand our team after our pilot, conditional on further funding.
The creation of an action plan can be seen in the Gdoc discussion
Assemble a list of the most relevant and respected people, using more or less objective criteria and justification.
Ask to join founding committee.
Ask to join list of supporters.
Add people who have made past contributions.
28 May 2022: The above has mostly been done, at least in terms of people attending the first meeting. We probably need a more systematic approach to getting the list of supporters.
Further posts on social media, academic websites and message boards, etc.
UNICEF strategic plan: Not easy to link to research; they have a large number of priorities, goals, and principles; see infographic:
Note: I am strongly leaning towards PreReview as the main platform, which has indicated willingness to give us a flexible 'experimental space'.
My CV should make this clear.
I’ve been considering and discussing this proposal for many years with colleagues in Economics and other fields, and presenting it publicly and soliciting feedback over the past year, mainly through social media, EA and open-science Slack groups, and conferences (presenting this at a GPI lunch and at the COS/Metascience conference, as well as in an EA Forum post and the post mentioned above).
Cooper Smout, head of , which I’d like to ally with (through their pledges, and through an open access journal Cooper is putting together, which the Unjournal could feed into, for researchers needing a ‘journal with an impact factor’)
Paolo Crosetto (Experimental Economics, French National Research Institute for Agriculture, Food and Environment)
Sergey Frolov (Physicist), Prof. J.-S. Caux, Physicist and head of
Alex Barnes, Business Systems Analyst,
Nathan Young
Edo Arad (mathematician and EA research advocate)
Yonatan Cale, who helped me put this proposal together by asking a range of challenging questions and offering his feedback.
My online CV has links to almost everything else. I am @givingtools on Twitter and david_reinstein on the EA Forum (see my post on this). I also read and discuss this on my podcast.
We have an action plan (mainly for EA organizations) and a workspace in the GitBook; this also nests several essays discussing the idea, including the collaborative document (with many comments and suggestions).
7 Feb 2023 We have considered and put together:
See: Guidelines for evaluators
... including descriptive and quantitative (rating and prediction elements). With feedback from evaluators and others, we are continuing to build and improve these guidelines.
The Peer Communities In (PCI) organization and the Peer Community Journal, a diamond open access journal, have considerable overlap with The Unjournal model. They started out (?) as a "recommendation system" but now have established the PCI Journal to "publish unconditionally, free of charge, exclusively, immediately (as soon as possible) [and on an opt-in basis] . . . any article recommended by a PCI."
Especially relevant to The Unjournal are these aspects of their program:
The standard "recommender" model has an approved recommender volunteer to play the role of managing editor for a paper and make the decisions; authors are consulted to recommend reviewers.
This might bring up concerns about conflict of interest, e.g., I become "recommender" for a friend or for the stuff that supports my agenda.
There are 17 "Peer Communities in" (i.e., research areas)—mainly in life sciences (some seem to have just started; there are no public preprints up).
Authors must
They (opt-in) "publish" the article rather than being an "overlay journal," to improve their indexing possibilities (but this is opt-in; you can also submit elsewhere and there are "PCI-friendly" journals).
They depend on volunteer evaluations.
Their evaluation is 0/1 and descriptive rather than quantitative.
Peter Slattery: on EA Forum, fork moved to .
Other comments, especially post-grant, in this Gdoc discussion space (embedded below) will be integrated back.
Content from this grant application is linked here and embedded below, verbatim.
(See sections below)
As part of The Unjournal’s general approach, we keep track of and maintain contact with other initiatives in open science, open access, robustness/transparency, and encouraging impactful research. We want to be coordinated. We want to partner with other initiatives and tools where there is overlap, and clearly explain where (and why) we differentiate from other efforts.
The Airtable view below gives a preliminary breakdown of some initiatives that are the most similar to—or partially overlap—ours, and tries to catalog the similarities and differences to give a picture of who is doing what and in what fields.
See especially eLife and Peer Communities In
Our brief application to a regranting program (first round) is linked here and embedded below, verbatim.
eLife is a fairly well-respected (?) journal in the life sciences. Their New Model (originally called "publish, review, curate") was big news. Their three-month update seems fairly stable and successful. Here's their FAQ. Their model is similar to ours in many ways, but it's mainly or exclusively for the life sciences. They use Sciety for curation.
They don't have explicit quantitative metrics, but an "eLife assessment . . . is written with the help of a common vocabulary to ensure consistency," which may proxy this.
Evaluators (reviewers) are not compensated. ("We offer remuneration to our editors but not to our peer reviewers.")
Reviewers' names are not displayed. ("All public comments posted alongside a preprint will be signed by eLife and not by individuals, putting the onus on eLife as an organisation and community to ensure that the outputs of our peer-review process are of the highest standard.")
They charge a $2,000 APC. Presumably, this is true for all "reviewed preprints" on the eLife website, whether or not you request it become a "version of record."
The evaluation is non-exclusive unless you request that the reviewed preprint be a "'Version of Record' that will be sent to indexers like PubMed and can be listed in funding, job applications and more."
Some share of the work they cover consists of registered reports.
Notable:
PREreview.org
Mercatus commissions external reviews
NBER, CEPR, etc.: very loosely filtered within their member networks
World Bank, Federal Reserve, etc. Internal review?
Open Philanthropy?
Link: "Asterisk is a quarterly magazine of clear writing and clear thinking about things that matter."
Asterisk is a new quarterly journal of ideas from in and around Effective Altruism. Our goal is to provide clear, engaging, and deeply researched writing about complicated questions. This might look like a superforecaster giving a detailed explanation of the reasoning they use to make a prediction, a researcher discussing a problem in their work, or deep-dive into something the author noticed didn’t quite make sense. While everything we publish should be useful (or at least interesting) to committed EAs, our audience is the wider penumbra of people who care about improving the world but aren't necessarily routine readers of, say, the EA forum.
Includes "Speculative pieces with 'epistemic signposts'"
Possible scope for collaboration or sharing
Follow up on crucial research: I will share non-sensitive parts of the Airtable
Sharing a database/CRM of
Experts to vouch
Interested academics and good writers
Shared thinking on "what is relevant" and "how to classify things"
The Unjournal could "feed in" to Asterisk: an academic article, then a writeup; they have funding and can pay authors ~$4,000 for 4,000 words; they can't guarantee that academic work will feed into Asterisk
Passing back and forth relevant work and directions to go in
Some shared cross-promotion (e.g., at universities and in policy circles, where both Unjournal and Asterisk are relevant)
This initiative and the (EA/global-priorities) Unjournal will interact with the EA Forum and build on initiatives emerging there.
Some of these links come from a conversation with Aaron Gertler
Here's where to .
Here's an example of a that led us to develop a new feature.
Note: Reinstein and Hamish Huggard have worked on tools to help transform R-markdown and bookdown files. Some work can be found on (but may need some explanation).
Jaime Sevilla has thoughts on creating a peer-review system for the Forum. (See embedded doc below, link .)
To create a quick and easy prototype to test, you fork the EA Forum and use that fork as a platform for the Unjournal project (maybe called something like "The Journal of Social Impact Improvement and Assessment").
People (ideally many from EA) would use the Forum-like interface to submit papers to this Unjournal.
These papers would look like EA Forum posts, but with an included OSF link to a PDF version. Any content (e.g., slides or video) could be embedded in the submission.
All submissions would be reviewed by a single admin (you?) for basic quality standards.
Most drafts would be accepted to The Unjournal.
Any accepted drafts would be publicly "peer reviewed." They would achieve peer-reviewed status when >x (3?) people from a predetermined or elected board of editors or experts had publicly or anonymously reviewed the paper by commenting publicly on the post. Reviews might also involve rating the draft on relevant criteria (INT?). Public comment/review/rating would also be possible.
Draft revisions would be optional but could be requested. These would simply be new posts with version X/v X appended to the title.
All good comments or posts to the journal would receive upvotes, etc., so authors, editors and commentators would gain recognition, status and "points" from participation. This is sufficient for generating participation in most forums and notably lacking in most academic settings.
Good papers submitted to the journal would be distinguished by being more widely read, engaged with, and praised than others. If viable, they would also win prizes. As an example, there might be a call for papers on solving issue x with a reward pool of grant/unconditional funding of up to $x for winning submissions. The top x papers submitted to The Unjournal in response to that call would get grant funding for further research.
A change in rewards/incentives (from "I had a paper accepted/cited" to "I won a prize") seems to have various benefits.
It still works for traditional academic metrics—grant money is arguably even more prized than citations and publication in many settings
It works for non-academics who don't care about citations or prestigious journal publications.
As a metric, "funds received" would probably better track researchers' actual impact than their citations and acceptance in a top journal. People won't pay for more research that they don't value, but they may cite work or accept it to a journal for other reasons.
Academics could of course still cite the DOIs and get citations tracked this way.
Reviewers could be paid per-review by research commissioners.
Here is a quick example of how it could work for the first run: Open Philanthropy calls for research on something they want to know about (e.g., interventions to reduce wild animal suffering). They commit to provide up to $100,000 in research funding for good submissions and $10,000 for review support. Ten relevant experts apply and are elected to the expert editorial boards to review submissions. They will receive 300 USD per review and are expected to review at least x papers. People submit papers; these are reviewed; OP awards follow-up prizes to the winning papers. The cycle repeats with different funders, and so on.
I suppose I like the above because it seems pretty easy and actionable to do as a test run for something to refine and scale. I estimate that I could probably do it myself if I had 6–12 months to focus on it. However, I imagine that I am missing a few key considerations, as I am usually over-optimistic! Feel free to point those out and offer feedback.
Sciety is essentially a hub for curating the sort of evaluations that Unjournal aims to do. Users can access research works that have been publicly evaluated.
There are several initiatives of public—and sometimes journal-independent—peer evaluation, including around two dozen , such as the , , and . However, these are nearly exclusively in biology and related areas.
Sciety’s mission is to grow a network of researchers who evaluate, curate and consume scientific content in the open. In doing so, we will support several long-term changes to scientific communication:
Change peer review to better recognize its scholarly contribution
Shift the publishing decision from editors to authors
Move evaluation and curation activity from before to after publication
Our community-driven technology effort is producing an application that can support the changes in behaviour required to secure this future.
Updated 11 Jan 2023
The official administrators are David Reinstein (working closely with the Operations Lead) and Gavin Taylor; both have control and oversight of the budget.
Major decisions are made by majority vote by the Founding Committee (aka the ‘Management Committee’).
Members: #management-committee
Advisory board members are kept informed and consulted on major decisions, and relied on for particular expertise.
Advisory Board Members: #advisory-board
9 Apr 2024: This section outlines our management structure and polices. More detailed content is being moved to our private (Coda.io) knowledge base.
Tech, tools and resources has been moved to its own section: Tech, tools and resources
The Unjournal is now an independent 501(c)(3) organization. We have new (and hopefully simpler and easier) systems for submitting expenses.
Evaluators: to claim your payment for evaluation work, please complete this very brief form.
You will receive your payment via a Wise transfer (they may ask you for your bank information if you don't have an account with them).
We aim to process all payments within one week.
Confidentiality: Please note that even though you are asked to provide your name and email, your identity will only be visible to The Unjournal administrators for the purposes of making this payment. The form asks you for the title of the paper you are evaluating. If you are uncomfortable doing this, please let us know and we can find another approach to this.
This information should be moved to a different section
Did the people who suggested the paper suggest any evaluators?
We prioritize our "evaluator pool" (people who signed up)
Expertise in the aspects of the work that need evaluation
Interest in the topic/subject
Conflicts of interest (especially co-authorships)
Secondary concerns: Likely alignment and engagement with Unjournal priorities. Good writing skills. Time and motivation to write the review promptly and thoroughly.
Update Feb. 2024: We are moving the discussion of the details of this process to an internal Coda link (here, accessible by team members only). We will present an overview in broad strokes below.
See also Mapping evaluation workflow for an overview and flowchart of our full process (including the evaluation manager role).
Compensation: As of Dec. 2023, evaluation managers are compensated a minimum of $300 per project, and up to $500 for detailed work. Further work on 'curating' the evaluation, engaging further with authors and evaluators, writing detailed evaluation summary content, etc., can earn up to an additional $200.
If you are the evaluation manager please follow the process described in our private Coda space here
Engage with our previous discussion of the papers; why we prioritized this work, what sort of evaluators would be appropriate, what to ask them to do.
Inform and engage with the paper's authors, asking them for updates and requests for feedback. The process varies depending on whether the work is part of our "Direct evaluation" track or whether we require authors' permission.
Find potential evaluators with relevant expertise, contact them. We generally seek two evaluators per paper.
Suggest research-specific issues for evaluators to consider. Guide evaluators on our process.
Read the evaluations as they come in, suggest additions or clarifications if necessary.
Rate the evaluations for awards and bonus incentives.
Share the evaluations with the authors, requesting their response.
Optionally, provide a brief "evaluation manager's report" (synthesis, discussion, implications, process) to accompany the evaluation package.
See also:
See also: Protecting anonymity
We give the authors two weeks to respond before publishing the evaluation package (and they can always respond afterwards).
Once the evaluations are up on PubPub, reach out to the evaluators again with the link, in case they want to view their evaluation and the others. The evaluators may be allowed to revise their evaluation, e.g., if the authors find an oversight in the evaluation. (We are working on a policy for this.)
At the moment (Nov. 2023) we don't have any explicit 'revise and resubmit' procedure, as part of the process. Authors are encouraged to share changes they plan to make, and a (perma)-link to where their revisions can be found. They are also welcome to independently (re)-submit an updated version of their work for a later Unjournal evaluation.
Mapping collaborator networks through Research Rabbit
We use a website called Research Rabbit (RR).
Our RR database contains papers we are considering evaluating. To check potential COI, we use the following steps (a programmatic sketch of a similar check appears after this list):
After choosing a paper, we select the button "these authors." This presents all the authors for that paper.
After this, we choose "select all," and click "collaborators." This presents all the people that have collaborated on papers with the authors.
Finally, by using the "filter" function, we can determine whether the potential evaluator has ever collaborated with an author from the paper.
If a potential evaluator has no COI, we will add them to our list of possible evaluators for this paper.
Note: Coauthorship is not a disqualifier for a potential evaluator; however, we think it should be avoided where possible. If it cannot be avoided, we will note it publicly.
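The same coauthorship check could also be done programmatically. Here is a minimal sketch, assuming we have already exported (hypothetical) coauthor lists from Research Rabbit or another bibliometric source; it is an illustration, not our current tooling.

```python
# Hypothetical coauthor lists; in practice these could be exported from
# Research Rabbit or another bibliometric database.
coauthors = {
    "Author One": {"B. Smith", "C. Jones", "D. Lee"},
    "Author Two": {"E. Khan", "C. Jones"},
}

def coi_overlap(paper_authors: list[str], candidate_evaluator: str) -> set[str]:
    """Return the paper authors the candidate has coauthored with
    (an empty set means no coauthorship COI was found)."""
    flagged = set()
    for author in paper_authors:
        if candidate_evaluator in coauthors.get(author, set()):
            flagged.add(author)
    return flagged

overlap = coi_overlap(["Author One", "Author Two"], "C. Jones")
if overlap:
    print(f"Potential COI: has coauthored with {sorted(overlap)}; note publicly if unavoidable.")
else:
    print("No coauthorship COI found; add to the evaluator shortlist.")
```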
To aim for consistency of style in all UJ documentation, a short style guide for the GitBook has been posted here. Feel free to suggest changes or additions using the comments. Note this document, like so many, is under construction and likely to change without notice. The plan is to make use of it for any outward-facing communications.
Organization/discussion following a thread in...
Some conversation highlights:
Kris Gulati: Recently I've been talking to more global priorities-aligned researchers to get to know what people are working on. I noticed they're somewhat scattered around (Stanford, PSE, Chicago, Oxford etc.). Additionally, sometimes established academics don't always entirely grasp global priorities-focused work and so it can be tough to get feedback on ideas from supervisors or peers when it's pretty different to the more orthodox research many academics focus on. One way of remedying this is to have an informal seminar series where people interested in GP work present early stages ideas and can receive feedback on their ideas, etc.
David Mannheim: Yes, this seems very promising. And I think that it would be pretty easy to get a weekly seminar series together on this on Zoom.
Robin Hanson: Why limit it to PhD students? All researchers can gain from feedback, and can offer it.
Eva Vivalt: Sounds great. GPI also has seminars on global priorities research in philosophy and economics that might be of interest. . . . [followed by some notes of caution] I'm just worried about stretching the senior people too thin. I've been attending the econ ones remotely for a while, and only this semester did I finally feel like it was really mature; a senior person less would be a setback. I fully think there should be many groups even within econ, and at different institutions; that would be a healthy ecosystem.
Kris, responding to Dave Rhys-Bernard: If GPI's seminar series is meant to be private, then it's worth running something additional, given we can get a decent critical mass of attendance and some senior people are happy to attend.
DR: I think a focus on empirical economics, social science, and program evaluation would be most promising (and I could help with this). We could also incorporate "applications of economic theory and decision theory." Maybe I would lean away from philosophy and "fundamental theory," as GPI's seminar seems to concentrate on that.
Rethink Priorities would probably (my guess) be willing to attach our name to it and a decent number of our research staff would attend. I think GPI might be interested, hopefully Open Philanthropy and other organizations. Robin Hanson and other academics have expressed interest.
We could try to get an MWE/PoC online seminar running once per month, for example
Start with...
Presentations of strong, solid working papers and research projects from reputable authors
EA-adjacent and -aligned academics and academically connected EA-org researchers (at RP, Open Phil, GPI, etc.)
"Job-markety PhD students"
Make it desirable to present
Selective choices, "awards" with small stipends? Or "choose a donation"?
Guarantee of strong feedback and expertise
Communication/coverage of work
Encourage experts to attend and give concrete feedback
Open and saved chat window
Write up feedback; consider drawing future presenters from the attending crowd
DR: I would not publicize this until we get #the-featured-research-seminar-series off the ground . . . just do it informally, perhaps.
15-20 minute presentations
Provide or link writeup for follow-up comments and #collaborative-annotated-feedback
Such a seminar needs association with credible people and orgs to get participation.
Do we need any funding for small honorariums or some such?
Do we want to organize "writeups" and publicity? Should we gain support for this?
Economic theory and empirics are different groups . . . . I think a focus on empirical and applied work would have a critical mass.
See draft above
I think a really neat feature, as a companion to the seminar, could be that the author would (ideally) post a working paper or project website and everyone would leave collaborative annotation comments on it.
This kind of feedback could be golden for the author.
GPI lunchtime seminar (not public)
EA Global talks
RP has internal talks; hope to expand this
This page is mainly for The Unjournal management, advisory board and staff, but outside opinions are also valuable.
Unjournal team members:
Priority 'ballot issues' are given in our 'Survey form', linked to the Airtable (ask for link)
Key discussion questions are in the broad_issue_stuff view in the questions table, linking discussion Google docs
We are considering a second stream to evaluate non-traditional, less formal work, not written with academic standards in mind. This could include the strongest work published on the EA Forum, as well as a range of further applied research from EA/GP/LT linked organizations such as GPI, Rethink Priorities, Open Philanthropy, FLI, HLI, Faunalytics, etc., as well as EA-adjacent organizations and relevant government white papers. See comments here; see also Pete Slattery’s proposal here, which namechecks the Unjournal.
E.g., for
We further discuss the case for this stream and sketch and consider some potential policies for this HERE.
Internal discussion space: Unjournal Evaluator Guidelines & Metrics
DR: I suspect that signed reviews (cf blog posts) provide good feedback and evaluation. However, when it comes to rating (quantitative measures of a paper's value), my impression from existing initiatives and conversations is that people are reluctant to award anything less than 5/5 'full marks'.
Power dynamics: referees don't want to be 'punished', may want to flatter powerful authors
Connections and friendships may inhibit honesty
'Powerful referees signing critical reports' could hurt ECRs
Public reputation incentive for referees
(But note single-blind paid review has some private incentives.)
Fosters better public dialogue
Inhibits obviously unfair and impolite 'trashing'
Author and/or referee choose whether it should be single-blind or signed
Random trial: We can compare empirically (are signed reviews less informative?)
Use a mix (1 signed, 2 anonymous reviews) for each paper
We may revisit our "evaluators decide if they want to be anonymous" policy. Changes will, of course, never apply retroactively; we will carefully keep our promises. However, we may consider requesting that certain evaluators/evaluations specifically be anonymous, or that their names be published. A mix of anonymous and signed reviews might be ideal, leveraging some of the benefits of each.
We are also researching other frameworks, templates, and past practices; we hope to draw from validated, theoretically grounded projects such as RepliCATS.
See the 'IDEAS protocol' and Marcoci et al, 2022
#considering-for-future-enabling-minor-revisions
Should we wait until all commissioned evaluations are in, as well as authors' responses, and release these as a group, or should we sometimes release a subset of these if we anticipate a long delay in others? (If we did this, we would still stick by our guarantee to give authors two weeks to respond before release.)
We designed and disseminated a survey taken by over 1,400 economists in order to (i) understand their experiences with peer review and (ii) collect opinions about potential proposals to improve the system.
...
We reviewed the existing literature about peer review, drawing on sources from inside and outside of economics. ... We then built a (non-comprehensive) themed bibliography,
... we took the additional step of preparing a list of over 160 proposals.
Other peer-review models
Our current peer-review system relies on the feedback of a limited number of ad hoc referees, given after a full manuscript has been produced. We consider several changes that could be made to this model, including:
Post-publication peer review: Submissions could be published immediately and then subjected to peer review, or they could be subject to continued evaluation at the conclusion of the standard peer-review process.
Peer review of registered reports: Empirical papers could be conditionally accepted before the results are known, based on their research question and design. A limited number of journals have started to offer publication tracks for registered reports.
Crowdsourced peer review and prediction markets: Rather than relying on a small number of referees, the wisdom of crowds could be leveraged to provide assessments of a manuscript's merits.
Non-economists and non-academics as referees: Besides enlarging the size of the pool of referees who assess a paper, the diversity of the pool could be increased by seeking the opinion of researchers from other disciplines or non-academics, such as policy makers.
Collaborative peer review platforms: Communication between authors, reviewers, and editors could be made more interactive, with the implementation of new channels for real-time discussion. Collaborative platforms could also be set up to solicit feedback before journal submission occurs.
This page should explain or link clear and concise explanations of the key resources, tools, and processes relevant to members of The Unjournal team, and others involved.
19 Feb 2024: Much of the information below is out of date. We also plan to move most of this content to our internal (Coda) system
See also (and integrate): Jordan's
The main platforms for the management team are outlined below with links provided.
Please ask for group access, as well as access to private channels, especially "management-policies". Each channel should have a description and some links at the top.
Please ask for an invitation. Airtable is an interactive online relational database. Only our management team and selected others have access.
Each of the 'tables' in the Airtable is explained in the first ('readme') table...
11 Oct 2023 partial update:
Divided the Airtable into two Airtables: one focusing on research and evaluation, the other on everything else
We're moving task management to ClickUp (nearly set up, onboarding soon)
Much of the evaluation management process will be moved to PubPub (coming soon, we hope)
ClickUp may also be used for much of the internal knowledge base content, including much of the present "Management details" Gitbook section.
Management team: You don't need to edit the GitBook if you don't want to, but we're trying to use it as our main place to 'explain everything' to ourselves and others. We will try to link all content here. Note you can use 'search' and 'lens' to look for things.
Access to the PubPub is mainly only needed for doing 'full-service evaluation manager work'.
Please ask for access to this drive. This drive contains meeting notes, discussion, grant applications and tech details.
This is for submitting invoices for your work.
The main platforms needed for the advisory board are outlined below with links provided.
Short of full Airtable access, AB members may also be given specific access to 'survey links' and key views.
This is for submitting invoices for your work.
In addition to the management team platforms explained above, additional information for how to use the platforms specifically for managing evaluations is outlined below.
This should be used for payments for evaluators. Please see the link below for how to do this.
Airtable: Get to know its features; it's super-useful. E.g., 'views' provide different pictures of the same information, and 'Link' field types connect different tables by their primary keys, allowing information and calculations to flow back and forth.
Airtable table descriptions are available by hovering over the '(i)' symbol for each tab; many of the columns in each tab also have descriptions.
Additional Airtable security: we keep more sensitive content in this Airtable encrypted, or moved to a different table that only David Reinstein has access to.
Use discretion in sharing: advisory board members might be authors, evaluators, job candidates, or part of external organizations we may partner with.
Read: (Google Doc: internal access only) for a concise understanding of what you need to know and use most. Feel free to ask sidebar questions.
Watch: (Dropbox: internal access only)
See
Members of the advisory board can join our Slack (if they want). They can have access to private channels (subject to ) other than the 'management-policies' channel
Advisory board members can be given access to our Airtable, but subject to . Before adding an advisory board member to the Airtable, please move any content related to their own research, their evaluation work, their job application status, etc.
To use Airtable for evaluation management please see this . The section is titled "Managing evaluations of research". To find this in the google drive, it is under "forms_templates_tips_guidelines_for_management".
For details on our current PubPub process please see this . To find this in the google drive, it is under "hosting and tech".
Updated: see the Gdoc on 'Editorial Management Tech needs', embedded at the bottom (to do: integrate these discussions)
We are, or may be, eligible for some nonprofit discounts through OCF.
Hosting 'qualitative' evaluation content: A place to host the evaluations (as well as the authors’ responses and the editors' comments), allowing the public to read them in an attractive and convenient way (and perhaps respond to them). Ideally this will also make the quantitative ratings and predictions prominent and connected to the evaluations. This system needs to allow evaluations of any research that is publicly hosted and has a DOI.
DOIs, Bibliometrics, Google Scholar: We need these evaluations to be visible in "bibliometric systems". They need a DOI, and they need to show up in Google Scholar and other search tools. The references cited in the evaluations (including the original paper) should also appear in the bibliometric record. Right now, Crossref seems to be the leading system for this (a quick API check is sketched below this list).
Curation and organization: A place to bring together all of the evaluations we have done as the sort of center of our project, to explain it and get positive attention, as well as engage participation and readership.
'Editorial management': A tool to coordinate our management process (submissions, evaluations, et cetera)
Hosting and open-analysis of 'quantitative' ratings and meta-data: A place to organize the evaluation data, particularly the quantitative ratings and predictions, in ways that people can analyze and use.
Ideally also...
Ways to enable evaluators and others to do collaborative annotation on pre-prints
Integrations with other platforms including prediction markets and OSF
Platforms for people to engage in other ways, perhaps up- and downrating evaluations etc
We have a partially-built system combining Sciety, Kotahi, and Hypothes.is. However, other tools and bespoke work will be needed to achieve some of the goals stated above.
Ideally also, this would update automatically. We might also want to give some nice analysis and visualization for this (although we could do that in one of the other pages, simply bringing in the data from this place where it’s stored). An API and link to other data archives could be useful.
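Relatedly, for the 'DOIs, Bibliometrics, Google Scholar' requirement above: one quick way to check that an evaluation's DOI and its deposited references have entered the Crossref record is to query the public Crossref REST API. A minimal sketch (the DOI below is a placeholder, not a real Unjournal record):

```python
import requests

def crossref_record(doi: str) -> dict:
    """Fetch the public Crossref metadata for a DOI (raises if the DOI is not registered)."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=30)
    resp.raise_for_status()
    return resp.json()["message"]

# Placeholder DOI for illustration only.
record = crossref_record("10.1234/placeholder-evaluation-doi")
print(record.get("title"))
print(len(record.get("reference", [])), "references deposited")  # cited works, incl. the evaluated paper
```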
Thanks for your interest in evaluating research for The Unjournal!
The Unjournal is a nonprofit organization started in mid-2022. We commission experts to publicly evaluate and rate research. Read more about us .
Write an evaluation of a specific piece of research: essentially a standard, high-quality referee report.
Rate the research by filling in a structured form.
Answer a short questionnaire about your background and our processes.
See for further details and guidance.
Why use your valuable time writing an Unjournal evaluation? There are several reasons: helping high-impact research users, supporting open science and open access, and getting recognition and financial compensation.
The Unjournal's goal is to make impactful research more rigorous, and rigorous research more impactful, while supporting open access and open science. We encourage better research by making it easier for researchers to get feedback and credible ratings. We evaluate research in high-impact areas that make a difference to global welfare. Your evaluation will:
Help authors improve their research, by giving early, high-quality feedback.
Help improve science by providing open-access, prompt, structured, public evaluations of impactful research.
Inform funding bodies and meta-scientists as we build a database of research quality, strengths and weaknesses in different dimensions. Help research users learn what research to trust, when, and how.
Your evaluation will be made public and given a DOI. You have the option to be identified as the author of this evaluation or to remain anonymous, as you prefer.
$100 + $100 for first-time evaluators
You will also be eligible for monetary prizes for "most useful and informative evaluation," plus other bonuses. We currently (Feb. 2024) set aside an additional $150 per evaluation for incentives, bonuses, and prizes.
If you have been invited to be an evaluator and want to proceed, simply respond to the email invitation that we have sent you. You will then be sent a link to our evaluation form.
Team members: see also for a quick guide to 'the things you need to access and use'
: Overview of the tools and solutions the Unjournal uses, is building, and needs
: Where we host the evaluation output and submission/editorial management processes
(See also )
2023 Update: We have moved mainly to PubPub, at least for hosting evaluation output (see )
For more on our scientific mission, see .
for providing a complete evaluation and feedback ($100-$300 base + $100 'promptness bonus') in line with our guidelines.
Note, Aug. 2024: we're adjusting the base compensation to reward strong work and experience.
$300 + $100 for return Unjournal evaluators and those with previous strong public review experience. We will be integrating other incentives and prizes into this, and are committed to in average compensation per evaluation, including prizes.
See also
To sign up for our evaluator pool, see
To learn more about our evaluation process, see our guidelines for evaluators. If you are doing an evaluation, we highly recommend you read these guidelines carefully.
12 Feb 2024: We are moving to a hosted form/interface in PubPub. That form is still somewhat a work in progress and may need some further guidance; we try to provide this below, but please contact us with any questions. If you prefer, you can also submit your response in a Google Doc and share it back with us. Click to make a new copy of that directly.
Previous/less emphasized: Sciety Group: curating evaluations and papers
Evaluations and author responses are given DOIs and enter the bibliometric record
Future consideration:
"publication tier" of authors' responses as a workaround to encode aggregated evaluation
Hypothes.is annotation of hosted and linked papers and projects (aiming to integrate: see: hypothes.is for collab. annotation)
Sharing evaluation data on public Github repo (see data reporting here)
We aim to elicit expert judgment from Unjournal evaluators efficiently and precisely. We aim to communicate this quantitative information concisely and usefully, in ways that will inform policymakers, philanthropists, and future researchers.
In the short run (in our pilot phase), we will attempt to present simple but reasonable aggregations, such as simple averages of midpoints and confidence-interval bounds. However, going forward, we are consulting and incorporating the burgeoning academic literature on "aggregating expert opinion." (See, e.g., Hemming et al, 2017; Hanea et al, 2021; McAndrew et al, 2020; Marcoci et al, 2022.)
We are working on this in our public data presentation (Quarto notebook) here.
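For concreteness, here is a minimal sketch of the 'simple averages of midpoints and confidence-interval bounds' approach described above, using hypothetical ratings on a 0-100 scale; our actual Quarto notebook may differ.

```python
from statistics import mean

# Hypothetical evaluator ratings for one paper: midpoint with 90% credible interval (0-100 scale).
ratings = [
    {"midpoint": 72, "lower": 60, "upper": 85},
    {"midpoint": 80, "lower": 70, "upper": 88},
    {"midpoint": 65, "lower": 50, "upper": 78},
]

# Simple aggregation: average the midpoints and each interval bound across evaluators.
aggregate = {
    "midpoint": mean(r["midpoint"] for r in ratings),
    "lower": mean(r["lower"] for r in ratings),
    "upper": mean(r["upper"] for r in ratings),
}
print(aggregate)  # midpoint ≈ 72.3, lower = 60.0, upper ≈ 83.7
```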
We are considering...
Syntheses of evaluations and author feedback
Input to prediction markets, replication projects, etc.
Less technical summaries and policy-relevant summaries, e.g., for the EA Forum, Asterisk magazine, or mainstream long-form outlets
15 Aug 2023: We are organizing some meetings and working groups, and building some private spaces ... where we are discussing 'which specified research themes and papers/projects we should prioritize for UJ evaluation.'
This is guided by concerns we discuss in other sections (e.g., 'what research to target', 'what is global priorities relevant research')
Research we prioritize, along with short comments and ratings on its prioritization, is currently maintained in our Airtable database (under 'crucial_research'). We consider 'who covers and monitors what' (in our core team) in the 'mapping_work' table. This exercise suggested some loose teams and projects; I link some (private) Gdocs for those project discussions below. We aim to make a useful discussion version/interface public when feasible.
Team members and field specialists: You should have access to a Google Doc called "Unjournal Field Specialists+: Proposed division (discussion), meeting notes", where we are dividing up the monitoring and prioritization work.
Some of the content in the sections below will overlap.
Unjournal: Which research? How to prioritize/process it?
'Impactful, Neglected, Evaluation-Tractable' work in the global health & RCT-driven intervention-relevant part of development economics
Mental health and happiness; HLI suggestions
GiveWell-specific recommendations and projects
Governance/political science
Global poverty: Macro, institutions, growth, market structure
Evidence-based policy organizations, their own assessments and syntheses (e.g., 3ie)
How to consider and incorporate adjacent work in epidemiology and medicine
Syllabi (and ~agendas): Economics and global priorities (and adjacent work)
Microeconomic theory and its applications? When/what to consider?
The economics of animal welfare (market-focused; 'ag econ'), implications for policy
Attitudes towards animals/animal welfare; behavior change and 'go veg' campaigns
Impact of political and corporate campaigns
Environmental economics and policy
Unjournal/Psychology research: discussion group: How can UJ source and evaluate credible work in psychology? What to cover, when, who, with what standards...
Moral psychology/psychology of altruism and moral circles
Innovation, R&D, broad technological progress
Meta-science and scientific productivity
Social impact of AI (and other technology)
Techno-economic analysis of impactful products (e.g., cellular meat, geo-engineering)
Pandemics and other biological risks
Artificial intelligence; AI governance and strategy (is this in the UJ wheelhouse?)
International cooperation and conflict
See discussion here.
Long term population, growth, macroeconomics
Normative/welfare economics and philosophy (should we cover this?)
Empirical methods (should we consider some highly-relevant subset, e.g., meta-analysis?)
Rethink Priorities will act as fiscal sponsor for this, to help administer payments. They will also receive $5,000 to cover roughly two hours/week of Reinstein's time on this project.
Administering payments to referees, researchers, etc.
We will need to make small payments to (say) 20–50 different referees, 5–10 committee members and "editorial managers," 5–10 research prize winners, as well as clerical and IT assistants.
LTFF:
Please let us know how you would like your grant communicated on the ACX blog, e.g., if you'd like Scott to recommend that readers help you in some way (see this post for examples).
See #acx-and-ltff-media