‘Operationalizable’ questions
Guidelines
At least initially, we’re planning to ask for questions that could be definitively answered and/or measured quantitatively, and we will help organizations and other suggesters refine their questions to make this the case. These should roughly resemble questions that could be posted on forecasting platforms such as Manifold Markets or Metaculus, and should also somewhat resemble the 'claim identification' we currently request from evaluators.
Phil Tetlock’s “Clairvoyance Test” is particularly relevant. As one summary puts it:
if you handed your question to a genuine clairvoyant, could they see into the future and definitively tell you [the answer]? Some questions like ‘Will the US decline as a world power?’...‘Will an AI exhibit a goal not supplied by its human creators?’ struggle to pass the Clairvoyance Test… How do you tell one type of AI goal from another, and how do you even define it?... In the case of whether the US might decline as a world power, you’d want to get at the theme with multiple well-formed questions such as ‘Will the US lose its #1 position in the IMF’s annual GDP rankings before 2050?’
Discussion with examples
Some questions are important, but difficult to make specific, focused, and operationalizable. For example (from 80,000 Hours’ list of “research questions”):
“What can economic models … tell us about recursive self improvement in advanced AI systems?”
“How likely would catastrophic long-term outcomes be if everyone in the future acts for their own self-interest alone?”
“How could AI transform domestic and mass politics?”
Other questions are easier to operationalize or break down into several specific sub-questions. For example (again from 80,000 Hours’ “research questions”):
“Could advances in AI lead to risks of very bad outcomes, like suffering on a massive scale? Is it the most likely source of such risks?”
I rated this question 3/10 for how well it was operationalized. The word “could” is vague: it might suggest some reasonable probability of the outcome (0.1%, 1%, 10%), or it might be interpreted as “can I think of any scenario in which this holds?” “Very bad outcomes” also needs a specific measure.
However, we can reframe it to be better operationalized. For example, here are some fairly well-operationalized versions (see the resolution sketch after the list):
What is the risk of a catastrophic loss (defined as the death of at least 10% of the human population over any five-year period) occurring before the year 2100?
How does this vary depending on the total amount of money invested in computing power for building advanced AI capabilities over the same period?
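One benefit of questions at this level of precision is that resolution becomes nearly mechanical. Below is a minimal Python sketch of a resolution rule for the first question above; the function name, the data format, and the simplification of “deaths of at least 10%” to net population decline are our own assumptions for illustration, not part of the question itself.

```python
from typing import Sequence

def resolves_yes(populations: Sequence[float], threshold: float = 0.10) -> bool:
    """Resolve the question against annual world-population estimates
    (one entry per year, e.g. 2024 through 2100).

    Returns True if population fell by at least `threshold` (10%) over
    any five-year window. Note this simplifies "death of at least 10%
    of the population" to *net* population decline; a real resolution
    rule would need to spell out how births offset deaths.
    """
    for start in range(len(populations) - 5):
        base, later = populations[start], populations[start + 5]
        if (base - later) / base >= threshold:
            return True
    return False

# Illustrative only: a series that drops from 8.0e9 to 7.0e9 at year 30,
# a 12.5% decline within a five-year window.
series = [8.0e9] * 30 + [7.0e9] * 30
print(resolves_yes(series))  # True
```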
Here are some highly operationalizable questions developed by the Farm Animal Welfare team at Open Phil:
What percentage of plant-based meat alternative (PBMA) units/meals sold displace a unit/meal of meat?
What percentage of people will be [vegetarian or vegan] in 20, 50, or 100 years?
And a few more posed and addressed by Our World in Data:
How much of global greenhouse gas emissions come from food?
What share of global CO₂ emissions come from aviation?
However, note that many of the above questions are descriptive or predictive. We are also very interested in causal questions, such as:
What is the impact of an increase (decrease) in blood lead level by one “natural log unit” on children’s learning in the developing world (measured in standard deviation units)?
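To make the units in this question concrete, here is a small arithmetic sketch. The coefficient below is purely hypothetical, chosen for illustration rather than taken from the literature; the question asks researchers to estimate exactly this kind of parameter.

```python
import math

def learning_effect_sd(beta_sd_per_log_unit: float,
                       bll_before: float, bll_after: float) -> float:
    """Predicted change in learning outcomes, in standard deviation (SD)
    units, from a change in blood lead level (BLL).

    `beta_sd_per_log_unit` is a *hypothetical* causal estimate: the effect
    on learning, in SD units, of a one-natural-log-unit increase in BLL.
    """
    return beta_sd_per_log_unit * (math.log(bll_after) - math.log(bll_before))

# Illustrative only: with an assumed beta of -0.12 SD per log unit, a rise
# in BLL from 3 to 8 µg/dL (about +0.98 natural log units) predicts a drop
# of roughly 0.12 SD in learning outcomes.
print(round(learning_effect_sd(-0.12, 3.0, 8.0), 3))  # -0.118
```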