Some interventions are aimed at getting people to consider effectiveness in their giving and make donations to effective causes in particular
Some interventions are aimed at getting people to make substantial contributions, or pledge to do so (e.g., GWWC pledge) ... to effective charities
Luke Freeman is the lead contact.
Giving What We Can's mission is to make giving effectively and significantly a cultural norm. GWWC updated its strategy in 2022: they are looking to significantly increase their marketing activity by producing videos, funding ads, and conducting systematic and robust research. As such, there will be a large crossover between our work and theirs. This section highlights our collaborative efforts.
We want to learn from existing work, run tests on the GWWC platform, and support research into this.
Awareness & Consideration
Increase casual visitors and raise curiosity
Conversion & Acquisition
Donate or pledge to donate
Retention
Fulfill and report pledge
Advocacy
Promoting GWWC to others
“What should the call to action be for the casual person in the funnel?”
Testing all parts of the funnel/pledge journey: website, welcome messages/welcome packages, reminders, and thank-yous
We are a group of researchers and practitioners across a range of fields (Economics, Psychology, Marketing, Statistics) and organizations, particularly those interested in effective charitable giving and effective altruism. This is outlined in the Airtable (invite link), with embedded views below.
This project is organized by David Reinstein, who maintains this wiki/Gitbook and other resources.
As individuals and organizations, we are goal-driven and impact-driven: we are in this to improve the world, particularly through directing funds and support to the most effective causes and interventions. Because we share these common goals, we are better aligned for collaboration than typical academics and charitable organizations. We have an unprecedented opportunity to collaborate, learn what works, and 'move the needle'.
We are actively collaborating with the following organizations (links indicate publicly reportable trials)
Charity Elections (run by GWWC)
University of Chicago Effective Altruism (group)
Chloë Cudaback is the lead contact (communications manager). (Previously Jack Lewars)
OftW has a donor base of ~700 active donors, ~1650 pledged donors (who pledged but haven't started donating yet) and ~2000 lapsed donors.
80% (of donors?) are in the USA
Focus on global health charities
They focus on donations to GiveWell charities ... but technically OftW pledgers can give to any 501(c)(3)
Reinstein/Lewars conversation notes
Activating more donors who took the pledge at university, so their donations actually start;
Retaining donors for longer once they activate;
Upselling donors to give more over time (either more as a raw amount, e.g. 'keeping pace' at 1% of their income; or more as a percentage, e.g. 'graduating' to take the 10% GWWC pledge)
Acquiring new donors with fewer touchpoints, e.g. via online advertising, via our website etc. (we currently get ~0 organic sign-ups)
Pledgers
Active donors, i.e., "Activated pledgers" (Chloe is thinking about how to segment this group and how to appeal to each segment)
Second tier -- people who have given each month for 12+ months; "Legacy donors" (DR: maybe 1x per year high-value donors should be in this group)
One-time donors (these may or may not be pledgers)
Cancelled
Payment failures
Another group worth considering: 'pledge-curious supporters'
'Activating' Pledgers as donors (pledged but not donated)
Active donors
Retain
Upsell (maybe only to the second tier?)
Acquiring pledges, perhaps from a 'pledge-curious group'
Content -- expand our ability to tell stories about the beneficiaries
Ways to tell these stories
Frequency (of comms with supporters)
Platforms: Social media, email flows
Telling stories in a corporate context
Typical audiences have been students and young professionals, but there is interest in corporate outreach
Zoom and lunchtime talks in corporate contexts (How many? Seems very promising!)
How many people are activating/pledging following these lunch-and-learns?
We are in the process of creating these homepages and setting up conversion tracking. As OFTW currently has ~0 organic sign-ups, we are testing a variety of conversion routes, including: [Todo: clarify this]
university campus, someone I like tells me they are involved in OftW, asks me to come along with free food
at some point I take the pledge
It is not a highly controlled process
asking us (staff) a question by email
joining a group call with others wanting to learn about effective donating (Kennan as dir. of chapter management)
taking the pledge
making a one-off donation
650 active donors
1500 people in pipeline (pre-activation date)
750 new people a year are recruited... thinks it would be 2-2.5k
OFTW has a donor base of ~700 active donors,
~1650 pledged donors (who pledged but haven't started donating yet) and
~2000 lapsed donors.
Homepage message testing
Activation trial
See also
Our public reports of trials and analysis in the web-book here
The Barriers to Effective Giving living web book
Our regularly updated 'data analysis report' on all the trials and evidence, which you can download HERE as a protected zip file (need to request password, permission granted with consent of participating organizations)
We are a small group of researchers and practitioners. This project is organized by David Reinstein, who maintains this wiki/Gitbook and other resources.
We aim to promote awareness and understanding of effective altruism (EA), and "to make giving effectively and significantly a cultural norm." We consider marketing campaigns, charitable appeals, events, and public communication, working both with our partner organizations and in independent surveys and trials. We want to improve the design and messaging of organizations like Giving What We Can and 80,000 Hours to improve their outreach methods and maximize their impact. Measuring and testing 'what works and when': We help run and track careful data collection and rigorous controlled trials, as well as helping to organize the reporting of less rigorous trials. We robustly analyze the results to better understand which approaches tend to have a more positive impact. Communication: We track, organize, and share what we have learned with the EA community, building and organizing resources and a knowledge base. This will address questions such as:
How to implement and test marketing campaigns?
What has worked to promote EA?
What profiles of people are most likely to be interested in effective altruism?
We strive to be transparent. We want to report and share our data, procedures, code, and evidence without overselling the results.
Coalesce our understanding and evidence on barriers and facilitators of effective altruism, effective giving, and effective action
Run a broad set of high-powered trials (large samples, high-stake real-world contexts, substantial differences between conditions)
... to gather evidence on what works best to promote meaningful actions in specific cases,
... while aiming at robust and generalizable insights
Do profiling, survey, and segmentation activities and trials, building evidence on 'which types of people' are most responsive to effective giving messages and appeals
Share our results, data, and tools, with the relevant EA and research-interested communities. This will enable more and better outreach, promotion, testing, and insight.
As EAMT has progressed, we have encouraged others to do work and pursue initiatives in the 'space' of studying EA messaging, and marketing EA and effective giving. We hope that the resources we have provided, and the connections we have made have contributed to this. As the space changes, the EAMT mission, scope, and activities are adjusting as well.
We are moving towards a heightened focus on
Advising, proposing, and helping to design and coordinate experiments, trials, and initiatives.
Transparent presentation of the results, rigorous statistical analysis
Synthesizing, sharing, and communicating this knowledge and skills base
This work provides substantial public goods, whose benefit is shared among the partner organizations and the EA community.
Other relevant/new organizations and initiatives
#other-marketing-implementation-resources (including User Friendly, Good Impressions, and Altruistic Agency)
Giving What We Can 'Bequests' (a project we have encouraged and advised)
Effective Giving Collaboration and Summit
We believe the EA Market Testing Team is the first organized collaboration of its kind.
Goals/FAQ link (below in detail, scroll outside margin to skip past it)
For an overview of our progress and ongoing work, see the 'progress and results' document we are building. (Below in detail, scroll outside margin to skip past it)
Note that we cannot publicly share details of ongoing and upcoming trials. We aim to share the results when it is possible. We aim to integrate shareable aspects of this private doc.
For a data-driven dynamic document covering (some of) our trials and evidence see HERE.
If you are interested in getting involved with our project or have feedback for us, contact David Reinstein at daaronr AT gmail.com.
(For an explanation of this Gitbook's structure, content, and aims, see below.)
This quote comes from the 2022 Giving What We Can strategy document.
However, we are also careful to be efficient, recognizing the tradeoffs between rigorous experimental design and practical marketing.
Including testing templates, guides and implementation tips
For a data-driven dynamic document covering (some of) our trials & evidence see HERE.
In the Partner Organizations and Trials section, you will find reports of the trials we have run with organizations, including Giving What We Can and One For the World (OftW).
These trials are also cataloged in a shared Airtable view of the relevant trials.
We want to identify the most effective and scalable strategies for marketing EA and EA-adjacent ideas and actions. To do this, we believe that running real-world marketing trials and experiments with EA-aligned organizations will provide the best evidence to act upon. By systematically varying the messaging, framing, and contexts, we can map out 'what works better where'.
We believe this approach is likely to be the most fruitful because:
Using naturally-occurring populations in real-world settings with meaningful costly choices and outcomes will lead to more relevant findings. In comparison to convenience samples of undergraduates or professional survey participants who are aware that they are doing a research study, we anticipate greater:
Internal validity: our results are less likely to be influenced by biases, such as acquiescence bias and hypothetical decision-making.
External validity: the context we are testing is similar or identical to the context we care about.
We will "learn by doing" by encountering unanticipated obstacles and learning about practical implementation issues involved with advertising, promotion, and communication.
We can share what we learn with relevant EA organizations and audiences. They then can build on our findings, rather than having to repeatedly make mistakes themselves.
The trials themselves should also have a direct positive value in promoting EA.
There is limited downside risk. We are generally not testing risky messages and are careful to avoid diluting or misrepresenting EA's core ideas.
This project primarily aims at:
Robust and generalizable insights that improve communication and messaging
Meaningful and relevant long-run outcomes, such as:
Creating new, strong EAs by getting people more interested and involved in EA ideas, actions, and the community
Having people consider and identify with key values and practices, such as making meaningful altruistic choices, considering effectiveness and impact in doing so, strong analytical and epistemic practices, and broad (or carefully considered) moral circles
Across a range of EA causes and groups (longtermism, global health, animal welfare)
In the document below (EAMT: HVQs), we consider the shared goals, paths, and questions that are valid across organizations. Specifically, these are actionable and promising themes and projects that can be implemented, measured, and communicated fluidly throughout the EA network.
You can now ask questions of this gitbook using a 'chatbot': click the search bar and choose 'lens'.
Our mission, what we are trying to do and why, most recent updates, and the organization of our team
Other key resources
EA Market Testing data analysis: dynamic document/notebook (Quarto site) covering our trials & evidence
Airtable view of the relevant trials; links and categorization provided
In this section, you will find reports of the trials we have run with organizations, including Giving What We Can and One For the World.
Here we share tools to implement planned trials, as well as tips relevant to 'doing marketing'. We answer questions like how to set up campaigns and track outcomes on various platforms. See "Implementing ads..." and "Collecting data..."
We discuss qualitative and quantitative research design and methodology issues that are relevant to the trials we are running. Pages in this section will be linked in reports when relevant to a particular trial.
Our profiling project aims to help better understand what sorts of people are amenable to EA-related ideas and to taking EA-favored actions.
We've done a review of existing literature: to inform the trials we are running, and to identify important research topics. This includes What is known/models of effective giving and Principles and theories behind potential trials.
The three key aims of this public gitbook are to:
Convey who we are, what we have accomplished, and the scope of our work to funders, people in the broader EA community, and people not yet involved in the project who would be interested in joining
Share tools and knowledge with people in the EA/global priorities community who will apply it to their work. We are building a knowledge base. Content in the public gitbook can inform and support a diverse set of projects (i.e., implementing marketing campaigns, fundraising initiatives, academic research)
Seek feedback on our work. This includes technical and industry feedback on implementation and academic expertise (literature reviews and frameworks to consider, methodology, and experimental design).
(Grouped by organizational partner.)
We include background information on each organization and its priorities for testing.
Context: Donation 'upsell' to existing pledgers
Question: Are effectiveness-minded (EA-adjacent) donors and pledgers more motivated to donate by
"A": (non-quantitative) presentation of impact and effectiveness (as in standard OftW pitch)
"B": Emotional appeals and 'identified victim' images
Further information on the experiment and outcomes appears in our in-depth replicable analysis, organized in a dynamic document here
Are effectiveness-minded (EA-adjacent) donors and pledgers more motivated to donate by
"A": (non-quantitative) presentation of impact and effectiveness (as in standard OftW pitch)
"B": Emotional appeals and 'identified victim' images
In the context of One for The World's (OFTW) 'giving season upselling campaign', potentially generalizable to other contexts.
Academic framing: "Does the Identifiable Victims Effect (see e.g., the meta-analysis by Lee and Feeley, 2016) also motivate the most analytical and committed donors?"
One for The World's (OFTW) 'giving season upselling campaign'
10 emails total over the course of November were sent in preparation for GivingTuesday
Targeted dates: November 10, 18, and 23, all in 2021, but these may be delayed for feasibility
Present Gitbook, Google doc linked below, preregistration (OSF), and github/git repo
Emails ... to existing OftW pledgers (asking for additional donations in Giving Season)
All 10 emails had the same CTA: make an additional $100 donation for the giving season/GivingTuesday on top of their recurring monthly pledge donation.
Roughly 4000 participants, as described.
A series of three campaign emails will be sent out by OftW to their regular email lists, to roughly 4000 participants, as described.
A list of ~4500 contacts (activated pledgers) was split into two treatment groups.
Treatment Group A received emails that were focused on the contact's impact
while Treatment Group B received emails that were focused on individual stories of beneficiaries
Treatment specifics: see the preregistration.
Number of conditions: see the preregistration.
Targeting: Donation incidence and amount in the relevant 'giving season' and over the next year, specifically described in prereg under
Data storage/form:
MailChimp data (Chloe is sharing this),
Reports on donations (Kennan is gathering this)
Planned analysis methods, preregistration link here
Cost of running trial/promotion: Time costs only (as far as I know)
Pre-registered on OSF in 'AsPredicted' format, content incorporated here
The Emotion treatment leads to significantly fewer people opening emails, but more people clicking on the in-email donation link (relative to the standard Impact information treatment). However, we are statistically underpowered to detect a difference in actual donations. More evidence is needed.
Chloe: those emails that appealed to emotional storytelling performed better (higher in-email click rate) than those that were impact-focused.
DR, update: I confirm that this is indeed the case, and this is statistically significant in further analysis.
Evidence on donations
(preliminary; we are awaiting further donations in the giving season) ...
This is 'hard-coded' below. I intend to replace this with a link or embed of a dynamic document (Rmarkdown). The quantitative analysis itself, stripped of any context and connection to OftW, is hosted HERE
Note: We may wish to treat the 'email send' as the denominator, as the differing subject seemed to have led to a different number of opens
Treatment 1 (Impact): We record
1405 unique emails listed as opening a ‘control’ treatment email
29 members clicking on the donation link in an email at least once (2.1% of openers)
15 members making some one-time donation in this period (about 1.1% of openers, 0.75% of total)
8 unique emails donating (likely) through the link (0.57%/0.4%)
Treatment 2 (Emotional storytelling):
1190 unique emails listed as opening an email (a significantly lower 'open rate', assuming equal shares of members were sent each treatment's emails)
56 members clicking on the donation link in an email at least once (4.7% of openers)
11 members making some one-time donation in this period (about 0.9% of openers, about 0.55% of total)
9 unique emails donating (likely) through the link (0.8%/0.45%)
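For concreteness, here is a minimal R sketch of how these incidence comparisons can be tested with Fisher's exact test (our preferred binary-outcome test, per the preregistration). The ~2,000 sends per arm is an assumption, backing out an even split of the roughly 4,000 recipients; it is not a confirmed figure.

```r
# Minimal sketch: Fisher's exact tests on the counts reported above.
# Assumption: ~2000 emails sent per arm (even split of ~4000 recipients).
opens  <- c(impact = 1405, emotion = 1190)
clicks <- c(impact = 29,   emotion = 56)
dons   <- c(impact = 15,   emotion = 11)
sends  <- c(impact = 2000, emotion = 2000)

# Click-through among openers
fisher.test(cbind(clicks, opens - clicks))

# Donation incidence, using 'sends' as the denominator (see the note above)
fisher.test(cbind(dons, sends - dons))
```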
‘Initial impressions of preliminary outcomes’
The conversion rates are rather low (0.5%) … but maybe high enough to justify sending these emails? I’m not sure.
While people are more likely to Open at least one Impact email, they are more likely to Click to donate at least once if assigned the Emotion email
But we can't say much for actual donations.
Given the low conversion rates we don’t have too much power to rule out ‘proportionally large’ differences in conversion rates (or average amounts raised) between treatments …
The figure above seems like a good summary of the 'results so far' on what we can infer about relative incidence rates, presuming I understand the situation correctly. I plot, on the Y-axis, how likely it would be to see a difference in donation incidence between treatments 'as small as or smaller in magnitude than' the one in our data, against, on the X-axis, possible magnitudes of the 'true difference in incidence rates'.
Our data is consistent with ‘no difference’ (of course) … but it's also consistent with ‘a fairly large difference in incidence’
E.g., even if one treatment truly led to 'twice as many donations as the other', we would still have roughly a 33% chance of seeing a difference as small as the one we see
We can reasonably ‘rule out’ differences of maybe 2.5x or greater
Main point: given the rareness of donations in this context, our sample size doesn’t let us make very strong conclusions in either direction about donations
Leads: Bilal Siddiqi, Neela Saldhana; Other partner contact: Jon Behar (Giving Games)
We have completed various trials in conjunction with The Life You Can Save, the most recent being the Advisor signup (Portland) city-level YouTube test. There are a number of additional proposed trials and tests; however, at the moment these considerations are limited to the private Gitbook.
Note that in the past TLYCS has worked with the Graduate Policy Workshop at Princeton University's School of Public and International Affairs, which produced the 'Behavioral Insights to End Global Poverty' report embedded below.
Along with GWWC, we tested marketing and messaging themes on Facebook in their Effective Giving Guide Facebook Lead campaigns. Across four trials we compared the effectiveness of different types of (1) messages, (2) videos, and (3) targeted audiences.
A summary of this has been shared as a post on the EA Forum:
We build the results and analysis transparently in the EAMT Analysis web-book here.
Context: Facebook ads on a range of audiences
... [with text and rich content promoting effective giving and a "giving guide" -- links people to a Giving What We Can page asking for their email in exchange for the guide]
Objective: Test distinct messages, videos, and audiences aiming to get people to download our Giving Guide. A key comparison:
Also informative about costs and the 'value of targeting different groups' in this context.
Key findings:
The cost of an email address via a Facebook campaign during Giving Season was .
“Only 3% of people give effectively,” seems to be an effective message for generating link clicks and email addresses, relative to the other messages.
Lookalike and animal rights audiences seem to be the most promising audiences.
Demographics are not very predictive on a per-$ basis.
Specificity and interpretation: All comparisons are not for 'audiences of similar composition' but for 'the best audience Facebook could find to show the ads, within each group, according to its algorithm'. Thus, differences in performance may combine 'better targeting' with 'better performance on the targeted group'. See our discussion of the 'divergent delivery' problem HERE. I.e., we can make statements about "what works better on Facebook in this context and maybe similar contexts", but not about what works better for audiences of fixed composition, as the targeting within each audience may differ in unobserved ways.
The outcome is 'click to download the giving guide'.
Main Question: Do some message themes work better than others for drawing visitors to Giving What We Can’s landing page?
Main findings: 'Social proof messages' on Facebook ads were most effective at generating landing page views per dollar compared to other message themes (effectiveness, services, giving more, and values).
Future directions: There were significant differences in 'link clicks per dollar' on the different messages by age. We recommend a systematic test to determine if age makes a difference in the relative effectiveness of social proof and values messages. Future studies could explore why the social proof message was more effective in this study than the previous giving guide study and the importance of the message to “join” the movement as social proof.
Possible connection between this trial and the Giving guides - Facebook: Note that the two best-performing messages both prompted the user to “join” a movement or a group of people (perhaps an elite group); but beware ex-post theorizing.
Link to report below.
In this test, we are aiming to find out if one 'theme' of messages resonates better with our target audience than others.
If we knew which 'themes' were most effective with our advertising, then we could create more ads on this theme and improve our conversion.
Specifically, which of the following themes resonate with our target audience the most:
effectiveness
giving more
social proof
values
services
On choosing an objective for this test: originally I planned to use link clicks, but this is not the highest-quality indicator of conversion, and when I tried to use newsletter signups, Facebook warned me that I might not see any conversions at all... So instead, the campaign will optimise for landing page views, which is slightly better than a link click and will generate enough conversions that we should see statistically significant results.
Grace Adams
Trial will run for 7 days on GWWC's ad account, from 9.30am AEDT Friday 25 Feb to 9.30am AEDT Friday 4 Mar.
Working document can be found here but all important details will be listed in this brief
This test will take place on Meta platforms including Facebook and Instagram
We are targeting a "Philanthropy and Lookalikes (18-39)" audience, based in the UK, US, or Netherlands
Estimates from Facebook: Reach is expected to be 1.4K-4.1K per day (7 days) per ad set (5 ad sets) = 49K-143K
Estimates from Facebook: Conversion is expected to be 10-30 landing page views per day (7 days) per ad set (5 ad sets) = 350-1050
We are using the GWWC Brand Video by Hypercube as the creative across all tests. Although it did not perform as well as our other ads in the Giving Guide campaign, I think that it will interfere less with the messages we aim to test.
We are going to test a set of messages for each theme; please see them in the Google Doc linked
Mock up of ad:
This test has been set up as an A/B test through Facebook, testing the campaigns head to head; each campaign covers one theme, with the individual ads as children.
This will allow us to test which theme was better, not just which individual ads
A/B testing on Facebook will ensure that audience members fall into a single treatment group
The primary measure will be cost per landing page view, but secondary measures such as CPC, 3-second video plays, and email sign-ups will also be tracked
Data will live on Meta ads platform
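Once results are exported from the Meta ads platform, the primary comparison is straightforward. A hedged R sketch follows; the spend and landing-page-view counts here are purely illustrative placeholders, not results:

```r
# Cost per landing page view (LPV) by theme; all numbers are placeholders.
themes <- data.frame(
  theme = c("effectiveness", "giving more", "social proof", "values", "services"),
  spend = c(100, 100, 100, 100, 100),   # USD spent per campaign (placeholder)
  lpv   = c(40, 35, 55, 30, 25)         # landing page views (placeholder)
)
themes$cost_per_lpv <- themes$spend / themes$lpv
themes[order(themes$cost_per_lpv), ]    # cheapest theme first
```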
Reinstein and others work with charity partners, some of which are not EA-aligned (but are perhaps moderately effective); this work informs EA giving. Several trials focus on the 'impact of impact information'.
David Reinstein's research (along with others') considers 'how do potential donors respond to (different presentations of) impact information?'. Reinstein and his academic partners ran several experiments, working with (and on) mainstream charities and fundraising platforms.
See work:
and discussion:
Other work is ongoing and cannot be publicly shared yet (see private gitbook if you have access).
GWWC YouTube remarketing campaign (trial)
See also the cross-organization page (placeholder for now) and the tips on implementation.
July 20, 2021: GWWC launched a YouTube remarketing campaign. That means that when someone goes to the GWWC website, leaves, and then goes to YouTube we show them one of the following videos:
Algorithm decides which video to present to people.
Q: Is each video assigned to a different situation or are videos randomly chosen to be displayed? If the latter, you could randomize videos by location and see if the different videos were more or less effective. Alternatively, just randomizing the whole campaign seems like a good idea to me....
A: Videos are selected based on the likelihood of the user watching >30 seconds (by the algorithm) ... randomization by individual will be hard because users don't click and act right away. Instead I think we have to randomize by geography
Most important takeaway: It costs $1 to get a website visitor to watch 1h of your videos!
High-level metrics
Cost: $205
Views: 6,071 (a view is when a user chooses to watch >30s of an ad)
Total watch time: 223 hours (~$1/h)
Unique viewers: 4,937 (this is an estimate)
Average impressions per user: 5.8
View rate: 20% (20% of the time people choose to watch more than 30s)
CTR: 0.37%
Average CPC: $1.83
Conversions (users spending >30s on the website): 2
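For reference, the headline '$1 per hour watched' figure follows directly from these metrics; a quick R check (the implied-impressions line assumes the 20% view rate applies uniformly across impressions, which is a simplification):

```r
cost <- 205; views <- 6071; watch_hours <- 223
cost / watch_hours   # ~USD 0.92 per hour of watch time (the "~$1/h" above)
cost / views         # ~USD 0.034 per >30s view
views / 0.20         # implied impressions, if the 20% view rate is uniform
```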
Thinking: 'This is not a good tactic for driving site traffic or donations (although we could optimize for this instead if we wanted)'
Interesting observations
Efficiency has significantly improved over 3 weeks
Cost per view has gone down from $0.05 per >30s view to $0.02 per >30s view
Views have increased 75% without increasing budget (from 220/day to start to 386 yesterday)
You can see this data by video if you are interested to control for video length
3. Your best video had a view rate (% of time people choose to watch >30s) twice as good as your worst video.
4. You can see view rate by age, gender, and device in the "Analytics" tab.
Possible next steps
Could add "similar audiences" which is when we let Google use machine learning to find people similar to your website visitors and also show ads to them
Could walk David Reinstein and Joshua Lewis through the UI so they can get a sense of the metrics/reporting available and how it could be used for research
Academic-linked authors: David Reinstein, Josh Lewis, potentially others going forward
Implementation and management: Chloe Cudaback, Jack Lewars
No, no data have been collected for this study yet.
Are effectiveness-minded (EA-adjacent) donors and pledgers more motivated to donate by
"A": A (non-quantitative) mention of impact and effectiveness (in line with the standard OftW pitch)
"B": Emotional appeals and 'identified victim' images
Framing this in terms of the psychology, social science, and philanthropy literature:
"Does the Identifiable Victims Effect (see e.g., meta-analysis by Lee and Feeley, 2016) also motivate the most analytical and committed donors?"
d_don_specific: Whether the person receiving the series of emails makes an additional 'one time gift' following the link at OftW, within the OftW interface, during the 'Giving Season', a time-period that (for this preregistration) we declare to begin on receipt of the first email and end on 15 January 2022.
don_specific: The total amount donated through the above.
don_general_gs: (If observable), the amount the person donates during the 'Giving Season', as observed through the OftW/donational/Plaid network.
don_general_1yr: (If observable), the amount the person donates during the 'Giving Season' and for the following year (ending 15 January 2023), as observed through the OftW/donational/Plaid network.
d_continue_pledge_1yr: Whether the person is still an active OftW pledger a year after the current giving season (15 January 2023).
Two conditions (treatments):
A. "Impact"
B. "Story/Emotion"
Assignment details
Participants (c. 4000 people at various points in the One for the World pledge process) will be split into groups (blocks) by previous donation behavior or point in the process. (OftW have mentioned: pledgers still in school, active donors, and lapsed donors.)
Within each group, they will be randomized (selection without replacement to ensure close-to-exact shares) into equal shares in treatments A and B.
A series of three emails will be sent, with participants remaining in the same treatment across all three emails.
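A minimal R sketch of this blocked randomization (illustrative only; the block sizes, IDs, and seed below are placeholders, not the actual OftW lists):

```r
# Within each block, randomize to A/B in (near-)exact equal shares,
# i.e., sampling treatment labels without replacement.
set.seed(2021)  # placeholder seed
assign_within_block <- function(ids) {
  arms <- sample(rep(c("A", "B"), length.out = length(ids)))
  data.frame(id = ids, arm = arms)
}

# Apply per block, e.g., pledgers-in-school, active donors, lapsed donors
blocks <- list(school = 1:1500, active = 1501:2200, lapsed = 2201:4000)
assignments <- do.call(rbind, lapply(blocks, assign_within_block))
```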
Example content differences, from email 1:
A. Impact version:
As of 2021, One for the World has had a tremendous impact on the lives of those that are helped by our charity Top Picks programs:
[IMPACT SINCE 2021 GRAPHIC]
B. Story/Emotion version:
Here’s our first story this season from Eunice of Kenya. When asked how her life changed when she received the first cash transfer from our partner organization, GiveDirectly, she responded:
“I have been able to make new goals and achieve them since I started receiving this money [from GiveDirectly]. I have been able to buy a piece of land that would have taken [me] many years to earn [enough to buy the land]. I was also able to buy livestock, like goats. I have even managed to dress my family properly by buying them decent clothing. Lastly, I have even been able to [pay my children’s] school fees without any strain.” (Source GiveDirectlyLive)
[PICTURE OF EUNICE]
We will report all of the following analyses, with our preferred method in bold:
Binary outcomes:
Fisher's exact test
Continuous outcomes:
Standard rank-sum tests (Mann–Whitney U test)
Simulation/permutation-based tests for whether the mean (including 0's) is higher in group A or B
... same for median, but medians will almost always be 0, we anticipate
T-test with unequal variance
All tests will be 2-sided.
We will also report Bayesian credible intervals and other Bayesian measures for the proportion tests. We may also explore Bayesian approaches for the continuous outcomes, e.g., Bayesian beta regression.
We also anticipate reporting multiple-hypothesis-test corrections, but we are not pre-registering a method. Our approach is likely to follow that of List et al. (2017), a paper applied to a similar domain (charitable giving experiments with multiple donation-related outcomes).
We will report confidence intervals on our results as well as Bayesian credible intervals under flat and weakly informative priors. Where we have a 'near-zero' result, we will try to put reasonable bounds on it to convey the extent of our certainty that the true effect or parameter was fairly small.
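As a sketch of the simulation/permutation approach above, here is a generic two-sided permutation test for a difference in means in R (a minimal illustration, not the exact preregistered code):

```r
# Two-sided permutation test for a difference in means (including zeros).
perm_test <- function(y, group, n_perm = 10000) {
  obs <- mean(y[group == "A"]) - mean(y[group == "B"])
  perm_diffs <- replicate(n_perm, {
    shuffled <- sample(group)  # re-randomize treatment labels
    mean(y[shuffled == "A"]) - mean(y[shuffled == "B"])
  })
  mean(abs(perm_diffs) >= abs(obs))  # two-sided p-value
}
```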
Included: All individuals who received this mailing.
We will not exclude any observations from the sample, unless they make it clear to us that they are aware of this trial.
We will not Winsorize or exclude outliers.
A series of three campaign emails will be sent out by OftW to their regular email lists, to roughly 4000 participants, as described above
Targeted dates: November 10, November 18, November 23, all in 2021, but these may be delayed for feasibility
Anything else you would like to pre-register? (e.g., secondary analyses, variables collected for exploratory purposes, unusual analyses planned?)
Secondary hypotheses and questions
Which treatment motivates a higher rate of...
Email opens (note: as we have three observations per participant, we will need random effects or clustered standard errors), and
Click rates (with the same caveat)?
We consider these as secondary because the click and open rates do not necessarily strongly relate to outcomes of interest, particularly among this set of already effectiveness-minded donors. These outcomes may simply reflect attention or curiosity about the content.
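For the open/click analyses with repeated observations per participant, a hedged sketch of the random-effects option in R (using lme4; the simulated data below is purely a stand-in for the MailChimp export):

```r
library(lme4)

# Stand-in data: ~4000 participants x 3 emails, binary 'opened' outcome
set.seed(1)
df <- data.frame(
  participant_id = factor(rep(1:4000, each = 3)),
  arm            = rep(sample(c("A", "B"), 4000, replace = TRUE), each = 3)
)
df$opened <- rbinom(nrow(df), 1, ifelse(df$arm == "A", 0.35, 0.30))

# Logistic regression with a random intercept per participant
fit <- glmer(opened ~ arm + (1 | participant_id), data = df, family = binomial)
summary(fit)
```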
Exploratory: what factors (especially gender, university/student status, university subject) predict which treatment leads to greater donation (incidence and amount)
Note that our partner is planning to use this trial to inform future trials and experiments, particularly for the 'Giving Tuesday' season itself.
We did not have time to do even simple power calculations before the start date of this experiment. However, we will try to conduct these before we obtain any of the data, and update this preregistration.
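For reference, a 'simple power calculation' of the kind mentioned can be done in one line of base R; the baseline incidence rate and arm size here are assumptions for illustration only:

```r
# Power to detect a doubling of donation incidence with ~2000 per arm,
# assuming (for illustration) a 0.75% baseline incidence rate.
power.prop.test(n = 2000, p1 = 0.0075, p2 = 0.015, sig.level = 0.05)
```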
TLYCS ran a campaign in a single city involving 'donation advice'
We recoded and augmented this analysis. More work could be done, if warranted.
In December 2021, TLYCS ran a YouTube advertising campaign in Portland, Oregon, involving 'donation advice'. The top 10% household-income households were targeted with (one of) three categories of videos. One of the ultimate goals was to get households to sign up for a 'concierge' personal donor advising service.
There were very few signups for the concierge advising service. (About 16 in December 2021, only 1 from Portland.)
We consider a 'difference in difference', to compare the year-on-year changes in visits to TLYCS during this period for Portland vs other comparison cities.
This comparison yields a 'middle estimate cost' of $37.70 per additional visitor to the site. This seems relatively expensive. We could look into this further to build a more careful model and consider statistical bounds, if such work were warranted.
Specific goal of TLYCS promotion: To get people to click on the ad and go to the 'landing page' of TLYCS. Here, they will fill out a form to request an appointment with a donation advisor. We will simultaneously be raising awareness of TLYCS.
General questions:
Can we get people to sign up for donation advice using videos in YouTube Ads?
How many sign up, and what sorts of people?
Do these ads boost engagement with TLYCS in net? (E.g. donations, website activity, book downloads)
"Lift test" on Portland market (analyze with difference-in-difference relative to other markets)
Which ads are best at this? (These ads differ in substance as well as in style)
Location: Portland, OR
Audience: Top 10% of household income
People living in Portland, Oregon in the top 10% of household income (approximated by Google) will get an in-stream ad (ad plays before video user intended to watch)
Exposure to a sequence of nine versions of YouTube ad videos. Frequency cap: 6 per week.
Three main 'theme/header' variations (similar, slightly different phrasings)
these variations were crossed with...
Three categories of videos within each theme:
"Bravery": Charlie Bresler explains how 'you can save lives without being brave' with small amounts of money for bednets, nutrient micro-doses, etc.
$10: Man giving out money to poverty-stricken people in Cape Town. Text narrative overlaid describes that $5 can buy a slice of pizza, or an interocular lens to treat cataracts, etc. Leans towards 'identified victims/recipients'.
"I want to do good": Colorful puppets sing about giving and donating to save lives. Counters common arguments about 'breeding dependency', fear of administrative waste, etc.
Note/limitation: Unfortunately, we were not able to track 'which video got more clicks'.
Each video comes with a site-link extension with a Call to Action:
We assigned the particular video treatments to audiences using a YouTube/Google optimization algorithm. This chose videos to maximize the probability that a user chose 'Speak to an Advisor' and filled out the linked form.
How long people watched the videos for
Whether they 'clicked through'
Whether they filled out the form for advising (Algorithm is serving to optimize this)
A first pass and upper bound on impact and (lower bound on) cost/session
Assumptions/data interpretations
The numbers used in our data come from meaningful sessions from unique users
The 'date range' is the relevant one for being affected by the advertisements of interest
The 'comparison cities' are approximately randomly selected
Most optimistic (unrealistic) bound
Guiding assumption: a counterfactual 0 visits from Portland in this season
306 Portland Users (389 Portland site visits) in relevant 2021 period.
If these were all driven by the advertisement (and counterfactual was 0 visits), this is +306 Users and +389 visits
Cost $4k
-->Lower bound on cost of $13.07 per user ($10.28 per visit)
Year-on-Year (maybe reasonable) optimistic bound
Guiding assumption: a counterfactual 'same as last year' in Portland
306 Portland Users (389 Portland site visits) in relevant 2021 period.
144 Portland Users (189 Portland site visits) in relevant 2020 period.
--> 306 - 144 =162 users uptick,
(389 - 189 = 200 visits uptick)
--> $4k/162 = $24.69 Lower bound on cost per user
($4k/200 = $20 per visit)
Difference in Differences comparison to other cities
Guiding assumptions:
The cities used are fairly representative
'Uptick as a percentage' is unrelated to city size/visits last year
All the cities in the comparison group are 'informative to the counterfactual' in proportion to their total number of sessions
This yields
112.5% uptick in users (year on year) for Portland (306 vs. 144)
For all North American cities other than Portland (with greater than 250,000 people):
The average is 46.5 users in the 2020 period and 64.5 users in the 2021 period, an uptick of about 38.8%. This is very similar to the result if we look at all cities, which show an uptick of 43.1%.
38.8% uptick multiplied by 144 users = 55.9 (‘counterfactual uptick’ in users for Portland)
162 - 55.9 = 106 (uptick relative to counterfactual)
USD 4000 /106 = 37.7 USD cost per additional user through this ad
Note this is a midpoint estimate; we have not yet given statistical bounds.
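The arithmetic behind this middle estimate, as a small R sketch (all figures taken from the bullets above):

```r
# Back-of-envelope difference-in-difference cost per additional user.
portland_2020 <- 144; portland_2021 <- 306   # Portland users, by year
comparison_uptick <- 0.388                   # avg uptick, other NA cities >250k
counterfactual <- comparison_uptick * portland_2020           # ~55.9 users
extra_users <- (portland_2021 - portland_2020) - counterfactual  # ~106
4000 / extra_users                           # ~USD 37.7 per additional user
```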
There were very few signups for the concierge advising service. Only about 16 in December 2021 globally, only 1 of which was from Portland.
Other detailed notes are in our private Gitbook. More formal and detailed analysis could be done if it seems merited.
The presentation of options on GWWC's 'Pledge page' was randomly varied at the individual browser level over a certain period to see which option increased pledges.
A summary of this has been shared as a post on the EA Forum.
Giving What We Can (GWWC) has three giving pledge options, displayed in the 'Original presentation version' below.
From April-July 2021 they ran a trial presenting its 'pledge page' options in three slightly different ways. Considering 'clicks on any button' as the outcome, and a Bayesian 'preponderance of evidence' standard...
"Separate Bullets for Other Pledges" was the most successful presentation. It only showed a box for "The Pledge", with the other options given in less prominent bullet points below. This had about a 20% higher incidence rate than the Original presentation.
"Pledge before Try Giving" was the least successful presentation this was like the one displayed above, but with "Try Giving" in the central position. This had about a 23% lower incidence rate than the Original presentation.
GWWC: How can we present pledge options to maximize positive outcomes (pledges, fulfillment)?
General: For those considering making substantial giving pledges (of a share of their income), how does the presentation of these 'pledge options' matter?
Theories and mechanisms to consider:
Too many options may lead to 'indecision paralysis'
The signaling power of choice; e.g., if there's a 'more virtuous choice' I may feel that my 'middle choice' looks less good by comparison
1. "Try Giving" (1% of income),
2. "The Pledge" (10% of income)
3. The "Further Pledge" (donate all income above a living allowance).
Three versions of this page were randomly presented (between 19-21 April and 10 July 2021)
The content of the key 'choice button' part varied between these three versions
"Original:" A block of three (in the order of commitment) 'The Pledge' (10%) in the center and highlighted (see above)
"Pledge before TryGiving": A block of 3 with "Try Giving" (1%) in the center and highlighted
"Separate Bullets for Other Pledges": A single block for 'The Pledge' (10%), with the other pledges given as clickable bullet points below (as well as a bullet for the 'company pledge' ... which had a different presentation in other versions)
The version presented stayed constant according to an individual's IP cookie tracking.
'Everyone going to the above page' within the above time duration.
People interested in GWWC pledges
Sample size: see below, from Google Analytics
"Original" (Block of 3 in order of commitment, Middle Pledge in Center)
2. "Pledge before TryGiving" ... as above but with Try Giving and The Pledge swapped, and Try Giving (in the center) highlighted
3. "Separate Bullets for Other Pledges" (see below)
Three versions of this page were randomly presented
Statistics on Google Analytics: This records only 'pressed any button' (any pledge) as the successful outcome.
From shared image from Google Analytics:
'Experiment sessions' (observations) by treatment (as labeled on Google Analytics shared image):
Original: 2588
Pledge before Try Giving: 2686
Separate Bullets for Other Pledges: 2718
Total: 7992 sessions (=2588+2686+2718)
3. Where is the data stored ... [noted above]
The "separate bullets for other pledges" seems to have been the most successful, with an 0.49% higher (percentage point) incidence rate than the 'Original', i.e., a 22% higher rate of pledging (2.69 vs 2.20).
These differences seem unlikely to be statistically significant in a conventional sense. Still, Google Analytics' (presumably reasonable) Bayesian model states an 80% chance that this is the best treatment, and this seems useful and informative.
The "Pledge Before Try giving" treatment performed substantially worse than the original.
Downloading the 'Analytics data' behind the above graphs, we see:
This suggests it is very reasonable to think that 'Separate Bullets' is substantially better
a 2.5% chance that 'Separate Bullets' (SB) has an 18% (or more) lower conversion rate than 'Original'
a 22.5% chance on SB being between 18% worse and 4% better
a 25% chance of SB being 4-20% better
a 25% chance of SB being 20-36% better
A 22.5% chance of SB being 36-76% better
A 2.5% chance of SB being more than 76% better
We can also combine intervals, to make statements like ...
a 50% chance of being 4-36% better
a 50% chance of being 20-76% better
For 'Pledge before...' (PB) we can state, e.g.,
PB has a 75% chance of being at least 11% worse than Original
and a 50% chance of being at least 23% worse than Original
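These interval statements can be approximately reproduced with a simple beta-binomial simulation. A hedged R sketch: the conversion counts are backed out from the reported rates (2.20% of 2,588 and 2.69% of 2,718, rounded), and we use flat Beta(1,1) priors, which need not match whatever prior Google's model uses:

```r
# Beta-binomial posterior comparison, roughly mirroring the Google model.
sessions <- c(original = 2588, separate = 2718)
conv     <- round(sessions * c(0.0220, 0.0269))  # backed-out conversion counts
draws    <- 1e5
p_orig <- rbeta(draws, 1 + conv["original"], 1 + sessions["original"] - conv["original"])
p_sep  <- rbeta(draws, 1 + conv["separate"], 1 + sessions["separate"] - conv["separate"])
mean(p_sep > p_orig)                                       # P(Separate Bullets beats Original)
quantile(p_sep / p_orig - 1, c(.025, .25, .5, .75, .975))  # relative-difference quantiles
```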
Perhaps giving people more options makes them indecisive. They may be particularly reluctant to choose a “relatively ambitious giving pledge” if a less ambitious option is highlighted.
This could also involve issues of self and social signaling. If the 'main thing' to do is a 10% pledge (as in "separate bullets"), then this may seem a straightforward way of conveying 'I am generous'. On the other hand, if the 'Further pledge' is fairly prominent, perhaps the signal feels less positive. And if the '1% pledge' is made central, 10% might seem more than a necessary signal.
The "pledge before try giving" may perform the worst because it makes the 'Try Giving' pledge a particularly salient alternative option. (In contrast, the "Original" at least makes 'The 10% Pledge' the central and the middle option.)
I am assuming that the 'outcome being measured here' is whether the person 'clicked on any giving pledge'; this is what Luke has conveyed to me
I assume this is 'conversions ever from this IP', and 'sessions' represents 'how many different IPs came to the treatment'. If it's something else (e.g., each 'session' is a 'visit' from an individual), this could reflect these people converting in fewer sessions but not necessarily being more likely to convert overall. Even if this is 'by IP' the alternative interpretation 'not converting now but maybe later' may still have some weight if people are entering through multiple devices.
See public 'open science' work in progress and preliminary results
April 2021 mailing addressed to ICRC's existing donor base of:
active (regular donors),
warm (last donation between 12+ and 24 months ago) and
sleepy (last donation 24+month ago) donors.
169,919 donors (active donors (58,330; 34.14%); warm donors (48,672; 28.49%); sleepy donors (62,758; 36.73%))
Mailing goes out to donors in Switzerland (all parts: German, French and Italian)
Academic-linked authors: David Reinstein, Josh Lewis, and potentially others
2. 10% of the time people watched the full video!
E.g., 5% of people chose to watch the entire 13 minutes of one video.
For these videos, older people and men were more likely to choose to continue watching
See actual texts for design and timing
Bayesian Test of Difference in Proportions (as in the linked example), with an informative beta distribution for the prior over the incidence rate in each treatment, with parameters based on the incidence rates for similar campaigns in the prior 2 years.
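A hedged simulation sketch of such a test in R; the prior calibration and the trial counts below are placeholders for illustration, not ICRC figures:

```r
# Difference in proportions with an informative Beta prior calibrated to
# past campaigns. All numeric values below are illustrative placeholders.
prior_rate <- 0.03; prior_weight <- 500    # ~3% incidence, prior pseudo-N of 500
a0 <- prior_rate * prior_weight; b0 <- (1 - prior_rate) * prior_weight

n_A <- 2000; conv_A <- 70                  # placeholder trial counts
n_B <- 2000; conv_B <- 55

draws <- 1e5
pA <- rbeta(draws, a0 + conv_A, b0 + n_A - conv_A)
pB <- rbeta(draws, a0 + conv_B, b0 + n_B - conv_B)
mean(pA > pB)   # posterior probability that treatment A outperforms B
```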
Where situations arise that have not been anticipated in our preregistration and pre-analysis plan, we will try to follow the Don Green lab's standard operating procedures unless there is a very strong reason to deviate, which we will specify.
These are organized and linked here.
Note: we present some more in-depth analyses and graphs in the Quarto notebook, along with a code and data pipeline
In the graph below (pasted from the Quarto notebook), we show these year-on-year upticks in context.
These results may only apply narrowly to the GWWC pledge case, and even here, we have some caveats. However, it loosely suggests that when making a call to action, it may be most effective to present the most well-known and expected option most prominently, and not to emphasize the range of choices (see further below).
Getting people to take the GWWC pledge may be seen as an important outcome on its own. It may also have knock-on effects on getting people engaged in the Effective Altruism community and other EA activities, such as EA career impact decisions.
Tendency to choose 'middle options'
GWWC has three distinct pledge options, as shown (link from October 2020).
Statistics are available on Google Analytics/Optimizely. Reinstein has access to this and is planning to input it into R for more detailed analysis, to be reported in the dynamic document.
Equal likelihood of each version being presented
The version presented stayed constant
See the linked notes for details on data extraction from the interface
Dillon: there is possibly a more sophisticated approach to this than what Google is doing ... the better prior is an 'empirical Bayes' approach (but it may be controversial). See this introduction to empirical Bayes.
Our 'posterior' probability thus infers the following quantiles for the relative difference in conversion rate vs. 'Original':

| Treatment | 2.5% | 25% | 50% | 75% | 97.5% |
| --- | --- | --- | --- | --- | --- |
| Original | 0% | 0% | 0% | 0% | 0% |
| Pledge Before Try Giving | -50% | -33% | -23% | -11% | 18% |
| Separate Bullets For Other Pledges | -18% | 4% | 20% | 36% | 76% |
DONATE TODAY: your donation can supply food parcels to a Syrian family
DONATE TODAY: your donation can supply food parcels (ca. 17CHF/parcel for one month) to a Syrian family
DONATE 50CHF TODAY: your donation can supply 3 food parcels (ca. 17CHF/parcel for one month) to a Syrian family
DONATE 150CHF TODAY: your donation can supply 9 food parcels (ca. 17CHF/parcel for one month) to a Syrian family
DONATE 50CHF TODAY: your donation can supply food parcels to a Syrian family
DONATE 150CHF TODAY: your donation can supply food parcels to a Syrian family
DONATE TODAY:
- With 50CHF you offer 4 Hygiene kits to Syrian families
- With 100CHF you offer 14 school kits to Syrian students
- With 150CHF you offer 9 Food parcels to Syrian families
"Charity Elections" (in schools): trials in preparation, extensive consultation
80,000 Hours: trials in progress, preparation, and analysis; some work joint with Rethink Priorities. Note: we have limited permission to report on these trials
The Life You Can Save: Trials run and in preparation, limited permission to share
High-Impact Professionals (HIP): Advising on surveys and approaches
GiveWell (discussions and consultation)
... And other organizations that didn't want us to report on this publicly
Reinstein, FB ads tied to fundraisers.
Note: this information is subject to change; updated ~ Apr. 2022
My costs have been:
about $0.01 per impression
about $0.50 - $1.20 per click
Targeting at Universities ... Facebook's estimates
The estimated cost per impression ('reach'?) and per click varies with the targeted audience. In general, narrower targeting is estimated to be more costly. I think this is because a larger audience allows FB to serve the ads to a larger number of people who tend to be click-happy. Some data points:
For Oxford, ‘In College’, living in the UK: They claim we will get 4-18 clicks per day for $50 per day over 2 days (29 Mar 2022 check on FB ads manager)
If I put in Birmingham instead I get a fairly similar figure.
If I remove the only-one-university narrowing, it gets cheaper. They claim I’ll be able to get 86-250 clicks per day for the same cost …
EA groups (employees) within Meta
If you run a lot of ads FB will assign you external consultant helpers. They are somewhat helpful, but they don't seem to know everything.
28 Nov 2022, Zoroob:
The three-part series of virtual webinars provides nonprofits with advertising training and best practices around how to use Meta technologies to further their missions:
Catholic Relief Services/DonorVoice experiment
See public 'open science' work in progress and preliminary results HERE
'Thanksgiving email' trial run in 2 subsequent years
Super-overoptimistic information (2018), Moderately overoptimistic information (2019)
Note 7 Mar 2023: I just started this page, it is far from complete
#other-marketing-implementation-resources: Including (new) EA-aligned marketing groups
Which links THIS spreadsheet
Much of which is embedded into THIS Airtable view as well (which will have some further comments on the relevance, as well as organizations that are not-so-EA related, with discussion)
innovationsinfundraising.org - "The IiF wiki collects and presents evidence on the most successful approaches to motivating effective and impactful charitable giving, and promotes innovative research and its application." This precedes and is partially integrated into the current resource
We are an academic collective and research non-profit, dedicated to providing public communication campaigns with cutting-edge research and rigorous tools for message development.
"Crowdsourcing" ... Recent research suggests that regular people can often be far more effective than experts at predicting which messages will best resonate with others in their community.
On Adaptive design/sampling, reinforcement learning...
The challenge is that the “space” of messages for campaigns to decide between is enormous — there are very many things a campaign could say and many different ways to say them. Unfortunately, research shows that relying on theory and expert guidance about “what works” when designing campaign messages is unlikely to be effective by itself, because “what works” is difficult to predict and can change dramatically across contexts (e.g., see [1], [2], [3], [4]).
Efficient message search. We design research pipelines that allow campaigns to explore the large space of potential messages more efficiently, and to quickly zero-in on the most impactful messaging strategies. Our methodology is based on a combination of large-scale adaptive online survey RCTs, Bayesian machine learning and surrogate metrics.
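To illustrate the adaptive component, here is a minimal Thompson-sampling sketch in R (a generic Beta-Bernoulli bandit over message 'arms'; the arm count and true response rates are assumptions for illustration, not their actual pipeline):

```r
# Thompson sampling over message 'arms' with a binary outcome.
n_arms <- 5
succ <- rep(0, n_arms); fail <- rep(0, n_arms)

choose_arm <- function() {
  which.max(rbeta(n_arms, 1 + succ, 1 + fail))  # draw from each arm's posterior
}

update_arm <- function(k, y) {                  # y is the 0/1 outcome for arm k
  if (y == 1) succ[k] <<- succ[k] + 1 else fail[k] <<- fail[k] + 1
}

# Simulated loop against assumed true response rates:
true_rates <- c(0.02, 0.03, 0.05, 0.03, 0.04)
set.seed(7)
for (i in 1:5000) {
  k <- choose_arm()
  update_arm(k, rbinom(1, 1, true_rates[k]))
}
succ + fail   # allocation concentrates on the best arm over time
```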
EA seeks to amplify its impact through movement-building. Organizations like 80,000 Hours and CEA are putting substantial resources into developing and expanding the EA community. Building EA groups has been at the core of this agenda, especially in elite and influential places (such as top universities). Key aims include 'creating highly engaged EAs' and encouraging people to pursue impactful careers.
Our Resource Airtable (in-progress)
Currently, university EA groups operate in conjunction with the Centre for Effective Altruism, but with high levels of autonomy. There is only limited collaboration between groups. Such collaboration could allow them to achieve economies of scale and scope, run more systematic and powerful trials, and share insights and methods that increase student engagement.
The EAMT hopes to help coordinate this, consolidate the evidence, and provide accessible tools to newly-formed groups. We want to help avoid repeating errors and 'reinventing the wheel' each time.
The efforts and experience of individual EA groups can provide contextual evidence and insights. The EAMT aims to aggregate this knowledge, find generalizable principles, and disseminate this to the wider EA community. We are focused on meaningful medium-term outcomes, e.g.:
Membership and participation in EA organizations, and markers of post-university involvement
How career plans are impacted (focusing on particular programs and paths)
How research and discourse at universities can be influenced
The programs below also aim for generalizable principles; e.g., their 'starter toolkits' are implemented across a range of cities, universities, and settings.
CEA has discontinued its focus on university programming, passing funding and efforts on to Open Philanthropy. However, CEA is still involved in promotion through the University Group Accelerator Program (UGAP), which offers guidance and resources to newly formed groups. Furthermore, CEA's Community Building Grants (CBG) Program helps develop national and city-based groups (outside of universities).
UGAP's Outreach Handbook may be the most useful resource here. Its recommendations have been summarized from different data points: some formal testing, some anecdotal, and some intuitive.
CBG focuses on supporting city groups, providing grants to support their activities and resources to help with expansion. However, these resources and support systems are currently backed by little data on what works in EA community building. (The CBG Support Survey identified this as a major bottleneck; we hope to collaborate to help them improve this.)
The EA Group Organizers Survey (2020) is a collaboration between CEA and Rethink Priorities. It analyzes the changes in EA groups yearly, with two main components:
The growth and composition of EA groups and their activities
Organizers' opinions on the status of their groups
The first component gives insight into priorities and progress. The second can help guide our research and provide insight into the tools required by group organizers to increase group interaction and outreach.
See especially: "Events and outreach"
The University Organizer Fellowship provides funding for part-time and full-time organizers helping with student groups focused on effective altruism, longtermism, rationality, or other relevant topics at any university (not just focus universities). This has replaced CEA's Campus Specialist and Campus Specialist Internship programs.
The Century Fellowship, a selective 2-year program that gives resources and support (including $100K+/year in funding) to particularly promising people early in their careers who want to work in areas that could improve the long-term future. (Intended partially for particularly strong Campus Specialist applicants.)
80,000 Hours is actively targeting university students and offering them guidance on high-impact career paths. (see private Gitbook, if you have access)
There are some further initiatives in this area but most of the material cannot be shared at the moment (see private Gitbook).
In this section, we are putting together documents, trials, and knowledge currently being gathered by different EA groups. As we increase our collaboration with these groups, these trials, ideas and documents will become integrated with the Gitbook and EAMT's work, forming a basis for future work and testing.
This is our basic understanding of the processes used to draw in new members to EA university groups and fellowships, and how members progress through different stages of engagement. Each stage gives us grounds for testing through the different variations of these approaches. This is not just about testing which methods work for attracting the highest number of new members (i.e., which 'call to action' to use at activity fairs, etc), but also increasing engagement and developing high-level EAs (i.e., fellowship program alternatives, discussion group topics, etc).
(Above: a preview of funnel map; see Google Doc for full description and work in progress)
Awaiting response from Stanford EA.
Currently limited to private Gitbook.
Currently limited to private Gitbook.
EA Israel's Strategy Document discusses their strategy in depth. Many of their findings are not specific to Israel or to country-wide EA groups, making this a useful resource for other EA groups.
Useful findings will be synthesised and integrated here in the future.
We have been independently contacting organizers that are known to be actively seeking to test outreach methods, and also publicly via a callout post on the EA Forum. An important aspect of the work here is to bring together people who are active in this space but working independently. The airtable below presents our current (non-exhaustive) list of groups or organisations that have relevant knowledge (strategy documents, marketing guides, etc), or have done some form of independent testing.
Thus, we hope our efforts will be valuable to these initiatives and groups, by providing and sharing evidence on successful approaches to increasing engagement.
Note that the survey does not collect data from the group's *members*, although they do ask about the overall numbers of people who engaged with each group.
Next to consider: "Opportunities" (to 'do', measure, learn) ... we should make an inventory here
In case you don't like writing in this Gitbook, I created THIS GOOGLE DOC to discuss
Practical issues with running experiments and trials and gathering data on these and making reliable inferences -- see subsections below
(next subsection: )
: Design and implementation of marketing; EA aligned and savvy.
See
JS Winchell offers advice on how to do this right and leverage certain grants and do good marketing.
JS Winchell has started an agency called ""; they are working to implement and measure the impact of EA marketing for the highest-value clients.
provides free tech support and development to organizations in the (EA) community.
Testing 'implementation strategies'
GWWC web site at point of email signup
Email lists
immediate: subject headers, with 'open rates' as the dependent variable (see the sketch after this list)
medium-term: all outcomes tied to email
Facebook: but the targeting algorithm may frustrate randomization (see .). Can it be switched off?
See
This is helpful if the important outcomes can be tracked by ZIP code/post code/address.
Online display advertising
Google search
YouTube
Facebook (presumably)
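For the email-list case above, comparing subject headers by open rate can be as simple as a two-sample proportion test in R (the counts here are hypothetical):

```r
# Sketch: comparing open rates for two subject-line arms (hypothetical counts).
opens <- c(212, 258)    # opens for subject line A vs. B
sent  <- c(1000, 1000)  # emails sent per arm
prop.test(opens, sent)  # tests for a difference in open rates
```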
We can use some of the same strategies as above to test "rich content", i.e., short or even long talks, book chapters, podcasts, and so forth.
Question: If our aim is to change the culture of giving in general, what kind of people should we be targeting?
Influencers (People with lots of social influence)
Low-hanging fruit (i.e., people who are naturally predisposed towards effective giving, pledging, & EA)
Email from JS:
The YouTube team holds quarterly workshops to explain how best to build and use your organic (not paid) YouTube channel. Based on previous discussions it sounds like this is something that might be of interest to your orgs.
Note: This is aimed at beauty and fashion brands but I'd imagine 80% of it would apply to GWWC/80k/1ftw
Agenda for the workshop:
Explain why YouTube is crucial for your brand identity
How to claim your narrative on this platform
Reach and engage new and existing audiences through content
A review of channel best practices
Enhance your channel's search and discovery potential
Develop an always on strategy
Research advancement manager:
Meta has just released a recorded series of videos () to help non-profit organizations meet their year-end fundraising goals. Some of these materials may also be helpful for researchers using Meta ads (e.g., materials on designing effective ad creatives), so I am passing the info along. Blurb below.
: Get started with Meta advertising with our today.
The session also features that enable donation transactions within the Facebook app.
: Learn what great nonprofit creative can look like with best practices from Meta.
Consider saving our to learn more about the five key creative considerations that apply to cause-driven campaigns.
: Introduce yourself to measurement best practices on Facebook and Instagram! Afterwards, explore split testing, lift measurement, and the experiments tool, on our .
to view resources to help your organization drive more positive change, and do more good.
These and other videos for non-profit advertisers can be found on the "."
(collecting data)
Seems particularly useful, but access is limited; they hope to make it more generally available sometime around mid-spring 2023.
(a list of orgs in the 'EA effective giving' space; private gitbook atm)
Paid participants may allow richer feedback (see )
The 'mysterious sauce' ... JS knows about ()... we don't always have a "theory" but it might be meaningful.
See also
Idea: Compare different outreach methods on the basis of "cost per pledge" (or per "whatever-metric-we-use"). (Outcomes: ... & ... )
Public lists of political donations (e.g, )
Search/visiting webpages about charity effectiveness/merit (e.g., Charity Navigator)
Register your interest for future workshops .
JS: My YouTube video best practices are here, but note YouTube is a very different platform than FB/IG (sound is on 98% of the time, no scrolling, you have at least 5 seconds to hook their attention, ads are much longer)
Guidelines and resources on how to get ads and marketing going, how to finance it, tips on how to do it right
How to get data from trials of Facebook ads
Go to "the reporting suite in Meta ads manager"
2. Specify some filters:
This gets us the screen below
3. Specify the date range.
4. Export simple results for Campaigns
Click 'Reports' ... upper right.
We can 'create a custom report', which saves this for later tweaking, or merely 'export table data'. I will do the latter for now:
Note: I chose CSV and did not include summary rows, to avoid confusion later.
Now I import this data into R (I usually use code but let's do it the interactive way for illustration)...
It seems the option 'include summary row' was probably not wanted here; that row, with a blank 'campaign name', could cause confusion.
The export seems to have removed the "bid strategy" column, and added 'reporting starts' and 'reporting ends' from the filter. Otherwise, everything seems the same as in the Ads Manager view, although some labels have changed.
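For reference, a minimal sketch of the 'code way' (the file name is illustrative, and the column names depend on your export settings):

```r
# Import the exported campaign CSV and drop summary/blank rows.
library(readr)
fb_campaigns <- read_csv("fb_campaign_export.csv")             # illustrative file name
fb_campaigns <- subset(fb_campaigns, !is.na(`Campaign name`))  # avoid blank-row confusion
str(fb_campaigns)
```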
We see three tabs
Campaigns
Ad sets for 1 campaign
Ads for 1 campaign
Campaigns
Here we have 7 campaigns, each with separate budgets, and start and end dates (although these mainly overlap).
It looks like some campaigns were set up for direct comparison, or "A/B" tests perhaps, with the exact same budgets and end dates, and similar names:
Ad sets
Here, there are 52 total 'ad sets' across all campaigns.
I'm going to export this as a csv too, in case it's useful.
Ads
There are also 52 "ads"; it seems in this case, one per ad set:
The information in the 'ads' table seems the same as in the 'ad sets table' ... other than a link to preview the ad content itself (which I don't seem to have access to atm).
Add section: How to set up GA
Some key 'flows and tips'
'Home'
'behavior', 'site content', 'all pages'
remember to set date range!
Acquisition, all traffic, channels: here 'social' (probably) tells you who came from Facebook etc
Acquisition, all traffic, Source/medium drills down into this
DR: I'm not sure how to get 'all the data', but I have been able to get data on, e.g.,
a set of outcomes,
over a set period of time, (a particular month and the same month in the previous year)
broken down by another feature (by city)
Then search and select your desired ‘metrics’ (outcomes) of interest. “Users” and “sessions” seem pretty important, for example.
Next you can break this down by another group such as “city”. You can put in 'filters' too, if you like, but so far I don't see how to filter on outcomes, only on the dimensions or groups.
I don't know an easy way to tell it to “get all the rows at once,” but if you scroll to the bottom you can set it to show the maximum of 5000 rows.
Next, scroll up to the top and select export. I chose to export it as an Excel spreadsheet, as this imports nicely into R and other statistical/data programs.
We were able to do this in two goes, but for larger datasets this would be really annoying. I imagine there is some better way of doing this, maybe using an API interface for Google Analytics to just pull all of it down.
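One route is the googleAnalyticsR package, which wraps the Google Analytics Reporting API. A rough sketch (the view ID, dates, and fields below are placeholders; check the arguments against the package documentation):

```r
# Sketch: pull the same report via the Google Analytics Reporting API,
# using the googleAnalyticsR package. All IDs/dates/fields are placeholders.
library(googleAnalyticsR)
ga_auth()  # interactive OAuth on the first run
ga_df <- google_analytics(
  viewId     = "123456789",                    # placeholder view ID
  date_range = c("2021-11-01", "2021-11-30"),
  metrics    = c("users", "sessions"),
  dimensions = c("city"),
  max        = -1                              # fetch all rows (no 5000 cap)
)
```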
A partial workaround is to apply a 'filter' to discard rows you don't need… click 'advanced' at the top and…
Here, the “Effective Giving Guide Lead Generation” campaign ran late November 2021 to January 2022. (Be careful in specifying the dates; the interface is weird.)
After specifying these dates, more information comes up in the basic columns:
suggests there may be substantial delay. But does this only apply to sites with a great deal of traffic?
After logging in and selecting 'all domains'...
Select 'customization', 'custom reports', 'new custom report'
Below, we give one example from a relevant context, illustrating (with screenshots) what choices you might make, what it would look like, and how to implement it.
See also: Facebook split-testing issues and #videos-facebook
Updates/general advice (Sep 2022): To do any good tracking and optimization through Facebook, you should set up the Meta Pixel and Conversions API as soon as possible.
You may want to jump to the #optimizing-and-pixels (WIP) section.
"Meta Business Suite"(https://business.facebook.com/) is the starting point of your ad campaign. If you have a Facebook Business account, you should have a "Meta Business Suite":
Next, click on "Ads manager" (See the megaphone on the left).
When creating a new "Traffic campaign" (the 'cold traffic campaign' referenced HERE), there are a lot of options to help you optimize your delivery while minimizing your expenses.
You need to opt in to these tools by ticking "create A/B test" and "Budget Optimization" on the first page of your "ad campaign manager." Since there is no downside (we would like to learn which ad design works best), we decided to opt in to each of these.
Budget optimization is closely related to the choice of the target group. In general, the larger the target group, the cheaper it becomes to reach a certain amount of "link clicks".
Suppose we wish to create a targeted ad for a particular Facebook audience. For example, we might wish to put an ad...
in the 'feed' of US Americans who are interested in charity or volunteering or philosophy
giving them a link to a page encouraging them to learn about EA
We can use the "schedule and duration" function not only to automate the timing of our campaign, but also to estimate its cost. For example, we assume that we need 800 participants to click-through to start the 20 fundraisers (i.e., a rate of 2.5%).
Below, we see that FB estimates 172-497 link clicks per day for 10 euros per day (for a different case).
You can specify
Demographics
Interests
Behaviors
"Include" seems to be the default when specifying these ... it 'expands the audience'. You can click 'narrow further' to constrain the audience.
We have some evidence that narrower targeting helps. An obvious candidate is
The next big choice is 'where do you want to drive traffic?'. You'll enter more details about the destination later.
Since we want people to click our web app, we chose "website".
We may have several versions of the ad we want to try out, and we want Facebook to iterate towards the one that is more successful using their algorithm. Ideally, we would like to learn as much as we can about 'which ads perform better on which audiences'.
We can set up Facebook's ("Meta") algorithm to dynamically optimize over which version will get the most clicks.
Where do we actually specify, enter, and style our ad content?
Finally, we have to decide which delivery we want to optimize.
We may want the ad that drives the most traffic to our page. Therefore, we choose the option "link clicks".
However, we might instead want FB to optimize the ad presentation for which ad leads not just to the most 'clickthroughs' but to the most "conversions" or some other action taken on our page. To do that, we need to set up a "Meta Pixel"; see #optimizing-and-pixels
DR: In my past experience, you ended up paying Facebook based on the number of "clicks" you got, not simply on how long your ad was up. But it's probably a combination of these, and there are probably different pricing plans. You can tell Facebook to put a limit on either of these so you do not go "over budget". Facebook will aim to spend your entire budget and get the most link clicks using the lowest-cost bid strategy.
Currently EUR 315 is the max for new users ... but for our present pilot we may want less than this. (Check: how much do we expect to pay for 800 clicks? Let's split this up into ... first 100 clicks, next 300 clicks, ... to see if it's going OK.)
Finally, you enter the third and last page of the ad creation process. Here you have to verify your ID and Facebook page and choose the actual design of your ad versions. ["of which the most important one is whether you want to have a video or single image." (?) ]
The last step before publication is to specify the destination for your campaign.
We chose a website and simply copy the URL into the mask to make sure the ad is linking people to the right destination.
The pixel includes content from Facebook that needs to be integrated into your website/page of interest. (To do: link instructions for this).
One simple way of doing this: "Events setup tool"
Once you are in the ads manager for an ad, go to the 'Events Manager':
"Add events", choose "from the pixel"
"Events setup tool"
Put the URL for your site in and 'Open website'
As seen below, this opens our page and shows what has already been associated with a Pixel. Here, the "create fundraiser" button on this page has been associated with the "Initiate Checkout" event. (We use default names Facebook is familiar with, even though there is no 'checkout' in this case.)
("Facebook Pixel Helper" extension in Chromium might be helping here, but I'm not sure how).
For example, I could click 'who are we' on a page and associate it with 'view content'
I could 'add a value' to this, if it makes sense.
Can I use this later to have FB optimize for 'net value' of a user generated on the page? This might be a useful way to assign greater importance to certain things, even if they aren't actually monetized.
After this 'finish setup' ... it gives you the chance to see what you have asked it to do and confirm or cancel it.
Once you have nice pixels set up, you can use this in helping Facebook decide which versions of ads to serve, which audiences to serve them to, etc. You set up your ad, define an objective etc...
Here we're choosing 'initiate checkout', which we defined as clicking on a 'create fundraiser' button on the first page of our site (early in the funnel)
The warning below might not matter as we haven't had our page up for a while. But we have also been told elsewhere that before you can get the ad to optimize for conversions ... you first need to have the pixel set up and the ad running, optimizing for views. So this might still be a concern.
I assume that the same 'conversions' target defined above is used in optimizing the 'dynamic creative' if you turn that on.
The next step is to select "Create a campaign" and choose an "objective"... the interface gives you some idea of what these aim for:
From a recent relevant experience in our group's context...
Don't forget to use the search tool within 'browse' to find ways to do careful targeting
During this process, you can see a concise statement of your choices, and the estimated audience size further up on the page:
"Track new button" lets you see what click options you could associate with a pixel.This highlights clickable things you can do this with. ('Create fundraiser' is not highlighted, probably because it's already been assigned).
Define your goal as 'conversion', and define what 'conversion' corresponds to in terms of pixels:
Facebook tracks people for a while. So in optimizing, you can change 'what time period of outcomes it attributes to which (version of the ad)':
For a trial to yield insight, we need to be able to track and measure meaningful outcomes, and connect these to the particular 'arm' of the trial the person saw ... (if they saw any arm at all)
In this section we discuss how to see the results of your promotions and trials, and how to access data sets of these results that you can analyze.
To test content in more depth than an A/B trial permits
Better control over 'who is participating' and how much attention they are paying
Things more towards 'basic background research'
Closer to a 'representative sample'
Prolific participant recruitment: created specifically for academic research. Our impression is that this is among the highest-quality panels, although there is some controversy over this.
CloudResearch: CR approved Mturk
CloudResearch: Prime Panels
Crowdflower
Positly: https://www.positly.com/
Qualtrics (panel)
Lucid
Dyndata
For each proposed/ongoing/past trial, we should report the following minimal details, with links (proposed template). If you don't have time and you have another clear presentation of most of this, please link or embed it.
Please keep your answers brief -- if you want to give more detail (which is not necessary) please link a later section or external page.
Short version of this template as Google Doc HERE (link copy-opens a new version for you to work in)
Firstly, what is this promotion trying to do (e.g., 'encourage signups for giving pledges')?
But more importantly, what are you trying to learn here... What might you have a better understanding of after the trial than you did before the trial?
E.g.,
Specifically:
Does the opportunity we offer to sign up for an 'accountability partner' increase or decrease the rate at which people DO XXX activity?
Does it lead to greater overall XXX-linked donations per visitor over the next 1-year interval?
Generally:
Does 'social accountability' help to encourage XXX activities and promises and the fulfillment of these? Does the 'fear of being held accountable' discourage people from making commitments?
(Optional: brief on background theory and previous evidence)
You can enter more than 1 person here, including an external organizer (like JS Winchell), but ideally, also someone inside the organization.
Add 'academic/research lead' here if there is one
The present Gitbook/and a nested Github repo folder could be ideal. Please give a precise link so others could access it.
(Is it on a web page, a google advertisement, a physical mailing, etc)
Who will be targeted or who do you expect to be part of the trial?
(Somewhat optional) How many people (or 'units') do you expect to be involved (median guess)?
(Optional): How many do you expect will have a 'positive outcome' (e.g., a 'conversion')?
Description, link exact language/content if possible
At what level is it varied? (individual visitors, postal codes, days of the week, etc)
How are treatments assigned ('blocked randomization', 'adaptive/Thompson sampling', etc.)?
If you are using a 'set Google, Facebook etc algorithm', just input the settings you used here, and/or link the (Google, FB, etc) explanation
How many/what shares are assigned to each treatment?
What measures (outcomes, other features) will be collected?
When and how
Where will the data be stored, who will have access
Planned analysis methods, preregistration link, IRB link, connection to other projects and promotions
Did it go as planned? Any departures? (Timing, randomization, design changes, etc)
How much/what data was collected? How many observations?
Where is the data stored (also link/adjust the above), who has it, and under what conditions?
"Partners and stakeholders opinions": were they happy with the trial? Did they seem to think it was a success?
Simplest statement (e.g., "3% donated in the treatment versus 2.2% in the control, with an average amount raised of $4.30 in the treatment and $3.10 in the control")
Preliminary interpretation, with a statistical test if possible (e.g., 'Google Optimize states an 80% chance that the treatment outperformed the control'; 'a Fisher's exact test yields p=0.06 that a positive donation was more likely in the treatment than in the control')
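For instance, the Fisher's-exact check above is one line of R (the counts here are hypothetical, since the example does not state sample sizes, and the p-value depends heavily on those):

```r
# Hypothetical counts consistent with rates like '3% vs. 2.2% donated'.
donated  <- c(30, 22)        # treatment, control
visitors <- c(1000, 1000)
fisher.test(cbind(donated, visitors - donated))  # 2x2 Fisher's exact test
```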
"Full analysis"
Who will do it, and when will it be done?
Link to 'where' it will be done (both the follow-up to the pre-analysis plan and the full write-up, if applicable)
Possibly: Briefly characterize the overall conclusions/state of the analysis here (state the date last updated)
Feeding synthesis and meta-analysis
Which generalizable questions does this inform?
Is data sharable? Key comparable outcomes?
What other work/trials does this relate to?
State of meta-analysis
Facebook's Ads Manager and Google Analytics often report results that seem to be discrepant. Below is one particular case, with possible explanations.
Facebook: We have 50k+ unique impressions, and 1335 clicks
Google Analytics records only 455 page views, 403 users
And only about 20 doing any sort of Engagement like scroll or click (if we read it correctly)
JS: main reasons [DR: slightly edited]
1. "Do they click the ad and shut down before page comes up?" Yup! Closing the page before the redirect fully loads. Facebook will be as generous as possible with their click reporting.
2. ... If a user clicks on the FB ad twice within 30 minutes, then Google Analytics would record that only as a single user and a single session.
3. If a user has JavaScript disabled or doesn’t accept cookies, then Google Analytics doesn’t track.
Leticia at Facebook: these can be mistaken clicks; this is common. You need a pixel to fix this, or you can change the metric to 'landing page view'.
You may want to see or export crosstabs of one outcome, user feature, or design feature, by another. Sometimes you just want to see these quickly, but this might also be a way to extract the 'raw data' you wish to analyze elsewhere.
Start new pivot table
From within Ads Manager, go to 'Ads Reporting' (3 Aug 2022, updated interface):
Click "Create Report" --> Pivot table
2. As before, make sure you've selected the right date range, and (redo) any relevant filters
Here I add a filter for 'campaign name contains general', because I'm specifically trying to pull down information on 'which video people saw' in this group (which needs a special setting to access, as noted below).
3. "Customize pivot table" – "Breakdowns" ... the things you want this to disaggregate across (sums and averages within groups)
the 'campaigns', the 'ad names'
timing, demographics
Drill down to "Custom breakdowns", "Dynamic Creative Asset", to get it broken down by the text linked to the ads:
However, some breakdowns are not compatible with other breakdowns (maybe for privacy reasons?). For example, if I tick 'Gender' I cannot also have results broken down by 'Image, video, and slideshow', at least in the present case (perhaps because it would narrow things down to too few observations?).
4. "Customize pivot table" – "Metrics"
Select the things you want reported, and deselect things that are not interesting or irrelevant to this case (like 'website purchases') or numbers that can be easily computed on your own
Normally, I'd suggest leaving out the redundant 'Cost per Result', but it's probably good to keep as at least one sanity check on the data.
Other stuff like 'video play time' could sometimes be very relevant, but I'll leave it out for now
5. (Optional) Conditional formatting
This could also be helpful if you are using the Ads Manager tools in situ, but obviously this has no value for downloading.
6. Save report for later use, share
If you think the report is useful in-situ, you can also share a link
7. Export the data
As in #extracting-simple-results...
(or consider direct import into R using tools like the rfacebookstat package)
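A rough sketch of what that might look like (the argument names here are our best recollection of the rfacebookstat interface; treat them as assumptions and verify against the package documentation):

```r
# Sketch only: pulling ad-level stats with rfacebookstat.
# Argument names below are assumptions; check the package docs before using.
library(rfacebookstat)
fbAuth()  # authenticate once; the token is cached
ads <- fbGetMarketingStat(
  accounts_id = "act_0000000000",  # placeholder ad account ID
  level       = "ad",
  fields      = "campaign_name,adset_name,ad_name,impressions,clicks,spend",
  date_start  = "2021-11-01",
  date_stop   = "2022-01-31"
)
```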
Understanding how this tool works to test different versions of pages. GWWC Pledge page trial as first context
Mapping the key non-obvious features of running and analyzing these A/B trials using the Google analytics/optimize system.
Reporting and considering this in the context of the GWWC Pledge page (options trial)
Clicking on a particular 'experience' in the 'container'...
(if you have been granted read and analyze permission), will open the useful 'Optimize Report' (which Google explains here)
The overall start/end and 'sessions' are given first. What are "sessions"? The short answer: 'sessions' are the number of 'continuously active' periods of an individual user. So individual users may have multiple sessions! (See #sessions-vs.-users below.) Here, there have been 7992 such 'sessions' over 81 days.
I am not sure where we can learn 'how many users there were'.
("View full chart" can give you a day-by-day breakdown of the number of sessions.)
The next section compares 'sessions' and 'conversions' by treatment, and does a Bayesian analysis. This seems the most useful part:
Above, the 'Separate block' (SB) seems to be the best performing treatment. Google calculates a 2.69% conversion rate for this (here, presumably the rate of people checking 'any' of the follow-on boxes).
Regarding the analysis: Google Optimize "uses Bayesian inference to generate its reports... [and] chooses its priors to be quite uninformed." The exact priors are not specified (we should try to clarify this).
But if we take this seriously, we might say something like ...
if our initial priors gave each treatment an equal chance of having the highest conversion rate ('being best'), and assumed a [?beta] distributed conversion rate for each, centered at the overall mean conversion rate ...
then, ex-post, our posterior should be that the SB treatment has an 80% chance of being best, our 'Original' has a 17% chance of being the best ...
Google also gives confidence intervals for the conversion rates for each treatment, with boxplots and (95%) credible interval statistics:
The grey bar for the baseline is mirrored in all rows. The 95% CI for the 'improvement over the baseline' is given on the right. But this is a rather wide interval. More informatively, if we hover over the image, we are given more useful breakdowns:
Although this does not exactly tell us the 50% interval 'improvement over the baseline' (this would need a separate computation), we can approximately infer this.
But fortunately it is reported in data we can download; see below "Download (top right)".
From that data, we get:
Modeled 'improvement over baseline' (posterior percentiles, as inferred from the statements below):

| Variant | 2.5th | 25th | 50th | 75th | 97.5th |
| --- | --- | --- | --- | --- | --- |
| Original | 0% | 0% | 0% | 0% | 0% |
| Pledge Before Try Giving | -50% | -33% | -23% | -11% | 18% |
| Separate Block For Other Pledges | -18% | 4% | 20% | 36% | 76% |
Our posterior thus implies (assuming symmetry, I think; and considering relative rates, not percentage points) that we should put:
a 2.5% chance on SB having an 18% (or more) lower conversion rate than 'Original'
a 22.5% chance on SB being between 18% worse and 4% better
a 25% chance of being 4-20% better
a 25% chance of being 20-36% better
A 22.5% chance of being 36-76% better
A 2.5% chance of being more than 76% better
We can also combine intervals, to make statements like ...
a 50% chance of being 4-36% better
a 47.5% chance of being 20-76% better
We report on this further, for this particular case, under #basic-results-outcomes
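For intuition on where percentile tables like the one above come from, here is a rough Beta-posterior simulation of the same kind of comparison. The conversion counts are hypothetical (only SB's ~2.69% rate and the ~7992 total sessions are reported above, so we assume roughly 2664 sessions per arm):

```r
# Approximate the Bayesian comparison with independent Beta posteriors.
# Counts are hypothetical, chosen to roughly match SB's reported ~2.69% rate.
set.seed(42)
draws     <- 1e5
post_orig <- rbeta(draws, 1 + 60, 1 + 2664 - 60)   # 'Original': ~2.25% (made up)
post_sb   <- rbeta(draws, 1 + 72, 1 + 2664 - 72)   # 'SB': ~2.70%
improvement <- post_sb / post_orig - 1             # relative improvement over baseline
quantile(improvement, c(.025, .25, .50, .75, .975))
mean(post_sb > post_orig)                          # posterior probability SB beats Original
```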
There is some repetition (can we 'mirror blocks'?)
Above, even though the treatment has been assigned randomly (presumably a close-to-exact 1/3, 1/3, 1/3 split), the number of 'sessions' differs between the treatments ('variants').
Why? As far as I (DR) understand,
while each individual user (at least if they are on the same machine and browsing with cookies allowed) is given the same treatment variant each time...
the same users may 'end' a session (by leaving or being inactive for 30+ minutes), and return later, getting the same treatment but tallying another 'session'. This suggests that users in the "Separate Block" (SB) treatment are returning the most (but also see 'entrances' below).
The final section gives the day-to-day breakdown of the performance of each treatment, presumably along with confidence intervals. This seems relevant for 'learning and improving while doing', but possibly less relevant for our overall comparison of the pages/treatments.
The 'Analytics data' gives us sessions and conversions by day and by treatment.
(Where no session occurs in a day for a treatment, it is coded as blank).
... this gives some other information, mainly having to do with the user experience.
"Unique page views" represent "the number of sessions during which that page was viewed one or more times." ... Recall "sessions" are periods of continuous activity.
"Entrances" seem potentially very important. According to Google:
Sessions are incremented with the first hit of a session, whereas entrances are incremented with the first pageview hit of a session.
In the present context, this suggests that the 'Separate block' page is inspiring users to come back more often, and to spend more time on average.
As noted, essentially: 'Sessions' are the number of 'continuously active' periods of an individual user
Analytics measures both sessions and users in your account. Sessions represent the number of individual sessions initiated by all the users to your site. If a user is inactive on your site for 30 minutes or more, any future activity is attributed to a new session. Users that leave your site and return within 30 minutes are counted as part of the original session.
The initial session by a user during any given date range is considered to be an additional session and an additional user. Any future sessions from the same user during the selected time period are counted as additional sessions, but not as additional users.
Discussion of issues in designing experiments/studies that are not specifically 'quantitative', but are important for gaining clear and useful inference
Academics usually try to make each treatment differ in precisely one dimension; these treatments are meant to represent the underlying model or construct as purely as possible. This can lead to setups that appear strange or artificial, which may itself provoke responses that are not representative or generalizable.
For example, in my '' (lab) work we had a trial that was (paraphrasing): 'we are asking you to commit to a donation that may or may not be collected. If the coin flips heads, we will collect the amount you commit; otherwise no donation is made.' It was meant to separate the component of the 'give if you win' effect driven by the uncertain nature of the commitment rather than the uncertain nature of the income. However, when we considered bringing this to field experiments, there was no way to do it without making it obvious that this was an experiment, or a very strange exercise.
When we consider an experiment providing 'real impact information' to potential donors, we might be encouraged to use the exact write-up from GiveWell's page, for naturalness. However, this may not present the "lives per dollar" information in exactly the same way between two charities of interest, and the particular write-up may suggest certain "anchors" (e.g., whole numbers that people may want to contribute). Thus, if we use the exact GW language, we may not be 100% confident that the provision of the impact information is driving any difference. We might be tempted to change it, but at a possible cost to naturalness and direct applicability.
There are very often tradeoffs of this sort.
In the present context, we have posted about our work, in general terms, on a public forum (). Thus the idea that 'people are running experiments to promote effective giving and EA ideas' is not a well-kept secret. If participants in our experiments and trials are aware of this, it may affect their choices and responses to treatments. This general set of problems is referred to in various ways, each capturing different aspects; see 'experimenter demand', 'desirability bias', 'arbitrary coherence/coherent arbitrariness', observer bias (?), etc.
Mitigating this, in our context, most of our experiments will be conducted in subtle ways (e.g., small but meaningful variations in EA-aligned home pages), and individuals will only see one of these (with variation by geography or by IP-linked cookies). Furthermore, we will conduct most of our experiments targeting non-EA-aligned audiences unlikely to read posts like this one. (People reading the EA forum post are probably ‘already converted’.)
(To be fleshed out in more detail)
Universe (population) of interest, representativeness
Design study to measure 'cheap' behavior like 'clicks' (easier to observe, quicker feedback) versus meaningful and long-run behavior (like donations and pledges)
attribution issues
attrition issues (also see the quantitative sections)
Choice of impact measure/metric (also see the quantitative sections)
Geographic blocks versus individuals How to block/stratify
See for a mainly theoretical discussion of this
for how to do split testing on Facebook, and the limits to traditional design given their setup
I added a few features I thought might be interesting or useful. Was anyone drawn in to pledge? When did each campaign start/end (double-check)? How many unique link clicks?
Discussion of blocking/randomizing treatments by post/zip code or other region, allowing us to more accurately tie treatments to ultimate outcomes
Measurement needs are varied and come with a variety of limitations, e.g., data availability, ad targeting restrictions, wide-ranging measurement objectives, budget availability, time constraints, etc.
Kerman et al, 2017
In many contexts, the route to a meaningful outcome (e.g., GWWC pledge) is a long one. Attribution is difficult. An individual may have been first influenced by (1) YouTube ad while seeing a video on her AppleTV, and then (2) by a friend's post on Facebook, and then finally moved to act (3) after having a conversation at a bar and (4) visiting the GWWC web site on her telephone.
The same individual may not (or may) be trackable through 'cookies' and 'pixels' but this is already very limited and imprecise, and is being made harder by new legislation.
"Geographic targeting" of individual treatments/trials/initiatives/ads may help better track, attribute, and yield inference about 'what works'. E.g., we might do a 'lift test':
select a balanced random set of US Zip codes for a particular repeated YouTube ad promoting GWWC, the "Treated group"
compare the rate of GWWC visits, email sign-ups, pledges, and donations in the next 6 months from these zip codes relative to all other zip codes (possibly throwing out, or finding a way to draw additional inference from, zip codes adjacent to the treated group).
We could also do multi-armed tests (of several types of ad or other treatment, with a similar setup as above)
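A minimal sketch of how the resulting zip-code-level data might be analyzed (everything here is simulated; 'signups' stands in for outcomes like email sign-ups over the following 6 months):

```r
# Simulated zip-code-level lift test with balanced random assignment.
set.seed(21)
geo <- data.frame(zip = sprintf("%05d", sample(10000:99999, 400)))
geo$treated <- sample(rep(0:1, each = 200))             # 200 treated, 200 control zips
geo$signups <- rpois(400, lambda = 2 + 0.5 * geo$treated)  # fake outcome counts
t.test(signups ~ treated, data = geo)                   # simple lift estimate
summary(glm(signups ~ treated, family = poisson, data = geo))  # rate-ratio version
```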
There are a few well-known and researched approaches: From Kerman et al, 2017 (emphasis added)
Geo experiments (Vaver and Koehler, 2011, 2012) meet a large range of measurement needs. They use non-overlapping geographic regions, or simply “geos,” that are randomly, or systematically, assigned to a control or treatment condition. Each region realizes its assigned treatment condition through the use of geo-targeted advertising. These experiments can be used to analyze the impact of advertising on any metric that can be collected at the geo level. Geo experiments are also privacy-friendly since the outcomes of interest are aggregated across each geographic region in the experiment. No individual user-level information is required for the “pure” geo experiments, although hybrid geo + user experiments have been developed as well (Ye et al., 2016). Matched market tests (see e.g., Gordon et al., 2016) are another specific form of geo experiments. They are widely used by marketing service providers to measure the impact of online advertising on offline sales. In these tests, geos are carefully selected and paired. This matching process is used instead of a randomized assignment of geos to treatment and control. Although these tests do not offer the protection of a randomization experiment against hidden biases, they are convenient and relatively inexpensive, since the testing typically uses a small subset of available geos. These tests often use time series data at the store level. Another matching step at the store level is used to generate a lift estimate and confidence interval.
Youtube ads
Facebook ads
USA
zip codes
Australia
We still may be able to make valuable inferences, under specified conditions, through 'difference in difference', 'event study', and 'time-based' approaches. We consider this in the next section: Difference in difference/'Time-based methods'
The main point
Facebook serves each ad variation to the people it thinks are most likely to click on it.
Thus, in comparing one ad variation to another... you may learn:
"Which variation performs best on the 'best audience for that variation' (according to Facebook)"
But you don't learn "which variation performs better than others on any single comparable audience."
Update 4 Oct 2022: We may have found a partial solution to this, with ads targeting 'Reach' rather than optimizing for other measures like 'clicks'. We are discussing this further and will report back.
Researchers are interested in running trials using Facebook ads. However, inference can be difficult. Facebook doesn't give you full control of who sees what version of an advertisement.
With A/B split testing etc.: they have their own algorithm, which presumably uses something like Thompson sampling to optimize for an outcome (clicks, or a targeted action on the linked site with a 'pixel'). Statistical inference is challenging with adaptive designs and reinforcement-learning mechanisms. As the procedure is not transparent, it is even more difficult to make statistical inferences about how one treatment performed relative to another.
Segmentation and composition of population: Facebook's ranking algorithm determines who sees an ad. I don't think you can turn this off.
We haven't found a way to be able to set it to "show all versions of an ad to comparable populations"
(And even if you could, it would be difficult for you to specifically describe "which population" your results pertain to.)
Abstract .... While effective, this geo-based regression (GBR) approach is less applicable, or not applicable at all, for situations in which few geographic units are available for testing (e.g., smaller countries, or subregions of larger countries). These situations also include the so-called matched market tests, which may compare the behavior of users in a single control region with the behavior of users in a single test region. To fill this gap, we have developed an analogous time-based regression (TBR) approach for analyzing geo experiments. This methodology predicts the time series of the counterfactual market response, allowing for direct estimation of the cumulative causal effect at the end of the experiment. In this paper we describe this model and evaluate its performance using simulation.
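To make the basic difference-in-differences idea concrete (the TBR approach above is a more sophisticated relative of this), here is a minimal simulated sketch; all names and numbers are made up:

```r
# Simulated geo-by-period data; the 'treated:post' interaction is the DiD estimate.
set.seed(11)
df <- expand.grid(geo = 1:40, period = 1:2)
df$treated <- as.integer(df$geo <= 20)   # half the geos get the campaign
df$post    <- as.integer(df$period == 2) # post-campaign period indicator
df$signups <- rpois(nrow(df), lambda = 5 + 2 * df$treated * df$post)
summary(lm(signups ~ treated * post, data = df))
```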
: How to design the 'content' of experiments and surveys to have internal validity and external generalizability
: How to set up trials to have comparable groups
: Adjusting the treatments and design as you learn, to 'get to the highest value in the end'
: How to make inferences from the data after you have it (and plan this in advance)
See
Rethink Priorities notes (some are works in progress)...
How many observations, how to assign treatments, etc.
Todo: Integrate further easy tools and guides, including those from Jamie Elsey
Drawing from Lakens' excellent resource:
You are considering a new and an old message.
Suppose you are a ‘believer’ … your prior (the light grey distribution in the figure) is that ‘this new message nearly always performs better than the control treatment’
Suppose you observe only 20 cases and the treatment performs better only half the time. You move to the top black line posterior. You put very little probability on the new message performing much better than the control.
Now suppose you have the ‘Baby prior’, and think all of the following ten things are equally likely
less than 10% of people rate the new message better than the control
10-20% of people rate the new message better than the control
…
… 50-60% of people rate the new message better than the control
...
90-100% of people rate the new message better than the control
You run tests on 20 people, and you get 15 people preferring the new message.
Now you update substantially. From some calculations (starting from Lakens' code: pbeta(0.65, aposterior, bposterior)), you put about an 80% posterior probability on the new message being preferred by at least 65% of the population (and only about a 1.5% probability on the control being better).
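A minimal version of that calculation in base R, assuming the 'baby prior' is approximated by a uniform Beta(1, 1):

```r
# Beta-binomial update: uniform Beta(1,1) prior; 15 of 20 prefer the new message.
a_post <- 1 + 15  # prior 'successes' + observed successes
b_post <- 1 + 5   # prior 'failures' + observed failures
1 - pbeta(0.65, a_post, b_post)  # P(at least 65% prefer the new message), ~0.80
pbeta(0.50, a_post, b_post)      # P(the control is actually better), small
```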
So if I really ‘am as uncertain as described in the example above’ about which of two messages are better (and by how much)...
... then even 20 randomly-selected people assessing both messages can be very informative. How often does this ‘strong information gain’ happen? Well, under the "baby prior", you would get information at least this informative in one direction or the other about half the time.
See ":"
The framework and R package seem very helpful. I (David Reinstein) am learning and trying to adapt it.
Dillon writes: I've run some very promising MTurk pilots using my adaptive experimentation software. Compared to traditional random assignment, it increases statistical power, identifies higher-value treatments, and results in more precise estimates of the effectiveness of top-performing treatments. From simulations, I estimate that the gains from adaptive experimentation are approximately equivalent to increasing your sample size by 2x-8x (depending on the distribution of effect sizes).
This would allow us to run studies like Eric Schwitzgebel + Fiery Cushman's study on philosophical arguments to increase charitable giving much more effectively
Dillon Bowen: at the end of the 3rd year of the Decision Processes PhD at Wharton.
Here is a stats package for estimating effect sizes in multi-armed experiments. https://dsbowen.gitlab.io/conditional-inference/
I just made a getting started video: Welcome to Hemlock - YouTube
...running experiments with many arms and winnowing out the 'best ones' to learn the most/best.
See: adaptive design, adaptive sampling, dynamic design, reinforcement learning, exploration sampling, Thompson sampling, Bayesian adaptive inference, multifactor experiment
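For intuition, a minimal Thompson-sampling sketch in R for discrete 'message arms' with Bernoulli outcomes and Beta(1, 1) priors (the conversion rates are made up):

```r
# Thompson sampling over three hypothetical message arms.
set.seed(7)
true_rates <- c(0.02, 0.03, 0.05)  # unknown in practice; made up here
succ <- fail <- rep(0, 3)
for (i in 1:5000) {
  draws  <- rbeta(3, 1 + succ, 1 + fail)  # one posterior draw per arm
  arm    <- which.max(draws)              # play the arm that looks best this round
  reward <- rbinom(1, 1, true_rates[arm])
  succ[arm] <- succ[arm] + reward
  fail[arm] <- fail[arm] + 1 - reward
}
succ + fail  # trials per arm: most effort concentrates on the best arm
```

Ex post, most observations end up on the best-performing arm, which is part of what makes traditional inference afterwards tricky (as discussed below).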
Discrete vs continuous: switches vs. knobs
In our case the options are mostly discrete: there are many 'knobs' to turn, though some are really 'switches'. The problem looks somewhat different for discrete vs. continuous treatments.
If we can order the different treatments (arms/knobs) as 'dimensions', we can infer more... We can do better by thinking of them as a 'multifactor experiment' rather than as several separate, unrelated dimensions.
"Model running in the background" trying to figure out ‘things about the effectiveness of the interventions you might use’
“Ex-post regret versus cumulative regret” … the latter suggests Thompson sampling. (Does Thompson sampling take into account the length of the future period?)
Ex-post … Use machine learning to consider which characteristics matter and how much they matter … although he doesn’t know of papers that have looked at this, but assumes there are adaptive designs that incorporate this.
Statistical inference can be challenging with adaptive designs, but this is a ripe area of research
Dillon: has a paper on traditional statistical inference after an adaptive design.
Goals 'what kinds of inference':
The arm you are using, relative to (? the average arm ?)
Which factors matter/joint distribution ….. Bayesian models
What to do with the data after you collect it (and what you should put in a pre-analysis-plan).
Notes from slack:
I’m finding some issues like this in analyzing rare events … not quite that rare, but still a few per thousand or a few per hundred.
I’m taking 2 statistical approaches to the analysis (discussion, code, and data in links):
Randomization inference (simulation) … for a sort of equivalence testing here (see the sketch below)
I think either of these could be ‘flipped around’ to be used for power calculation or ‘the Bayesian equivalent of power calculation’
My colleague Jamie Elsey has some expertise with the latter; we’re putting together our discussion HERE, although it’s mainly frequentist and not Bayesian ATM.
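To illustrate the randomization-inference approach mentioned above, a self-contained sketch with simulated rare-event data (counts are hypothetical: a few 'conversions' per thousand):

```r
# Randomization inference (permutation test) for a rare binary outcome.
set.seed(1)
n       <- 2000
treat   <- sample(rep(0:1, n / 2))
outcome <- rbinom(n, 1, 0.004 + 0.004 * treat)  # fake data with a small lift
obs_diff  <- mean(outcome[treat == 1]) - mean(outcome[treat == 0])
perm_diff <- replicate(1e4, {
  t <- sample(treat)                            # re-randomize treatment labels
  mean(outcome[t == 1]) - mean(outcome[t == 0])
})
mean(abs(perm_diff) >= abs(obs_diff))           # two-sided permutation p-value
```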
There are reasons 'some pre-registration' or at least 'declaring your intentions in advance' is worth doing even if you aren't aiming at scientific publication
https://gitlab.com/dsbowen/conditional-inference/-/blob/master/examples/bayes_primer.ipynb
Previous sections considered... 'How to get more people to care about '. 'How to get the "Einsteins" of the next generation interested in this.' And 'how do we introduce this to people?'
But, an equally-important concern may be... WHOM do we target? How do we do market profiling? Not just 'what do we present', but 'who do we present it to'
In this section, we cover the limited work that has been done on this, and the scope to do more.
Leander Rankwiler recently (17 Feb 2023) did a scoping exercise on this. See "Detecting affinity for the ideas of effective altruism on social media". This work focuses on "the rationale, literature research, and data collection", and comes to relatively negative conclusions ("it's much less valuable to pursue than previously assumed"). This particularly reflects concerns that doing, publicly reporting, and acting on this research to 'target promising groups' may do some harm (see fold).
He also sees many sources of (statistical) bias in any feasible analysis.
In the sections below, we present and link recent and ongoing direct work that may also be relevant and informative.
Consider a study where
EA groups are asked to voluntarily participate (with no direct compensation)
to report the 'time spent on each recruiting activity',
and to ask their fellows/members 'how did you hear about our group?'
Suppose this finds
per hour spent by the organizers, far fewer people report 'tabling' as the source, relative to 'a direct email'.
Should we interpret this as
'direct emails are a more efficient use of time than tabling, thus groups should spend less time doing tabling and more time sending emails?'
Maybe, but we should be careful; there are other explanations and interpretations we should delve into. Some of these could be partially addressed through survey design, others through careful analysis. Other 'causality' issues may require an experiment/trial/test to get at.
Random variation: With a small sample of groups, these numbers may be particularly high or low (for tabling, for emails, etc) by chance; the averages for a 'typical group' may turn out to be very different.
This is the standard issue of statistical inference about a population from a sample.
The issue of 'misrepresenting the population' tends to be worse with smaller samples (here small number of groups, and small numbers of observed outcomes in each group; e.g., only a few fellows)
However, 'as Bayesians know' you can still draw valuable decision-relevant inferences from small samples. IMO (Reinstein) the "problem of small samples" tends to be overstated because we mainly learn about statistics designed for a particular scientific frequentist approach.
Selection/selectivity: The groups that 'opted in' to be part of this survey may not be a 'random draw' from the population of relevant groups. It may represent more careful or more enthusiastic groups, perhaps groups that are particularly analytical and not so good socially, etc. If some of the 'fellows' within the groups don't complete the survey, this could add another 'selection bias'.
Attribution with multiple sources: “How did you hear about this program?” This could be interpreted in several ways, probably “how did you first hear”. But in marketing sometimes people hear about something multiple times, and it’s hard to know which of these are pivotal in getting them to take action. (We could probably do something to make this question a bit more informative.)
“Lift”: some people might have signed up anyways even without the activities identify as ‘how they heard about it’. Other people may have been harder to reach, and for the latter (e.g.) tabling ‘Spoke to us while we were tabling’ may be pivotal.
Diminishing returns/hard limits on some activities … e.g., there may be only so many professors (or students) to email; after a few hours of this, the returns are likely to fall.
Cost of hours: the cost of these activities may not be fully proportional to the time spent … e.g., writing to a professor may be mentally costly and may use up some social capital. On the other hand, tabling may be fun and social, and also generate interesting feedback (and other benefits that are harder to measure, like links with other groups also doing tabling).
Note that RP is not a 'part of this Market Testing team', but we want to coordinate with them and benefit from the survey and profiling work they are doing/have done. I try to map/link the space here.
Asks respondents to tick terms and people that they are familiar with (EA/non, real/rare/fake). If they have heard of EA, we follow-up with open-ended questions to detect actual understanding. We also ask about socio-demography and politics. Administered to a ‘national sample’.
(We will follow up with attitude surveys among those who have heard of EA.) We use Bayesian models to generate the posterior distributions of
share who know/understand EA within different groups,
weighted to be nationally representative (of each group).
Various survey projects ongoing
Developing measures of attitudes towards EA/Longtermism
Conducting large national surveys looking at predictors of these attitudes (including differences across groups)
Standard ‘message testing’ (what arguments/framings work best for outreach (including differences across groups)
Sample, Design, & Measures. We recruited a national online sample of 530 Americans. Participants read and reflected on an introduction to evidence based giving, and then completed our main outcomes of effective giving. Participants then completed a series of measures of their beliefs, behaviors, values, traits, sociodemographics, etc. The instrument, measures, and data are available upon request.
Was this a 'representative sample'? How were they recruited?
Note they 'read about EA first' ... perhaps making them vulnerable to demand effects?
DR: I've requested this data, but I think the authors are having trouble finding the time to dig this up
Primary Measures. To measure effective giving, we assessed several attitudes and behaviors; this summary presents results from a novel 7-item scale, the Support for Effective Giving Scale (SEGS) [α = .92], and an effective giving behavior allocation.
The items in SEGS assess general interest, desire to learn more, support for the movement, and willingness to share information with others, identify as an effective altruist, meet others who support the movement, and donate money based on effective giving principles. To approximate giving behavior, we presented participants with short descriptions of three causes (the Deworm the World Initiative, the Make-A-Wish Foundation, and a local high school choir) and had them allocate $100 between these groups and/or keeping it themselves.
Was the allocation purely hypothetical or incentivized in some way, perhaps 'one response was chosen'?
Secondary Measures.
To measure beliefs, behaviors, and traits of people who endorse effective giving, we employed measures of: perceived social norms, charitable donation beliefs and behaviors, self-perceptions, empathy quotient (EQ), empathic concern & personal distress (IRI), the five moral foundations (MFQ-20), the five-factor personality model (TIPI), goal & strategy maximization (MSS), updated cognitive reflection tests (CRT), sociodemographics (e.g., age, gender & racial identity, income), politics & religion, familiarity with 'the effective altruism' movement, and state residence
So far, the best overall model predicts 41% of the variance in support for effective giving.
Summarized in posts...
.... After participants read a general description of EA, they completed measures of their support for EA (e.g., attitudes and giving behaviors). Finally, participants answered a collection of questions measuring their beliefs, values, behaviors, demographic traits, and more.
The results suggest that the EA movement may be missing a much wider population of highly-engaged supporters. For example, not only were women more altruistic in general (a widely replicated finding), but they were also more supportive of EA specifically (even when controlling for generosity). And whites, atheists, and young people were no more likely to support EA than average. If anything, being black or Christian indicated a higher likelihood of supporting EA.
Moreover, the typical stereotype of the “EA personality” may be somewhat misguided. Many people – both within and outside the community – view EAs as cold, calculating types who use rationality to override their emotions—the sort of people who can easily ignore the beggar on the street. Yet the data suggest that the more empathetic someone is (in both cognition and affect), the more likely they are to support EA. Importantly, another key predictor was the psychological trait of ‘maximizing tendency,’ a desire to optimize for the best option when making decisions (rather than settle for something good enough).
RP has a remit and some funding to pursue this.
We focus on individuals who have taken the Giving What We Can Pledge: a pledge to donate at least 10% of your lifetime income to effective charities. In a global survey (N = 536) we examine cognitive and personality traits in Giving What We Can donors and compare them to country-matched controls. Compared to controls, Giving What We Can donors were better at identifying fearful faces, and more morally expansive. They were higher in actively open-minded thinking, need for cognition, and two subscales of utilitarianism (impartial beneficence and instrumental harm), but lower in maximizing tendency (a tendency to search for an optimal outcome). We found no differences between Giving What We Can donors and the control sample for empathy and compassion, and results for social dominance orientation were inconsistent across analyses.
Includes real donation choice question(s), rich survey and psychometric data, including 'mind in the eyes' empathy judgements
Students and nonstudents (local town population)
Consider the Lown and XX paper... MITE (mind-in-the-eyes) empathy moderates the impact of political attitudes, or something... dissonance resolution. Feldman, Wronski, Lown: https://onlinelibrary.wiley.com/doi/full/10.1111/pops.12620
MTurk + Qualtrics
Ended up manipulating whether aid was government or charity, and domestic vs. foreign; thought those would be moderated by MITE depending on ideology/attitude? Also consider 'Empathy Regulation and Close-Mindedness' by Leonie Huddy, Stanley Feldman, Romeo Gray, Julie Wronski, Patrick Lown, and Elizabeth Connors. Also asked about domestic welfare and foreign aid attitudes...
sample fairly large ... 1100 or so?
A brief outline and links to what has been done across organizations
This is possibly the best meta-resource as well as a source of original research
Our Animals, Food and Technology (AFT) survey tracks attitudes towards animal farming and animal product alternatives in the US. In 2020, as in the 2017 and 2019 iterations, we found significant opposition to various aspects of the animal farming industry, with a majority of people reporting discomfort with the industry, and strong support for a range of quite radical policy changes, such as banning slaughterhouses. The trend in attitudes between 2017 and 2020 is relatively stable, though slightly negative (not statistically significant). Notably, the number of people who consider animal farming to be one of the most important social issues fell from 2017 to 2019, and remained at this lower level in 2020.
Some replication work on the above
Various work including
DR: I'm awaiting permission to share the list.
#wild-animal-welfare-suffering-attitudes (Rethink Priorities, in progress)
See discussion in:
NBER Working Paper (2019/2021), Dietmar Fehr, Johanna Mollerstrom, and Ricardo Perez-Truglia
Attitudes towards global redistribution
"De-biasing" intervention (how rich participants are relative to Germans, how rich Germany is globally)
Tied to the German Socio-Economic Panel (SOEP), a representative longitudinal study of German households. The SOEP contains an innovation sample (SOEP-IS) allowing researchers to implement tailor-made survey experiments.
a two-year, face-to-face survey experiment on a representative sample of Germans. We measure how individuals form perceptions of their ranks in the national and global income distributions, and how those perceptions relate to their national and global policy preferences. [Their main result]: We find that Germans systematically underestimate their true place in the world’s income distribution, but that correcting those misperceptions does not affect their support for policies related to global inequality.
They ask about support for global redistribution, international aid institutions, globalization, immigration, and more, and have an incentivized giving choice. These are (arguably) measures of support for some EA behaviors/attitudes.
I suspect that this data could be tied to a variety of rich (personality? demographic?) measures in the SOEP. Could we build a predictive model for actual EA/effective-giving targeting in other, related contexts? If so, let's focus on features we are likely to observe in those other contexts (or at least likely to have proxies for). And watch out for 'leakage' (not sure I'm using the term correctly): if the model leans on a feature that's missing in the deployment context, that could ruin the predictive power of the whole model.
Causal interpretations (very challenging)?
Here 'nearly immutable characteristics' (like ethnicity, age, parental background, maybe some deep psych traits) might be a bit more convincing
*Descriptive* (whatever we mean by that)
Some things like "previous donations" might act as colliders or 'confounds' (I'm a bit vague here) when interpreting other associations
I tried to tackle some of this stuff (incompletely) in analyzing the EA survey donations
See Followup with Thomas Ptashnik in next section
Google Analytics (and other tools) collects or predicts the demographics and market profile of traffic on sites like givingwhatwecan.org.
This gives us a sense of:
'existing demographics' of those committed and/or interested ... in ways not picked up in (e.g.) the EA survey
who is 'interested but not committed' ... possibly low-hanging fruit
Differentiating our work from previous research in psychology and economics, we write down our basic consensus and knowledge
Existing theories; existing effective-altruistic actions
What are the problems and questions we are dealing with?
What questions do we have? What challenges are we facing?
What previous work has been done to investigate these questions?
What evidence is there so far on these questions?
What are the relevant theories of behavior for this work?
An overall characterization of widely-cited and 'conventional-wisdom' evidence on the background drivers of, and barriers to, effective charitable giving
We focus on the 'barriers' or 'hurdles to giving effectively' among individuals who already engage in some charitable giving and other-regarding acts. Loosely, a donor would need to "jump over all of these hurdles" and cross each of these barriers in order to be giving effectively.
A conceptual breakdown of barriers:
Base values may be (non) utilitarian: People are optimising their own 'X', which does not coincide with impact --> no puzzle?
Avoiding information about effectiveness: Even if people want to optimise impact, they may specifically dislike and avoid gathering information about effectiveness in a charitable giving setting
Presenting effectiveness information may backfire: E.g., if it switches off the 'generous part of the brain', gets people to think in a more 'market' mode, or makes people indecisive
Judgement/cognition failures, quantitative biases, information failure: People try but fail to optimize, and/or have persistent incorrect beliefs
Emotion overrides cognition: our brain serves two masters, and the resulting decisions are not consistent
Identity and signalling: Effectiveness in giving clashes with our self-image/self-beliefs, or with how we want to appear to others
Systemic factors (and inertia): social systems leading to pressure and incentives from others to give to local or less-effective causes. Even if impact is a goal these systems take a long time to adjust.
I present and discuss this breakdown, a more practical breakdown, and specific examples in each category in the Barriers breakdown part of the synthesis. I go into further detail and (work to) present evidence on each of these in later sections of that site. (I plan to go through that work and extract only the key, most practical elements.)
These barriers are also mapped, and connected with tools and evidence in this (other) Airtable.
See 'barriers' View
Motivating our project; feel free to be brief and link external content. "How little we know"
Draw from and link EA Barriers Project (Reinstein) on "Presenting the Puzzle"
From EA Barriers Project:
As Burum, Nowak, and Hoffman (2020) state: “We donate billions of dollars to charities each year, yet much of our giving is ineffective. Why are we motivated to give, but not motivated to give effectively?”
... raises two related questions:
I. “Why don’t we give more to the most effective charities and to those most in need?” and
II. “Why are we not more efficient with our giving choices?”
To address this, we must understand what drives giving choices, and how people react to the presentation of charity-effectiveness information
There are two related and largely unresolved puzzles:
Why are people not more generous with the most highly effective causes? and
When they give to charity why do they not choose more effective charities?
There is some evidence on this but it is far from definitive. We do not expect a single answer to these questions; a set of beliefs, biases, preferences, and underlying circumstances may be driving this. We would like to understand which of these are robustly supported by the evidence, and to get a sense of how important each is in driving the absence of effective giving. There has been only a limited amount of research into this, and it has not been systematic, coordinated, nor heavily funded.
We seek to understand this because we believe there is potential to change attitudes, beliefs, and actions (primarily charitable giving, but also political and voting behaviour and workplace/career choices). Different charitable appeals, information interventions, and approaches may substantially change people's charity choices. We see potential for changing the "domain" of causes chosen (e.g., international versus US domestic) as well as the effectiveness of the charities chosen within these categories. (However, we have some disagreement over the relative potential for either of these.)
Our main ‘policy’ audience includes both effective nonprofit organisations and ‘effective altruists’. The EA movement is highly-motivated, growing, and gaining funding. However, it represents a niche audience: the ‘hyper-analytic but morally-scrupulous’. EA organisations have focused on identifying effective causes and career paths, but have pursued neither extensive outreach nor ‘market research’ on a larger audience (see Charity Science, Gates Foundation/Ideas42).
Academic work:
@loewensteinScarecrowTinMan2007
introduction to @Berman2018; @baron2011heuristics
introduction to Caviola et al: "on how both incorrect beliefs and preferences for ineffective charities contribute to ineffective giving"
@greenhalghSystematicReviewBarriers2020 (qualitative, focuses on largest philanthropists only)
Non-academic/unpublished:
'Behavior and Charitable Giving' (Ideas42, 2016),
'The Psychology of Effective Altruism' (Miller, 2016, slides only).
Overall, these have not been detailed or systematic. While Caviola et al. (2021) is probably the strongest, most relevant, and most insightful (and has some connection to the structure presented in the 'Barriers' project), it does not drill deeply into the strength of the evidence or the relative importance of each factor. However, this may stem from the small amount of available evidence to survey.
Ideas42 wrote (ibid):
We did not find many field-based, experimental studies on the factors that encourage people to choose thoughtfully among charities or to plan ahead to give.
A working definition is provided and discussed HERE. I (Reinstein) provide a critical discussion of some standard economic models of giving in this context HERE.
See, and coalesce ideas from the links below (and more)
Here, we propose methods for grouping, organizing, and categorizing these tools for motivating effective giving and action:
Theoretical frameworks --> tool categories
Certain outcomes are relevant to some tools only
Atheoretical 'trying different marketing colors' and tools that push several buttons
As well as:
identifiable victims vs statistical (etc), (DR: Some groups have principled objections to presenting identified victims; which ones do not?)
emotional vs factual/statements,
videos v images v text,
positive v negative valence,
opportunity v obligation,
cause areas (Not sure what exactly this meant)
different framings for specific EA orgs
(e.g., for GWWC they want to test 1% v 10% pledge asks,
for CES they want to test saving-democracy v representation messaging,
for Humane League they want to test different types of animals, etc)
I very briefly discuss particular tools in the Bookdown:
A more organized categorization of barriers can be found in an airtable database (view below), linked to tables of specific tools, theories, barriers, etc. (This can be accessed HERE; it is not the airtable for this project, although we link in some content).
The above table links a set of specific tools, evaluating their relevance for effective giving:
We are considering a narrower set of tools (in a different airtable, the airtable for the current project...).
Thomas Ptashnik is a Psychology PhD student interested in working on this with us. He is using the SOEP-Core data and is familiar with SEM/latent-variable methods.
These items correspond to the SOEP-IS surveys, which can be found here (use item names, like Q132, to search quickly):
2017: https://paneldata.org/soep-is/inst/soep-is-2017-f
2018: https://paneldata.org/soep-is/inst/soep-is-2018-f
These links also mention that individuals with preexisting data access can apply for expanded access. I [Thomas] have access to SOEP-Core version 36 (1984-2020 surveys).
I think we might see positive responses to the Fehr et al. questions and donation choices as 'necessary but not sufficient' for people to become effective givers or even EAs. If (especially in spite of the de-biasing) people still don’t support international redistribution, international orgs, and don’t opt to give from the lottery earnings to the global poor person … I think they are very unlikely to be susceptible to an EA or effective giving (e.g., GiveWell) appeal. (See further discussion and debate on this below.) (But, as a check on this, it might be good to try to ask these same questions on a sample of actual EAs and effective givers, and a comparison group! #surveyexperiments)
I envision two related projects on the same data: 1. Building a 'portable' model for prediction to aid targeting and 2. Building a 'deeper' model to aid understanding
I’m hoping that looking for predictors of (or ‘coherent factors explaining’) these responses in the SOEP data would prove useful for organizations like GWWC to consider ‘which groups to target in doing outreach’ (and perhaps especially ‘which groups to rule out’)
I hope we can do a sort of ‘leak-proof validated predictive ML model for this’
perhaps especially relevant for the German/EU context
Thomas: After talking it over with some colleagues, I think this approach is our best bet in terms of developing something with practical utility that still has a chance of being published in an academic journal. This is not my area of expertise, but if I remember correctly you have some R code already written. So I should quickly be able to put something together.
2. An exploratory model to help understand key factors that might be driving EA-adjacent attitudes and behaviors, offering insight into ‘what drives people towards or away from this mindset’.
Here we could engage the richer set of SOEP variables and consider latent factors
Red team:
I guess it will be interesting to find out through your analysis:
Are these measures predicted by plain altruism + cosmopolitanism (which a priori we might say are more likely to be connected to EA)
Or are these measures predicted by egalitarianism + belief we should repay the third world / belief the rich should help the poor (which seem like they may be less closely connected with EA)*
*of course EAs are overwhelmingly liberal/egalitarian, but liberal/egalitarians are overwhelmingly not EA, which I think is an important complication"
RT2: Is there any way you can think of to get at EA more as a style of thinking/justification of choices, as opposed to the (possibly highly context-dependent) choices themselves? Some kind of relevant psychometric measure is probably possible, e.g., need for cognition or something similar.
RT1: One option: create or use measures of maximising + cosmopolitanism + altruism (or of maximising cosmopolitan altruism) ... maybe we are getting at an 'EA style of thinking'. And if we can show that these more abstract measures are connected to behavioural or otherwise more concrete measures of EA inclination (whether that's decisions/choices, signing up for a mailing list, or something else), then it does seem reasonable to think of these as capturing EA inclination.
The risk otherwise is that theoretically we think these 3 things correspond to EA thinking... and actually they don't ...
Consider NFC, IRT, Rationality Quotient etc. as predictors of EA-inclination
DR: My conception was maximizing + cosmopolitanism + altruism + willing-to-sacrifice/non-competitiveness. I think many people think “I should work to help humanity” but also think ‘yeah, but I’ll be a sucker if I give to charity while my neighbor gets a new swimming pool and Hawaii holiday…’ That’s where “willing-to-sacrifice/non-competitiveness” comes in, in my mind. (It needs a better name?) I think this last trait is more important for effective giving than for EA-intellectual-engagement… and it may not be important at all for the latter. (A rough sketch of constructing such a composite measure follows below.)
Thomas: In psychology, altruism captures this notion. Prosociality is a concept of helping others that allows for self-concern, while altruism is distinguished by a purer form of selflessness (I have a paper under review that goes into detail about this, which I can privately share... argh, the closed doors of academia). Fortunately, altruism is widely studied and there are even a few items that capture it in the SOEP dataset.
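A minimal sketch of the composite-measure idea discussed above: z-score each candidate subscale, average them into an 'EA-inclination' index, and check the index against a concrete behavioral measure. All column names here are hypothetical, not actual SOEP variables:

```python
import pandas as pd

def composite_index(df: pd.DataFrame, cols: list[str]) -> pd.Series:
    """Equal-weighted average of z-scored subscales (a simple first pass)."""
    z = (df[cols] - df[cols].mean()) / df[cols].std(ddof=0)
    return z.mean(axis=1)

# Hypothetical subscale and outcome columns:
# subscales = ["maximizing", "cosmopolitanism", "altruism", "willing_to_sacrifice"]
# df["ea_inclination"] = composite_index(df, subscales)
# Validate against a behavioral measure (e.g., EUR given in the Fehr et al. choice):
# print(df[["ea_inclination", "eur_given"]].corr())
```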
(DR ideas)
IMO it would be nice to have some meaningful behavioral (incentivized) measures on top of the ‘psych’ ones. The ‘donation to the very poor’ measure in Fehr et al. gets at this a bit … although it's a pretty small probabilistic sacrifice. And I suspect it measures all three of the above except maximizing. And I don’t think these things are all separable, so I think that the fact that it measures ‘altruism and willingness to sacrifice in a cosmopolitan-relevant context’ is good.
It would also be pretty nice to have a behavioral/incentivized measure of ‘maximizing in an altruistic context’… If Fehr et al. had asked them to (e.g.) allocate giving among a German poor person, an African poor person, and themselves, this might have been a decent measure.
(We have this choice in some other contexts, though… not as rich data, but maybe worth digging into.) Why might that choice have been better (in some ways) than a hypothetical choice? Because I imagine in a hypothetical choice some people would think, “OK, they obviously want me to say ‘support the poor person in Africa’, and I see the maximization arguments, so, fine.” But when it involves real money, and even their own money, I expect that for some people other motives will outweigh the ‘maximizing motive’: “wait, I’d rather keep the money than give it to an African who will waste it”; “wait, if this is real, I’d rather help someone local”.
DR: See sidebar comments
Lasso regression to identify the most salient cluster [DR: how is this defined?] of predictors for effective giving
I will use k-fold cross-validation to compare a lasso model with ridge regression and OLS to confirm it is the best method for handling our data. [DR: 'best' in what sense? I recommend the elastic-net approach if possible.]
To start, I’m just considering the 2017 survey and the control group (i.e., those who weren’t notified of their position in the national and global income distributions; ~700 individuals). We can expand to the 2018 survey and the treatment group in future analyses using the same method (although some items may not be included across surveys).
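A minimal sketch of this comparison (lasso vs. ridge vs. OLS, plus the suggested elastic net), using 5-fold cross-validation with standardization fitted inside each fold via a pipeline, which also guards against the 'leakage' worry raised earlier. The data here are synthetic stand-ins for the ~700 control-group observations, and the alphas are illustrative, not tuned:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso, Ridge, ElasticNet
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# X: predictor matrix (~700 respondents x features); y: e.g., EUR given in the 50 EUR choice
rng = np.random.default_rng(0)
X = rng.normal(size=(700, 30))
y = 2.0 * X[:, 0] + rng.normal(size=700)  # synthetic stand-in outcome

cv = KFold(n_splits=5, shuffle=True, random_state=0)
models = {
    "ols": LinearRegression(),
    "lasso": Lasso(alpha=0.1),
    "ridge": Ridge(alpha=1.0),
    "elastic_net": ElasticNet(alpha=0.1, l1_ratio=0.5),
}
for name, model in models.items():
    # Scaling inside the pipeline: each fold is standardized on training data only,
    # so no test-fold information leaks into the fit
    pipe = make_pipeline(StandardScaler(), model)
    scores = cross_val_score(pipe, X, y, cv=cv, scoring="r2")
    print(f"{name}: mean CV R^2 = {scores.mean():.3f}")
```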
Q280 and 281 in the SOEP-IS dataset developed by Fehr et al. (2019)
“You were paired with another household in Kenya or Uganda. This household belongs to the poorest 10 percent of households worldwide. Now, you have 50 EUR at your disposal and can split this amount between the other household and you in any way you want. If this task is selected for payout, you will receive the amount you decided to keep at the end of the interview. The amount you want to give the other household will be given in full to the other household (without transaction costs) at the end of the field period by Heidelberg University via a charitable organization. In full means that every given euro will be received by the other household 1:1. A leaflet with information about the donations will be given to you after you have made your decision. I ask you to make this decision alone now.”
“How much of the 50 EUR do you want to keep and how much do you want to give the other household?”
2017 survey questions: https://paneldata.org/soep-is/inst/soep-is-2017-f
Below I list variables in terms of the intended construct I’m trying to get at and the proxy measures that are available within the SOEP dataset.
[DR: I think 'previous failure to find significant effects' shouldn't be a reason to exclude!]
Variables held constant by the survey design (see Bekkers & Wiepking 2007 for detailed explanation): Solicitation, benefits, reputation, and efficacy.
Religious involvement*
One of the most studied variables in philanthropic studies. However, a large body of research finds that religious involvement is not related (or is even inversely related) to secular giving (Brooks, 2005; Lyons & Nivison-Smith, 2006; Lyons & Passey, 2005). Still, given its prominence (and the fact that there are religious EA groups), it is worth including in our analysis.
“Do you belong to a church or religious group?”
-----------------
“What church or religious group do you belong to?”
Level of education*
Has been found to have a positive relationship with secular giving (Yen, 2002) and with more EA-aligned giving (e.g., development aid versus emergency aid; Srnka et al., 2003); there are conflicting results on whether education impacts the amount donated (cf. Schervish & Havens, 1997; Brooks, 2002).
“What type of vocational training or university degree did you receive?”
Field of study*
A handful of studies have found graduates of different fields to be differentially generous, although which groups are at the top is inconclusive (cf. Bekkers & De Graaf, 2006; Belfield & Beney, 2000).
Not available for SOEP-IS
Income*
Higher-income households donate higher amounts than lower-income ones; however, the relationship with discretionary income is complex and unresolved (McClelland & Brooks, 2004). Income elasticity has been shown to be a salient predictor (Brooks, 2005), but for our purposes general net income seems the most sensible, since this is information EA organizations might be able to obtain or estimate.
“How satisfied are you with your household income?”
----------
----------
Age*
Unclear relationship: generally, giving appears to increase over time and level off around retirement, but this relationship is highly dependent on covariates such as church attendance, number of children, and marital status.
Should be available. I’m waiting for confirmation.
Number of children*
Positively related to philanthropy in most studies, but the age of the children may influence the direction and magnitude of the effect, specifically when they are younger than 14 (Okten & Osili, 2004) and 18 (Okunade & Berl, 1997).
“According to ‘My Infratest’, these are the children in your household that were born in 2001 or later. Please state whether these children still live in your household.”
----------
…accompanied by companion question: “Do more children live in your household which were born in 2001 or later?”
Marital status*
Mostly found to be positively related to giving, although a number of studies finding null effects (Apinunmahakul & Devlin, 2004; Carroll et al., 2006) call into question the magnitude of this effect.
“What is your family status?”
Employment*
The employed generally donate more than the unemployed (Chang, 2005a&b); those who work more (days and hours) donate more (Bekkers, 2004; Yamauchi & Yokoyama, 2005); retirees are highly charitable; self-employed are less generous (Carroll et al., 2006); and public service employees are more likely to engage in philanthropy than for-profit workers (Houston, 2006).
…could confirm officially unemployed: “Are you registered as unemployed at the Employment Office?”
“What is your current occupational status as a self-employed?”
…closest question I could find that gets at something other than for-profit work: “Do you work for a public sector employer?”
Gender*
Mixed findings in general, and no effect found when looking at one-person households (Andreoni et al., 2003). Still, given the ubiquity of this variable, it is sensible to include it in the model even though I have little faith it will be significant.
Should be available. I’m waiting for confirmation.
Race*
Caucasians generally give more, but this finding is tempered by the cause (non-whites donate more to the poor and religious organizations; Brooks, 2004; Brown & Ferris, 2007; Smith & Sikkink, 1998).
Should be available. I’m waiting for confirmation.
Parental background*
Higher levels of parental education, parental religious involvement, and parental volunteering in the past are related to higher amounts currently donated by children (Bekkers, 2005a), while current parental income and church attendance also predict giving (Lunn et al., 2001; Marr et al., 2005).
I thought a proxy for parents' occupational prestige might be a salient predictor. Questions 496-502 cover the mother's background and have the exact same wording.
Questions split depending on occupation and all contain the header: “What was your father’s occupational status as…”
“A self-employed person?”
“A civil servant?”
“A white-collar worker?”
“A blue-collar worker?”
“What type of school leaving certificate did your father attain?”
“Did your father complete vocational training or a university degree?”
Personality*
Donations have been found to increase with emotional stability and extraversion (Bekkers, 2006b), as well as openness to experience (Levy et al., 2002). General social trust has also been found to be a salient predictor (Brooks, 2005; Micklewright & Schnepf, 2007). Empathy has been found to be related to donations (Bekkers & Wilhelm, 2006), as has altruism.
Big Five Personality traits:
Agreeableness: “is considerate and kind to others”
Openness to experience: “is eager for knowledge”
The self-control scale. Sample item: “I am good at resisting temptation.” 10-item scale split between two links below.
Cognitive ability*
Persons with higher verbal scores (Bekkers & De Graaf, 2006), IQ (Millet & Dewitte, 2007), GPA (Marr et al., 2005), and ability to think in abstract terms (Levy et al., 2002) donate more.
Innovation exercise to assess emotional intelligence.
“What emotion was shown by the individual? For every emotion, please rate how strongly you perceived it. If you saw a group, please rate the emotion of the individual in the middle.”
For questions assessing quantitative skills (probabilities):
“Out of 1,000 people in a small town 500 are members of a choir. Out of these 500 members in the choir 100 are men. Out of the 500 inhabitants that are not in the choir 300 are men. What is the probability that a randomly drawn man is a member of the choir? Please indicate the probability in percent.”
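For reference, the worked answer (a note on the numbers given, not part of the instrument): of the 1,000 townspeople, 100 + 300 = 400 are men, of whom 100 are in the choir, so

$$P(\text{choir} \mid \text{man}) = \frac{100}{100 + 300} = \frac{100}{400} = 25\%$$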
Items 888-928 assess the ability to do expected utility calculations:
“Please imagine the following situation: You have the choice between a safe payment and a lottery. In detail: Do you prefer a 50% opportunity to win 300 Euro while you do not win anything by 50% or a safe payment of 160 Euro.”
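For reference, the risk-neutral benchmark for this item (my note, not part of the instrument):

$$E[\text{lottery}] = 0.5 \times 300 + 0.5 \times 0 = 150 \text{ EUR} < 160 \text{ EUR (safe payment)}$$

So a risk-neutral (or risk-averse) respondent should take the safe payment; preferring the lottery here indicates risk-seeking over this gamble.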
Quantitative skills:
“Now answer another question within 20 seconds. Continue the multiplication tables of the base 17 as far as possible. Starting with 17, 34, etc. The time is running - now.”
Context*
Donations are influenced by behavior of coworkers in the same salary quartile (positive; Carman, 2006), income inequality (negative; Okten & Osili, 2004), individualistic cultures (positive; Kemmelmeier et al., 2006), and the stock market (positive; Drezner, 2006).
Stock market optimism: “Initially we focus on the next year (next 12 months). Do you expect the DAX [German blue-chip index] to show rather profit or loss compared to the current value?”
Numeric version: “Expressed in numbers: What [Profit/Loss] do you expect for the next year overall in percent?”
This same question stem of stock market optimism is used for items about the next two, ten, and thirty years
Occupational prestige*
Generally, positively related to donations (Carroll, McCarthy, & Newman, 2006).
Current occupation (open question):
----------
Occupation (answer choices included):
----------
Each occupation choice is then further refined:
Blue-collar worker
White-collar worker
Civil servant
Apprentice/intern
Self-employed
Political orientation*
Previously, no differences were found for secular donations (Brooks, 2005), but Fehr et al. (2019: 26) find that “for right-of-center respondents, there are indications that higher national relative income is related both correlationally and causally to more giving to poor Germans and Kenyans.”
Item designed by Fehr et al. (2019):
“In politics people often talk about ‘left’ and ‘right’ to mark different political attitudes. If you think about your own political attitude: Where would you place yourself?”
Locus of control*
Persons with an internal locus of control are more likely to engage in philanthropy and other formal helping behaviors (Amato, 1985).
Ten item scale with the stem: “The following statements describe different attitudes towards life and the future. To which degree do you personally agree with the individual statements?”
Health*
People in better health donate more (Bekkers 2006b, Bekkers & De Graaf, 2006).
“How would you describe your current health?”
“How satisfied are you with your health?”
Mood*
Positive affect facilitates giving; negative moods may also facilitate giving in specific circumstances, but this is conditional on many factors (e.g., when helping involves minimal barriers, and when people are prompted to think about the negative feelings that would result from not helping; Cunningham et al., 1980; Weyant, 1978).
Short scale of emotions (angry, afraid, happy, sad):
“Thinking back on the past four weeks, please state how often you have experienced each of the following feelings very rarely, rarely, occasionally, often, or very often. How often have you felt...”
Values*
Endorsement of prosocial values has a positive association with charitable giving. This is also true of individuals who are less materialistic (Sargeant et al., 2000) and who care about justice (Todd & Lawson, 1999).
Questions 172-175 on justice. For example, the stem “To begin with it is about situations which result in others advantage and your disadvantage, because you were penalized, exploited or treated unfair. To what extent do you agree with the following statements?” Followed by “It makes me angry when other are undeservingly better off than me.”
----------
Prosocial work values, particularly of interest: “Socially responsible and important work” and “Having much influence.”
Previous donations*
Charitable giving is to some extent habitual behavior (Barrett, 1991; Barrett et al., 1997).
Not available for SOEP-IS
Optimism
The belief that the future could be better might provide motivation to work towards making it better.
“When you think about the future, are you…”
Likelihood of events (e.g., financially successful, not get any serious illness, successful at work, content in general) happening compared to other people the same age and gender.
Life satisfaction
Spending money on others has been shown to have a consistent, causal impact on well-being (Aknin, Barrington-Leigh, Dunn, Helliwell, Biswas-Diener, Kemeza, Nyende, Ashton-James, & Norton, 2010). “One possibility is reverse causality, that is, that those who are inherently happier by nature are also more likely to help individuals” (Moynihan, DeLeire, & Enami, 2015).
“In conclusion, we would like to ask you about your satisfaction with your life in general. How satisfied are you with your life, all things considered?”
Risk propensity
Cluelessness has been cited as a case against longtermism (Greaves & MacAskill, 2021). Thus, individuals who are predisposed to EA but are risk-averse may be more likely to make global health and development donations.
Stem: “What do you think about yourself: How prepared to take risks are you in general?”
“not ready to take risk at all ... ready to take risk”
“What did you think of when you made your estimate (i.e., the value) regarding your preparedness to take risks?”
DR comments:
A very interesting list of features
were these all asked before the charity questions? (I'm worried about reverse causality otherwise)
maybe remove 'unavailable' rows for space
We should discuss how the fitted model will be used and interpreted ... maybe identifying a few collections of useful subsets.
DR notes on 15 Dec 2021 meeting with
“How satisfied are you with your personal income?”
“I earned [net income]”
“What do you think is your monthly gross salary in one year?”
Explain how to add content; embedding; groups vs pages vs subpages; how we're organizing it; how/who to join/invite; payment/cost; the link with git/GitHub (for tech people); formatting tweaks.
Rather than chains of disconnected emails and many unlinked Google docs, I (David Reinstein) thought it would be better to organize our project with this well-structured format.
This version is currently PUBLIC but unlisted. It doesn't contain information on our trials or marketing activities (as of 18 Jan 2022), but we hope to be adding and integrating some details soon. We hope to make most of this public in due time, in line with information-sharing and open science.
"Groups" can hold multiple pages and pages can have sub-pages. But groups cannot have subgroups and the groups have no direct link (while pages do). (In the 'git repo' groups seem to be represented by folders).
If you have 'write (Editor) access' ....
Update: as of 15 Oct 2021 Gitbook has changed its protocols. You now need to
click the icon in the upper right to 'start a change request',
and then 'submit' this request when you are ready (ideally with a brief, informative message explaining what you have done).
Give it a try. Once you 'submit', you, or someone else can 'merge' it in.
In newly created blocks/elements, "command-slash" (on Mac) brings up a lot of cool options (scroll down)
Typing the "@" symbol offers a quick way to link other pages in this book
If you have the Administrator status, you can merge in your own, or others' changes.
What if I get a 'conflict'? If two people edit simultaneously and both try to merge in their changes, this can happen. It should be simple enough to resolve: find the icon indicating a conflict in the outline bar (that arrow/triangle thing), go to that section or sections, and choose which version you want to keep.
This Gitbook is connected to the private github-hosted repo here:
It 'backs up' nicely to a set of easy-to-follow markdown files and folders. If you prefer to work offline in nice 'raw text formats' (rather than via the web interface), you should be able to edit those files in any interface and push/merge the content in (if you are familiar with git and GitHub). The markdown and project-organization syntax is a little distinct from others I've used, such as Rmd/bookdown.
The folders have meaning for the structure of sections, I think, but the SUMMARY.md file seems to govern most of it.
There is a particular dash-separated 'description' section at the top of each .md file. And there are some special code elements, like:
{% ="URL HERE" %}
<div data-gb-custom-block data-tag="hint" data-style='info'>
Hint content here
</div>
Slack group (see esp. the effective_giving_team channel)
Airtable, Slack, etc.
Airtable is an online database that is user-friendly and social. We are using the airtable "GWWC+ testing/trial ideas" (ask for edit access) to keep a simple listing of key elements and structured information; in conjunction with this Gitbook.
The first table in the airtable (picture below) explains all the other tables
A good way of starting with Airtable/databases is to think:
These are just a bunch of spreadsheets or individual ‘data sets’; I’ll treat them as separate for now
Nice, it’s a bit easier to quickly add entries if I choose single- or multi-select field types, or checkboxes
Hey look, if I make this a “Link” field type I can easily add rows from sheet B into sheet A, that’s cool!
I can also ‘create new rows in B while adding them to A’
Cool, sheet B now has a column indicating where it has been entered into sheet A
Hmm, sheet A has stuff on it that is not relevant for our partner; let me create a simpler ‘view’ of sheet A, filtering out rows and hiding columns that are not relevant to our partner
How can we best present information about effectiveness (dollars-per-impact, impact-per-dollar, GiveWell ratings, etc.)?
See discussions of previous work:
See giveifyouwin.org
Conditional pledges (‘Give if you Win’), esp.
Work with EA orgs at universities and in companies; possibly working with 80,000 Hours and/or Founders Pledge; give the opportunity for career guidance
Control: Ask about career goal/target, follow up in 1 year, ask for pledge then
Treatment: Same but ask initially for conditional pledge (‘if you attain the goal’)
See project (hope to scale up evidence from smaller contexts)
DR: How do people respond to animal advocacy ads, and what appeals to them more? XXX redacted
Lots of things like this on Faunalytics
There was no clear trend showing which tactics were most effective. Among the top ten, some used writing, pictures or virtual reality to show the suffering of animals on factory farms. Others added information about the health and environmental impacts of factory farming. Still others gave specific suggestions on how to eat less meat or discussed laws to improve how animals are treated on farms.
There was no clear trend showing which psychological strategies were most effective, although many different strategies were employed. Tactics often employed descriptions of how eating meat is becoming less normal, the emotions of farm animals, individual victims of factory farming, comparisons between farm animals and pets, and specific suggestions for how to eat less meat.
The journal-published version:
This is a re-analysis of one of the earlier studies the community has done on messaging:
Some other older research (XXX redacted):
DR: Thanks. But I guess this stuff was mainly trying to appeal to the general public. XXX REDACTED I think the group that is being targeted is rather different.
Does anything in the above seem specifically relevant to this, like work trying to get people who are already interested in animals to pursue it more seriously?
A description of our most promising academic paper ideas based on the opportunities we have so far
Why list these here? Identifying specific hypotheses (for an academic paper) will help:
Generate ideas for non-profits to test
Avoid indecision about what ideas to try
Keep those of us with academic publishing incentives motivated
In sum, to give us some direction and focus.
Current ideas for papers are summarized in the embedded document. The document is divided into:
one section for each idea we have fleshed out so far (as of Oct 23, this includes):
shared community insight framing
Donor responses to “quantitative ‘per dollar’ impact information” and presentations of this (DR added)
warm glow (DR: 'internal reward') from effectiveness
a list of others' ideas and general themes we have yet to flesh out
some classic ideas we could test (as an alternative to developing novel hypotheses)
Here's the doc:
Formatting note: for embedded content (esp. Google docs), consider multi-tab elements and callout boxes, including 'hints'.
Innovations in Fundraising was an academic impact project and resource. innovationsinfundraising.org was hosted as an interactive Dokuwiki.
It aimed:
To explain and promote practical fundraising innovations stemming from academic research, to encourage trials and experiments, to promote effective giving and encourage collaboration and knowledge-sharing.
A key resource was a linked interactive database of 1. relevant papers, and 2. relevant 'tools'. Our automation tools allowed us to update this content via an Airtable, integrating it into the formatted DokuWiki table.
The project is no longer being hosted. Please contact David Reinstein to request access to any of the resources (or the underlying Airtable).
I (David Reinstein) took down innovationsinfundraising.org for several reasons including:
I didn't have time and funding to keep it updated, and I didn't want this to 'crowd out' others' work
Hosting costs (roughly $400 per year)
It was largely superseded (at least in my own work) by other resources and projects, including "EA Market Testing" (the present Gitbook and linked resources)
I would consider reviving this in the future, and would be happy to merge it with other maintained resources. Please contact me if you would like to pursue this.
Considering 'what information and ratings are out there about charity effectiveness, and how is it/should it/could it be presented?'
What are the existing sources of information and ratings about charity effectiveness? How credible are these? How are these presented, and how could/should they be presented?
Some weaknesses in their metrics -- See earlier post HERE
Updates: Went through recent impact ratings (briefly, picking charities); found some limitations:
"$670 provides an additional year of healthy life to a blood transfusion patient." (Note this is based on US data)
This seems implausible as an actual 'impact' of a $670 donation; it is not clearly considering the counterfactual
Updates 4 Oct 2022: There may be some promising developments within Charity Navigator; watch this space
Ratings have little or nothing to do with impact:
Guide Dogs for the Blind and Make-a-Wish are both top ('Platinum') rated ... we know these are ineffective (classic examples)
Against Malaria Foundation is unrated and "New Incentives" gets the lower 'Gold' rating -- both are top-rated on GiveWell
Also, note the Guidestar criteria:
The Platinum Seal of Transparency indicates that the Foundation shares clear and important information with the public about our goals, strategies, capabilities, achievements and progress indicators that highlight the difference the Foundation makes in the world.
It's about transparency, not impact.
: 100 out of 100 impact rating
Late-2024 update: This project is on hiatus/moved
Note from David Reinstein: The EA Market Testing team has not been active since about August 2023. Some aspects of this project have been subsumed by Giving What We Can and their Effective Giving Global Coordination and Incubation (lead: Lucas Moore).
Nonetheless, you may find the resources and findings here useful. I'm happy to answer questions about this work.
I am now mainly focused on making my current main project a success. I hope to return to some aspects of the EAMT and effective giving research projects in the future. If you are interested in engaging with this, helping pursue the research and impact, or funding this agenda, please contact me at daaronr@gmail.com.