
Pre-giving-tues. email A/B

Context: Donation 'upsell' to existing pledgers

Question: Are effectiveness-minded (EA-adjacent) donors and pledgers more motivated to donate by

"A": (non-quantitative) presentation of impact and effectiveness (as in standard OftW pitch)
"B": Emotional appeals and 'identified victim' images

Further information on the experiment and its outcomes is in the in-depth, replicable analysis, organized as a dynamic document here

General idea, main 'hypothesis'

Are effectiveness-minded (EA-adjacent) donors and pledgers more motivated to donate by

"A": (non-quantitative) presentation of impact and effectiveness (as in standard OftW pitch)
"B": Emotional appeals and 'identified victim' images

In the context of One for The World's (OFTW) 'giving season upselling campaign', potentially generalizable to other contexts.

Academic framing: "Does the Identifiable Victims Effect (see e.g., the meta-analysis by Lee and Feeley, 2016) also motivate the most analytical and committed donors?"

Background and context

One for The World's (OFTW) 'giving season upselling campaign'

10 emails total over the course of November were sent in preparation for GivingTuesday

Point of contact (at organization running trial)

Timing of trial

November 10, 18, and 23, all in 2021, but may be delayed for feasibility

Digital location where project 'lives' (planning, material, data)

Present Gitbook, Google doc linked below, preregistration (OSF), and github/git repo

Environment/context for trial

Emails ... to existing OftW pledgers (asking for additional donations in Giving Season)

All 10 emails had the same CTA: make an additional $100 donation for the giving season/GivingTuesday on top of their recurring monthly pledge donation.

Participant universe and sample size

Roughly 4000 participants: a series of three campaign emails will be sent out by OftW to their regular email lists, as described in the preregistration.

Key treatment(s)

A list of ~4500 contacts (activated pledgers) was split into two treatment groups.
Treatment Group A received emails focused on the contact's impact,
while Treatment Group B received emails focused on individual stories of beneficiaries.

See preregistration, treatment specifics

Treatment assignment procedure

See preregistration How many ... conditions

Outcome data

Targeting: Donation incidence and amount in the relevant 'giving season' and over the next year, as specified in the preregistration under 'key dependent variable(s)'

Data storage/form:

MailChimp data (Chloe is sharing this),
Reports on donations (Kennan is gathering this)

Optional/suggested additions

Planned analysis methods, preregistration link here

Cost of running trial/promotion: Time costs only (as far as I know)

Proposed/implementing design (language)

(Link)

Pre-registration work

Pre-registered on OSF in 'AsPredicted' format, content incorporated here

Preliminary results

Overview:

The Emotion treatment leads to significantly fewer people opening emails, but more people clicking on the in-email donation link (relative to the standard Impact information treatment). However, we are statistically underpowered to detect a difference in actual donations. More evidence is needed.

Chloe: those emails that appealed to emotional storytelling performed better (higher in-email click rate) than those that were impact-focused.

DR, update: I confirm that this is indeed the case, and this is statistically significant in further analysis.

Evidence on donations

(preliminary; we are awaiting further donations in the giving season) ...

This is 'hard-coded' below. I intend to replace this with a link or embed of a dynamic document (Rmarkdown). The quantitative analysis itself, stripped of any context and connection to OftW, is hosted HERE

Note: We may wish to treat 'emails sent' as the denominator, as the differing subject lines seem to have led to a different number of opens

Treatment 1 (Impact): We record

1405 unique emails listed as opening a ‘control’ treatment email
29 members clicking on the donation link in an email at least once (2.1% of openers)
15 members making some one-time donation in this period (about 1.1% of openers, 0.75% of total)
8 members donating (likely) through the link (0.57%/0.4%)

Treatment 2 (Emotional storytelling):

1190 unique emails listed as opening an email (a significantly lower 'open rate', assuming equal shares of members were sent each treatment's emails)
56 members clicking on the donation link in an email at least once (4.7% of openers)
11 members making some one-time donation in this period (about 0.9% of openers, about 0.55% of total)
9 unique emails donating (likely) through the link (0.8%/0.45%)
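
The click comparison above can be checked directly from the reported counts. The following is a minimal sketch, not the analysis code behind this page: it recomputes the click rates among openers and runs a two-sided Fisher's exact test on the resulting 2x2 table. The function names are illustrative. Note that 'openers' is itself affected by treatment, so this comparison is conditional on opening.

```python
from math import exp, lgamma

# Counts reported above: unique openers and in-email donation-link clickers.
OPENERS = {"impact": 1405, "emotion": 1190}
CLICKERS = {"impact": 29, "emotion": 56}


def log_comb(n: int, k: int) -> float:
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    log_denom = log_comb(row1 + row2, col1)

    def log_prob(x: int) -> float:
        return log_comb(row1, x) + log_comb(row2, col1 - x) - log_denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    observed = log_prob(a)
    # The small tolerance guards against floating-point ties.
    total = sum(exp(log_prob(x)) for x in range(lo, hi + 1)
                if log_prob(x) <= observed + 1e-9)
    return min(1.0, total)


click_rate = {arm: CLICKERS[arm] / OPENERS[arm] for arm in OPENERS}
p_value = fisher_exact_two_sided(
    CLICKERS["impact"], OPENERS["impact"] - CLICKERS["impact"],
    CLICKERS["emotion"], OPENERS["emotion"] - CLICKERS["emotion"],
)
print({arm: round(100 * rate, 1) for arm, rate in click_rate.items()}, p_value)
```

Swapping in 'emails sent' per treatment as the denominator, once those counts are available, only requires changing the OPENERS dictionary.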


‘Initial impressions of preliminary outcomes’

The conversion rates are rather low (0.5%) … but maybe high enough to justify sending these emails? I’m not sure.
While people are more likely to open at least one Impact email, they are more likely to click to donate at least once if assigned the Emotion email
But we can't say much for actual donations.
Given the low conversion rates we don’t have too much power to rule out ‘proportionally large’ differences in conversion rates (or average amounts raised) between treatments …

The figure above seems like a good summary of the ‘results so far’ on ‘what we can infer about relative incidence rates’, presuming I understand the situation correctly. On the Y-axis I plot how likely we would be to see a difference in donation incidence between treatments ‘as small or smaller in magnitude’ than the one in our data; on the X-axis, the hypothesized ‘true difference in incidence rates’.

Implementation and management: Chloe Cudaback, Jack Lewars

Our data is consistent with ‘no difference’ (of course) … but it's also consistent with ‘a fairly large difference in incidence’
E.g., even if one treatment truly led to ‘twice as many donations as the other’, we would still have a 33% chance or so of seeing a difference as small as the one we see
We can reasonably ‘rule out’ differences of maybe 2.5x or greater
Main point: given the rareness of donations in this context, our sample size doesn’t let us make very strong conclusions in either direction about donations
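
The reasoning in these points can be sketched as an exact calculation. This is an illustration under stated assumptions, not the computation behind the figure, and it is not expected to reproduce the figure's values: it assumes 2000 recipients per treatment, uses the 15 vs 11 one-time donors reported above, and scales the two hypothetical true rates so that the expected total number of donors equals the observed total. The function names are illustrative.

```python
from math import exp, lgamma, log, log1p


def binom_pmf(k: int, n: int, p: float) -> float:
    """Binomial probability of k successes in n trials, computed in log space."""
    log_pmf = (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
               + k * log(p) + (n - k) * log1p(-p))
    return exp(log_pmf)


def prob_diff_as_small(n_per_arm: int, obs_a: int, obs_b: int,
                       true_ratio: float, k_max: int = 120) -> float:
    """P(|X_hi - X_lo| <= |obs_a - obs_b|) when one arm's true donation
    rate is `true_ratio` times the other's.

    Assumption: the two true rates are scaled so that the expected total
    number of donors equals the observed total (obs_a + obs_b).
    Counts above k_max are ignored (negligible probability here).
    """
    p_lo = (obs_a + obs_b) / (n_per_arm * (1 + true_ratio))
    p_hi = true_ratio * p_lo
    pmf_lo = [binom_pmf(k, n_per_arm, p_lo) for k in range(k_max + 1)]
    pmf_hi = [binom_pmf(k, n_per_arm, p_hi) for k in range(k_max + 1)]
    gap = abs(obs_a - obs_b)
    return sum(pmf_lo[i] * pmf_hi[j]
               for i in range(k_max + 1)
               for j in range(max(0, i - gap), min(k_max, i + gap) + 1))


# Hypothetical inputs: 2000 recipients per arm; 15 vs 11 one-time donors.
for ratio in (1.0, 1.5, 2.0, 2.5, 3.0):
    print(ratio, round(prob_diff_as_small(2000, 15, 11, ratio), 3))
```

The probability falls as the hypothesized ratio grows, which is the sense in which very large true differences can be 'ruled out' while moderate ones cannot.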

Preregistration: OftW pre-GT

Academic-linked authors: David Reinstein, Josh Lewis, potentially others going forward

Implementation and management: Chloe Cudaback, Jack Lewars

AsPredicted questions

1) Have any data been collected for this study already?

No, no data have been collected for this study yet.

2) What's the main question being asked or hypothesis being tested in this study?

Are effectiveness-minded (EA-adjacent) donors and pledgers more motivated to donate by

"A": A (non-quantitative) mention of impact and effectiveness (in line with the standard OftW pitch)
"B": Emotional appeals and 'identified victim' images

Framing this in terms of the psychology, social science, and philanthropy literature:

"Does the Identifiable Victims Effect (see e.g., meta-analysis by Lee and Feeley, 2016) also motivate the most analytical and committed donors?"

3) Describe the key dependent variable(s) specifying how they will be measured.

d_don_specific: Whether the person receiving the series of emails makes an additional 'one time gift' following the link at OftW, within the OftW interface, during the 'Giving Season', a time-period that (for this preregistration) we declare to begin on receipt of this first email and end on 15 January 2022.
don_specific: The total amount donated through the above
don_general_gs: (If observable), the amount the person donates during the 'Giving Season', as observed through the OftW/donational/Plaid network
don_general_1yr: (If observable), the amount the person donates during the 'Giving Season' and for the following year (ending 15 January 2023) as observed through the OftW/donational/Plaid network
d_continue_pledge_1yr: Whether the person is still an active OftW pledger a year after the current giving season (15 January 2023)

4) How many and which conditions will participants be assigned to?

Two conditions (treatments):

A. "Impact"

B. "Story/Emotion"

Assignment details

Participants (c. 4000 people at various points in the One for the World pledge process) will be split into groups (blocks) by previous donation behavior or point in the process. (OftW have mentioned pledgers still in school, active donors, and lapsed donors.)

Within each group, they will be randomized (selection without replacement to ensure close-to-exact shares) into equal shares in treatments A and B.
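
A minimal sketch of this kind of blocked assignment, assuming each participant comes as a hypothetical (id, block) pair; it is not the procedure OftW actually ran. Within each block the ids are shuffled (sampling without replacement) and dealt alternately to A and B, so the two arms differ by at most one person per block.

```python
import random
from collections import defaultdict


def blocked_assignment(participants, seed=2021):
    """Assign each (participant_id, block) pair to treatment "A" or "B".

    Ids are shuffled within each block and dealt alternately, so the arm
    sizes differ by at most one per block.
    """
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    by_block = defaultdict(list)
    for pid, block in participants:
        by_block[block].append(pid)

    assignment = {}
    for block in sorted(by_block):
        ids = sorted(by_block[block])
        rng.shuffle(ids)  # random order = selection without replacement
        offset = rng.randrange(2)  # which arm gets the extra person in odd blocks
        for position, pid in enumerate(ids):
            assignment[pid] = "AB"[(position + offset) % 2]
    return assignment


# Hypothetical toy list; block labels follow the groups OftW mentioned.
blocks = ["student"] * 5 + ["active"] * 6 + ["lapsed"] * 4
people = [(f"p{i}", block) for i, block in enumerate(blocks)]
arms = blocked_assignment(people)
print(arms)
```

Recording the seed alongside the participant list lets the assignment be regenerated and audited later.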

Treatment specifics (i.e., 'experimental conditions')

A series of three emails will be sent, with participants remaining in the same treatment across all three emails.

See actual texts for design and timing HERE

Example content differences, from email 1:

A. Impact version:

As of 2021, One for the World has had a tremendous impact on the lives of those that are helped by our charity Top Picks programs:

[IMPACT SINCE 2021 GRAPHIC]

B. Story/Emotion version:

Here’s our first story this season from Eunice of Kenya. When asked how her life changed when she received the first cash transfer from our partner organization, GiveDirectly, she responded:

“I have been able to make new goals and achieve them since I started receiving this money [from GiveDirectly]. I have been able to buy a piece of land that would have taken [me] many years to earn [enough to buy the land]. I was also able to buy livestock, like goats. I have even managed to dress my family properly by buying them decent clothing. Lastly, I have even been able to [pay my children’s] school fees without any strain.” (Source GiveDirectlyLive)

[PICTURE OF EUNICE]

5) Specify exactly which analyses you will conduct to examine the main question/hypothesis.

We will report all of the following analyses, with our preferred method in bold:

Binary outcomes:

Fisher's exact test
Bayesian Test of Difference in Proportions (as in here), with an informative beta distribution for the prior over the incidence rate in each treatment, with a parameter based on the incidence rates for similar campaigns in the prior 2 years.
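
As an illustration of the Bayesian comparison of proportions, here is a minimal Monte Carlo sketch. The prior parameters, sample sizes, and function name are assumptions made for this example only; the preregistered prior is to be based on incidence rates from earlier campaigns, which are not given here.

```python
import random


def posterior_prob_b_greater(successes_a, n_a, successes_b, n_b,
                             prior_alpha=1.0, prior_beta=1.0,
                             draws=20000, seed=1):
    """Monte Carlo estimate of P(rate_B > rate_A | data), plus a 95%
    credible interval for rate_B - rate_A, under independent Beta priors.

    With a Beta(alpha, beta) prior and binomial data, each posterior is
    Beta(alpha + successes, beta + failures).
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(draws):
        rate_a = rng.betavariate(prior_alpha + successes_a,
                                 prior_beta + n_a - successes_a)
        rate_b = rng.betavariate(prior_alpha + successes_b,
                                 prior_beta + n_b - successes_b)
        diffs.append(rate_b - rate_a)
    diffs.sort()
    prob = sum(d > 0 for d in diffs) / draws
    interval = (diffs[int(0.025 * draws)], diffs[int(0.975 * draws) - 1])
    return prob, interval


# Hypothetical inputs: a Beta(1, 99) prior (mean incidence 1%) and
# 15 vs 11 donors out of an assumed 2000 recipients per treatment.
prob, interval = posterior_prob_b_greater(15, 2000, 11, 2000,
                                          prior_alpha=1.0, prior_beta=99.0)
print(prob, interval)
```

Passing prior_alpha=1.0 and prior_beta=1.0 gives the flat-prior version of the same comparison.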

Continuous outcomes:

Standard rank-sum tests (Mann–Whitney U test)
Simulation/permutation-based tests for whether the mean (including 0's) is higher in group A or B
... same for median, but medians will almost always be 0, we anticipate
T-test with unequal variance

All tests will be 2-sided.
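
A minimal sketch of the permutation test for the mean, assuming each treatment's outcomes come as a list of donation amounts with non-donors recorded as 0. The toy amounts below are hypothetical, not trial data.

```python
import random


def permutation_test_mean_diff(amounts_a, amounts_b, n_perm=5000, seed=1):
    """Two-sided permutation p-value for the difference in mean donation.

    Zeros (non-donors) stay in both samples, so the statistic is the mean
    amount per recipient rather than per donor.
    """
    rng = random.Random(seed)
    n_a, n_b = len(amounts_a), len(amounts_b)
    observed = abs(sum(amounts_a) / n_a - sum(amounts_b) / n_b)
    pooled = list(amounts_a) + list(amounts_b)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # re-label recipients at random
        diff = abs(sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b)
        if diff >= observed - 1e-12:
            extreme += 1
    # The add-one correction keeps the estimate strictly positive.
    return (extreme + 1) / (n_perm + 1)


# Hypothetical toy data: 200 recipients per arm, a handful of $100 gifts.
arm_a = [100.0] * 3 + [0.0] * 197
arm_b = [100.0] * 2 + [0.0] * 198
print(permutation_test_mean_diff(arm_a, arm_b))
```

With so few non-zero amounts the permutation distribution is coarse, which is one reason the rank-based and t-test results are reported alongside it.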

We will also report Bayesian credible intervals and other Bayesian measures for the proportion tests. We may also explore Bayesian approaches for the continuous outcomes, e.g., Bayesian beta regression.

We also anticipate reporting multiple-hypothesis-test corrections, but we are not pre-registering a method. Our approach is likely to follow that of List et al. (2017), who applied it to a similar domain (charitable giving experiments with multiple donation-related outcomes).

We will report confidence intervals on our results as well as Bayesian credible intervals under flat and weakly informative priors. Where we have a 'near-zero' result, we will try to put reasonable bounds on it to convey the extent of our certainty that the true effect or parameter was fairly small.

Where situations arise that have not been anticipated in our preregistration and pre-analysis plan, we will try to follow the Don Green lab standard operating procedures unless there is a very strong reason to deviate from this, which we will specify.

6) Describe exactly how outliers will be defined and handled, and your precise rule(s) for excluding observations.

Included: All individuals who received this mailing.

We will not exclude any observations from the sample, unless they make it clear to us that they are aware of this trial.

We will not Winsorise or exclude outliers.

7) How many observations will be collected or what will determine sample size?

A series of three campaign emails will be sent out by OftW to their regular email lists, to roughly 4000 participants, as described above

Targeted dates: November 10, November 18, November 23, all in 2021, but these may be delayed for feasibility

Other

Anything else you would like to pre-register? (e.g., secondary analyses, variables collected for exploratory purposes, unusual analyses planned?)

Exploratory and secondary hypotheses/questions/analyses

Secondary hypotheses and questions

Which treatment motivates a higher rate of...

Email open rates (note: as we have three observations per participant, we will need random effects or clustered standard errors), and
Click rates (with the same caveat)?

We consider these secondary because click and open rates do not necessarily relate strongly to the outcomes of interest, particularly among this set of already effectiveness-minded donors. These outcomes may simply reflect attention or curiosity about the content.

Exploratory: what factors (especially gender, university/student status, university subject) predict which treatment leads to greater donation (incidence and amount)

Note that our partner is planning to use this trial to inform future trials and experiments, particularly for the 'Giving Tuesday' season itself.

Power calculations

We did not have time to do even simple power calculations before the start date of this experiment. However, we will try to conduct these before we obtain any of the data, and update this preregistration.

Preregistration (PDF): https://github.com/daaronr/effective_giving_market_testing/blob/main/contexts-and-environments-for-testing/one-for-the-world/preregistration_oftw_pre_gt.pdf
