Pledge page (options trial)

The presentation of options on GWWC's 'pledge page' was randomly varied at the individual browser level over a certain period, to see which presentation increased pledges.

A summary of this has been shared as a post on the EA Forum.

Notes on content/building this

This follows the Trial reporting template, edited slightly for public reading.

We intend to redo and augment much of this analysis in a more transparent way, directly importing the data and doing our own analyses rather than relying on Google's built-in tools. We intend to put this within the EAMT Analysis web-book.

Summary of trial and results

Giving What We Can (GWWC) has three giving pledge options, displayed in the 'Original presentation version' below.

From April to July 2021, GWWC ran a trial presenting its 'pledge page' options in three slightly different ways. Considering 'clicks on any button' as the outcome, and using a Bayesian 'preponderance of evidence' standard...

  • "Separate Bullets for Other Pledges" was the most successful presentation. It only showed a box for "The Pledge", with the other options given in less prominent bullet points below. This had about a 20% higher incidence rate than the Original presentation.

  • "Pledge before Try Giving" was the least successful presentation this was like the one displayed above, but with "Try Giving" in the central position. This had about a 23% lower incidence rate than the Original presentation.

These results may only apply narrowly to the GWWC pledge case, and even here, we have some caveats. However, they loosely suggest that when making a call to action, it may be most effective to present the most well-known and expected option most prominently, and not to emphasize the range of choices (see further discussion below).

Getting people to take the GWWC pledge may be seen as an important outcome on its own. It may have a causal impact on getting people engaged in the Effective Altruism community and other EA activities, such as EA career impact decisions.

General idea and main hypothesis

GWWC: How can we present pledge options to maximize positive outcomes (pledges, fulfillment)?

General: For those considering making substantial giving pledges (of a share of their income), how does the presentation of these 'pledge options' matter?

Theories and mechanisms to consider:

  • Tendency to choose 'middle options' (Simonson and Tversky 1992)

  • Too many options may lead to 'indecision paralysis'

  • The signaling power of choice; e.g., if there's a 'more virtuous choice' I may feel that my 'middle choice' looks less good by comparison

Background and context

GWWC has three distinct pledge options, as shown above:

1. "Try Giving" (1% of income),

2. "The Pledge" (10% of income)

3. The "Further Pledge" (donate all income above a living allowance).

These can be seen on the 'pledge page' (link from October 2020).

Three versions of this page were randomly presented (from 19-21 April to 10 July 2021).

The content of the key 'choice button' part varied between these three versions:

  1. "Original:" A block of three (in the order of commitment) 'The Pledge' (10%) in the center and highlighted (see above)

  2. "Pledge before TryGiving": A block of 3 with "Try Giving" (1%) in the center and highlighted

  3. "Separate Bullets for Other Pledges": A single block for 'The Pledge' (10%), with the other pledges given as clickable bullet points below (as well as a bullet for the 'company pledge' ... which had a different presentation in other versions)

The version presented stayed constant for each individual, according to IP/cookie tracking.

Points of contact, Timing of trial, Digital location of project/data, Environment

Points of contact

Julian Hazell (julian.hazell at givingwhatwecan.org), Luke Freeman

Participant universe and sample size

  • 'Everyone going to the above page' within the above time duration.

  • People interested in GWWC pledges

Sample size: see below, from Google Analytics

Key treatment(s)

  1. "Original" (Block of 3 in order of commitment, Middle Pledge in Center)

2. "Pledge before TryGiving" ... as above but with Try Giving and The Pledge swapped, and Try Giving (in the center) highlighted

3. "Separate Bullets for Other Pledges" (see below)

Treatment assignment procedure

  • Three versions of this page were randomly presented

  • Equal likelihood of assignment

The non-exact balance below seems to be an imbalance in 'sessions', not in 'participants'.

Our analysis should focus on outcomes per participant; thus, the figures below may need some adjusting (although at first pass, the results go in the same direction). This doesn't seem to be adaptive assignment. In Google's help on 'create an A/B test' they state:

All variants are weighted equally by default in Optimize. A visitor who is included in your experiment has an equal chance of seeing any of your variants.

The version presented stayed constant for each individual across visits.

Outcome data

Statistics on Google Analytics: This records only 'pressed any button' (any pledge) as the successful outcome.

Ideally, for future trials, this would include...

One entry per page view over the interval, detailing:

  • Whether pledged

  • Which pledge

  • Time and date of view, time spent on page, other clicks, location of user, any other information about the user

  • Most importantly:

    • Number of page views over the interval, by treatment

    • Number of pledges over the interval

      • by treatment

      • by type of pledge

    • Follow-up donations etc. (if connectable)

Ex-post: Reporting results (brief)

Implementation and data collection

See this guide for details on data extraction from the Google A/B (Optimize) interface.

  1. From shared image from Google Analytics:

'Experiment sessions' (observations) by treatment (as labeled on Google Analytics shared image):

Original: 2588

Pledge before Try Giving: 2686

Separate Bullets for Other Pledges: 2718

Total: 7992 sessions (=2588+2686+2718)

3. Where is the data stored ... [noted above]

Basic results/outcomes

Quick interpretation

The "separate bullets for other pledges" seems to have been the most successful, with an 0.49% higher (percentage point) incidence rate than the 'Original', i.e., a 22% higher rate of pledging (2.69 vs 2.20).

These differences seem unlikely to be statistically significant in a conventional sense. Still, Google analytics (presumably a reasonable Bayesian) model states an 80% chance that this is the best treatment, and this seems useful and informative.
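As a rough cross-check (not Google Optimize's actual model, whose priors are not published), a simple beta-binomial re-analysis in R, using conversion counts inferred from the reported rates and session totals, should land in the same ballpark as the 80% figure, though not exactly reproduce it:

```r
# Minimal sketch, not Optimize's model: flat Beta(1,1) priors are an assumption,
# and conversion counts are inferred from the reported rates and session counts.
set.seed(1)
sessions    <- c(Original = 2588, PledgeBeforeTryGiving = 2686, SeparateBullets = 2718)
rates       <- c(Original = 0.0220, PledgeBeforeTryGiving = 0.0171, SeparateBullets = 0.0269)
conversions <- round(sessions * rates)   # approx. 57, 46, 73

n_draws <- 1e5
post <- sapply(seq_along(sessions), function(i)
  rbeta(n_draws, 1 + conversions[i], 1 + sessions[i] - conversions[i]))
colnames(post) <- names(sessions)

# Posterior probability that each variant has the highest conversion rate
prob_best <- table(factor(colnames(post)[max.col(post)], levels = colnames(post))) / n_draws
round(prob_best, 2)
```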

If anything, these results for 'Separate Bullets' seem potentially understated...

Note that GA is reporting conversions based on sessions (contiguous use periods) and not users. We can reasonably assume that a roughly equal number of users were assigned to each treatment (as per the design). As a result, we assume that roughly equal shares 'viewed the relevant page at least once' (because of the law of large numbers). However, the most successful treatment, the 'Separate block', is recording more sessions. Thus, the relative conversion rate, as a share of users, would be even higher than the one reported here, relative to the baseline.
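A hypothetical back-of-envelope illustrating this point: if each arm was assigned roughly the same number of users, total conversions (rate × sessions) are proportional to per-user conversion rates, so the relative lift of 'Separate Bullets' over 'Original' grows a bit:

```r
# Illustrative only; assumes equal numbers of users per arm (as per the design).
sessions <- c(Original = 2588, SeparateBullets = 2718)
rates    <- c(Original = 0.0220, SeparateBullets = 0.0269)

per_session_lift <- unname(rates["SeparateBullets"] / rates["Original"] - 1)
per_user_lift    <- unname((rates["SeparateBullets"] * sessions["SeparateBullets"]) /
                           (rates["Original"] * sessions["Original"]) - 1)
round(c(per_session = per_session_lift, per_user = per_user_lift), 3)
# roughly 0.22 per session vs 0.28 per user under this assumption
```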


Aside on statistics

Optimize uses Bayesian inference to generate its reports... Optimize chooses its priors to be quite uninformed.

DR: But this still doesn't tell us what these priors are. There's a lot of sensitivity to this choice, in my experience.

Dillon: there is possibly a more sophisticated approach to this than what Google is doing ... the better prior would be an 'empirical Bayes' approach (but it may be controversial); see more on empirical Bayes.
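A minimal sketch of what the empirical-Bayes suggestion could look like (the historical conversion rates below are hypothetical, purely for illustration): fit a Beta prior to past rates by method of moments and inspect how strong the implied prior is.

```r
# Hypothetical historical conversion rates from comparable pages/periods
hist_rates <- c(0.018, 0.022, 0.025, 0.021, 0.019, 0.024)

m <- mean(hist_rates)
v <- var(hist_rates)
prior_n <- m * (1 - m) / v - 1      # implied 'prior sample size' (alpha + beta)
alpha0  <- m * prior_n
beta0   <- (1 - m) * prior_n
round(c(alpha = alpha0, beta = beta0, prior_sample_size = prior_n), 1)

# A prior_n in the thousands would dominate ~2,700 sessions of trial data,
# illustrating the sensitivity-to-prior concern raised above.
```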

The "Pledge Before Try giving" treatment performed substantially worse than the original.

The poor performance of ‘pledge before try giving’...

The poor performance of ‘pledge before try giving’ appears even more substantial than the strength of ‘Separate Block’. It even seems to border on conventional statistical significance … I expect that in a standard comparison of the latter two treatments, we’d find conventional statistical significance.
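A quick frequentist check of this expectation, using conversion counts inferred from the reported rates and session totals (so approximate only):

```r
# 'Separate Bullets' ~73/2718 vs 'Pledge before Try Giving' ~46/2686 (inferred counts)
prop.test(x = c(73, 46), n = c(2718, 2686))
# Two-sample test of equal proportions; with these inferred counts the p-value
# comes out around 0.02, i.e., 'significant' by the conventional 5% standard.
```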

These differences are meaningful; consider the 'posteriors':

Downloading the 'Analytics data' behind the above graphs, we see:

Variant | 2.5th Percentile | 25th Percentile | Modeled Improvement (median) | 75th Percentile | 97.5th Percentile
Original (baseline) | 0% | 0% | 0% | 0% | 0%
Pledge Before Try Giving | -50% | -33% | -23% | -11% | 18%
Separate Bullets for Other Pledges | -18% | 4% | 20% | 36% | 76%

(All percentile columns are 'modeled improvement' relative to the Original.)

This suggests it is very reasonable to think that 'Separate Bullets' is substantially better

Our 'posterior' probability thus implies that we should put:

  • a 2.5% chance that 'Separate Bullets' (SB) has an 18% (or more) lower conversion rate than 'Original'

  • a 22.5% chance on SB being between 18% worse and 4% better

  • a 25% chance of SB being 4-20% better

  • a 25% chance of SB being 20-36% better

  • a 22.5% chance of SB being 36-76% better

  • a 2.5% chance of SB being more than 76% better

We can also combine intervals, to make statements like ...

  • a 50% chance of being 4-36% better

  • a 50% chance of being at least 20% better

For 'Pledge before...' (PB) we can state, e.g.,

  • PB has a 75% chance of being at least 11% worse than Original

  • and a 50% chance of being at least 23% worse than Original
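These interval statements can be approximately reproduced with the same kind of posterior simulation as in the sketch above (same inferred counts and flat priors, so the numbers will be close to, but not identical to, the Optimize figures):

```r
# Continuing the earlier sketch: quantiles of the simulated relative improvement
# over 'Original', using inferred conversion counts and flat Beta(1,1) priors.
set.seed(1)
draw <- function(conv, n, k = 1e5) rbeta(k, 1 + conv, 1 + n - conv)
orig <- draw(57, 2588)   # ~2.20% of 2588 sessions
sb   <- draw(73, 2718)   # ~2.69% of 2718 sessions
pb   <- draw(46, 2686)   # ~1.71% of 2686 sessions

quantile(sb / orig - 1, c(0.025, 0.25, 0.5, 0.75, 0.975))  # cf. the 'Separate Bullets' row
mean(pb / orig - 1 <= -0.23)                               # cf. 'PB at least 23% worse'
```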

Intuitive interpretation

Perhaps giving people more options makes them indecisive. They may be particularly reluctant to choose a “relatively ambitious giving pledge” if a less ambitious option is highlighted.

This could also involve issues of self and social signaling. If the 'main thing' to do is a 10% pledge (as in "Separate Bullets"), then this may seem a straightforward way of conveying 'I am generous'. On the other hand, if the 'Further Pledge' is fairly prominent, perhaps the signal feels less positive. And if the '1% pledge' is made central, a 10% pledge might seem like more than is necessary to send the signal.

The "pledge before try giving" may perform the worst because it makes the 'Try Giving' pledge a particularly salient alternative option. (In contrast, the "Original" at least makes 'The 10% Pledge' the central and the middle option.)

But in this case, why should the overall pledge rate (any button-press) be lower with more options (Original vs 'Separate Bullets'), and lower still when Try Giving is made central?

It's hard to say too much if we don't know the composition of the pledges people make.

Still, it might be that people mainly came in with the desire to take The Pledge (10%), as this is most heavily promoted. In such a case, making other pledge possibilities prominent may A. cause people to rethink their choices and delay a decision (perhaps never returning), and/or B. make them feel less comfortable with the overall 'signal' their pledge will send. This doesn't mean that the 'multiple boxes' environment is worse overall, but it may perform worse for those people coming here, as these were the people particularly attracted by the '10% is the main thing' signaling environment.

Caveats

I am assuming that the 'outcome being measured here' is whether the person 'clicked on any giving pledge'; this is what Luke has conveyed to me

I assume this is 'conversions ever from this IP', and 'sessions' represents 'how many different IPs came to the treatment'. If it's something else (e.g., each 'session' is a 'visit' from an individual), this could reflect these people converting in fewer sessions but not necessarily being more likely to convert overall. Even if this is 'by IP' the alternative interpretation 'not converting now but maybe later' may still have some weight if people are entering through multiple devices.

chevron-rightWe should try to focus more carefully on 'whether this is having any effect on ultimate pledge-taking and pledge-follow-through behavior'.hashtag

I would be surprised if a moderate difference in the framing of a particular page should have such a large ((2.69-1.71)/1.71 ≈ 57%) impact on the incidence of such a large life choice, involving at least tens of thousands of dollars. However, I still expect the incidence of 'click this button' to be related to that ultimate outcome; thus, I suspect these results are still informative and useful as they stand.


'Academic' contact: David Reinstein.

Timing of trial (when will it/did it start and end, if known)

Start: 19 April 2021 (or 21 April)? End: 10 July 2021 (Source: Google Analytics)

Digital location where project 'lives'

(Planning, material, data)

Statistics are available on Google Analytics/Optimize. Reinstein has access to this and is planning to import the data into R for more detailed analysis, to be reported in the analysis web book.

The present document is currently (11 May 2022) the only writeup.

Environment/context for trial

https://www.givingwhatwecan.org/pledge/ ... see above

Variation in the presentation of the pledge options


[Image: Pledge page "Original"]

[Image: Performance of the three versions, shared from Google Optimize]

Message Test (Feb 2022)

Summary

Main Question: Do some message themes work better than others for drawing visitors to Giving What We Can’s landing page?

Main findings: 'Social proof messages' on Facebook ads were most effective at generating landing page views per dollar compared to other message themes (effectiveness, services, giving more, and values).

Future directions: There were significant differences in 'link clicks per dollar' on the different messages by age. We recommend a systematic test to determine if age makes a difference in the relative effectiveness of social proof and values messages. Future studies could explore why the social proof message was more effective in this study than in the previous giving guide study, and the importance of the message to “join” the movement as social proof.

Possible connection between this trial and the Giving guides - Facebook trial: Note that the two best-performing messages both prompted the user to “join” a movement or a group of people (perhaps an elite group); but beware ex-post theorizing.

Link to report below.

Pre-trial reporting template

General idea, main 'hypothesis' (if there is one)

In this test, we are aiming to find out if one 'theme' of messages resonates better with our target audience than others.

If we knew which 'themes' were most effective in our advertising, then we could create more ads on this theme and improve our conversion.

Specifically, which of the following themes resonates with our target audience the most:

• effectiveness

• giving more

• social proof

• services

• values

On choosing an objective for this test: originally I planned to use link clicks, but this is not the highest-quality indicator of conversion, and when I tried to use newsletter signups, Facebook warned me that I might not see any conversions at all. So instead, the campaign will optimise for landing page views, which is slightly better than a link click and will generate enough conversions that we should see statistically significant results.

Point of contact (at organization running trial)

Grace Adams

Timing of trial (when will it/did it start and end, if known)

Trial will run for 7 days on GWWC's ad account, from 9.30am AEDT Friday 25 Feb to 9.30am AEDT Friday 4 Mar.

Digital location where project 'lives' (planning, material, data)

Working document can be found here, but all important details will be listed in this brief.

Environment/context for trial

This test will take place on Meta platforms, including Facebook and Instagram.

Participant universe and sample size

• We are targeting a "Philanthropy and Lookalikes (18-39)" audience, based in the UK, US, or Netherlands

• Estimates from Facebook: Reach is expected to be 1.4K-4.1K per day (7 days) per ad set (5 ad sets) = 49K-143K

• Estimates from Facebook: Conversion is expected to be 10-30 landing page views per day (7 days) per ad set (5 ad sets) = 350-1050 (see the rough power check below)
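A rough power check on these estimates (all inputs below are approximations derived from the Facebook figures above, not trial data): with roughly 10-29K reach and 70-210 landing-page views per ad set, the implied conversion rate is around 0.7%, and we can ask what lift between two themes would be detectable.

```r
# Back-of-envelope only; 20,000 per arm and a 0.7% baseline are assumptions
# taken from the midpoint of Facebook's estimates above.
reach_per_adset <- c(49000, 143000) / 5
conv_per_adset  <- c(350, 1050) / 5
round(conv_per_adset / reach_per_adset, 4)   # ~0.007 at either end

power.prop.test(n = 20000, p1 = 0.007, power = 0.8, sig.level = 0.05)
# Solves for p2: the smallest 'treatment' conversion rate that would be
# reliably detectable (80% power, 5% two-sided test) with ~20K users per arm.
```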

Key treatment(s)

We are using the GWWC Brand Video by Hypercube as the creative across all tests. Although it did not perform as well as our other ads in the Giving Guide campaign, I think it will interfere less with the messages we aim to test.

We are going to test a set of messages for each theme; please see them in the google doc linked.

Mock-up of ad:

Treatment assignment procedure

• This test has been set up as an A/B test through Facebook, testing each campaign head to head; each campaign covers one theme, with the different ads as children.

• This will allow us to test which theme was better, not just which individual ads performed better.

• A/B testing on Facebook will ensure that each audience member falls into a single treatment group.

Outcome data

• The primary measure will be cost per landing page view, but secondary measures such as CPC, 3-second video plays, and email sign-ups will also be tracked.

• Data will live on the Meta ads platform.



YouTube Remarketing

GWWC YouTube remarketing campaign (trial)

See also the cross-organization notes on advertising, Google, YouTube, etc. (= placeholder for now) and the tips on Doing and funding ads.

YouTube Remarketing

July 20, 2021: GWWC launched a YouTube remarketing campaign. That means that when someone goes to the GWWC website, leaves, and then goes to YouTube, we show them one of the following videos:

• Example 1
• Example 2
• Example 3
• Example 4
• Example 5
• Example 6
• Example 7

An algorithm decides which video to present to people.

Understanding assignment, proposing experimental design: @Joshua Lewis's questions

Q: Is each video assigned to a different situation or are videos randomly chosen to be displayed? If the latter, you could randomize videos by location and see if the different videos were more or less effective. Alternatively, just randomizing the whole campaign seems like a good idea to me....

A: Videos are selected based on the likelihood of the user watching >30 seconds (by the algorithm) ... randomization by individual will be hard because users don't click and act right away. Instead, I think we have to randomize by geography.

Results summary (Early, JS Winchell; may need update)

Most important takeaway: It costs ~$1 to get a website visitor to watch one hour of your videos! (A quick arithmetic check follows the metrics below.)

High-level metrics:

• Cost: $205

• Views: 6,071 (a view is when a user chooses to watch >30s of an ad)

• Unique viewers: 4,937 (this is an estimate)

• Average impressions per user: 5.8

• View rate: 20% (20% of the time people choose to watch more than 30s)

• Total watch time: 223 hours (~$1/h)

• CTR: 0.37%

• Average CPC: $1.83

• Conversions (users spending >30s on the website): 2

• Thinking: 'This is not a good tactic for driving site traffic or donations (although we could optimize for this instead if we wanted)'
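A quick arithmetic check of the headline numbers (all figures as reported above; rounding makes these approximate):

```r
cost        <- 205
views       <- 6071    # >30s views
watch_hours <- 223

round(cost / watch_hours, 2)   # ~0.92 -> roughly $1 per hour watched
round(cost / views, 3)         # ~0.034 per >30s view
```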

Interesting observations:

1. Efficiency has significantly improved over 3 weeks

• Cost per view has gone down from $0.05 per >30s view to $0.02 per >30s view

• Views have increased 75% without increasing budget (from 220/day at the start to 386 yesterday)

2. 10% of the time, people watched the full video!

• You can see this data by video if you want to control for video length

• E.g., 5% of people chose to watch the entire 13 minutes of this video

3. Your best video had a view rate (% of the time people choose to watch >30s) twice as good as your worst video

4. You can see view rate by age, gender, and device in the "Analytics" tab

• For the 13m video, older people and men were more likely to choose to continue watching

Possible next steps

• Could add "similar audiences", which is when we let Google use machine learning to find people similar to your website visitors and also show ads to them

• Could walk David Reinstein and Joshua Lewis through the UI so they can get a sense of the metrics/reporting available and how it could be used for research

Giving What We Can

Luke Freeman is the lead contact.

Giving What We Can's mission is to make giving effectively and significantly a cultural norm. GWWC has updated their 2022 strategy. They are looking to significantly increase their marketing activity by producing videos, funding ads, and conducting systematic and robust research. As such, there will be a large crossover between our work and theirs. This section highlights our collaborative efforts.

Presentation: overview

Giving guides - Facebook

Along with GWWC, we tested marketing and messaging themes on Facebook in their Facebook Lead campaigns. Across four trials, we compared the effectiveness of different types of (1) messages, (2) videos, and (3) targeted audiences.

A summary of this has been shared as a post on the EA Forum.

We build the results and analysis transparently in the EAMT Analysis web-book.

Ideas and opportunities

We want to learn from existing work, run tests on the GWWC platform, and support research into this.

Stages of the funnel:

1. Awareness & Consideration

   Increase casual visitors and raise curiosity

2. Conversion & Acquisition

   Donate or pledge to donate

3. Retention

   Fulfill and report pledge

4. Advocacy

   Promoting GWWC to others

Some key questions

• “What should the call to action be for the casual person in the funnel?”

• Testing all parts of the funnel/pledge journey: website, welcome messages/welcome packages, reminders, and thank-yous

Completed studies: See sections below
Summary

Context: Facebook ads on a range of audiences

... [with text and rich content promoting effective giving and a "giving guide" -- links people to a Giving What We Can page asking for their email in exchange for the guide]

Objective: Test distinct approaches to messaging, aiming to get people to download our Giving Guide. A key comparison: "Charity research facts" vs. "cause focus".

Also informative about costs and the 'value of targeting different groups' in this context.

Key findings:

• The cost of an email address via a Facebook campaign during Giving Season was as low as $8.00 across campaigns.

• “Only 3% of people give effectively” seems to be an effective message for generating link clicks and email addresses, relative to the other messages.

• Lookalike and animal rights audiences seem to be the most promising audiences.

• Demographics are not very predictive on a per-$ basis.

Key caveats

Specificity and interpretation: All comparisons are not for 'audiences of similar composition' but for 'the best audience Facebook could find to show the ads, within each group, according to its algorithm'. Thus, differences in performance may combine 'better targeting' with 'better performance on the targeted group'. See our discussion of the 'divergent delivery' problem here. I.e., we can make statements about "what works better on Facebook in this context and maybe similar contexts", but not about "which audience, as defined, is more receptive", as the targeting within each audience may differ in unobserved ways.

The outcome is 'click to download the giving guide'.

Previous writeup and results

Link to the previous Gdoc report
