Florian Zettelmeyer describes a key challenge businesses face in their marketing efforts: understanding the return on advertising investment, especially for digital ads served on platforms like Google and Facebook.
“Advertisers basically want to know what happens to someone who sees the ad compared to someone who doesn’t—that’s the causal effect of the ad, which translates directly into a return on investment for the money spent,” explains Zettelmeyer, a professor of marketing at the Kellogg School of Management. “But the problem is that because of algorithmic targeting, the people who see the ads are extremely different from the people who don’t.”
He and fellow Kellogg marketing professor Brett Gordon have long recognized that traditional ad-measurement techniques won’t work because of the confounding effects of online ad targeting. For example, new-car ads may target people who have recently searched online for specific car models or features, which suggests that these consumers are already considering a purchase and makes it difficult to tease apart the impact of the ad itself.
The best way to understand the causal effect of digital ads is through randomized controlled trials (RCTs), in which a randomly selected group of consumers is shown an ad and compared with a randomly selected control group that is not. The difference in outcomes between the groups reflects the ad’s impact. However, RCTs are expensive to run at scale, because for every ad advertisers must hold large numbers of potential buyers out of the campaign by placing them in control groups. “You can lose a significant share of the target audience in control groups,” says Zettelmeyer.
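To make the comparison concrete, here is a minimal sketch, in Python, of how an RCT’s treatment-versus-control contrast yields an estimate of an ad’s incremental effect. This illustrates the general method, not code from the study, and all numbers are invented.

```python
# Estimate an ad's causal effect from an RCT: compare the conversion rate
# of a randomly assigned treatment group (shown the ad) with that of a
# randomly assigned control group (held out from the ad).

def rct_lift(treated_conversions: int, treated_size: int,
             control_conversions: int, control_size: int) -> float:
    """Incremental conversion rate attributable to the ad."""
    treated_rate = treated_conversions / treated_size
    control_rate = control_conversions / control_size
    return treated_rate - control_rate

# Hypothetical campaign: 1.2% of exposed users convert vs. 0.9% of held-out users.
lift = rct_lift(12_000, 1_000_000, 9_000, 1_000_000)
print(f"Incremental conversion rate: {lift:.2%}")  # 0.30%
```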
In a recent study, Gordon, Zettelmeyer, and coauthor Robert Moakler of Meta devised a possible solution to this problem. They designed and tested a model that predicts the causal impact of an ad from a small number of RCTs, combining those results with data on industry-standard measures such as the number of last-click conversions. This lets advertisers predict the incremental impact of campaigns that are not run as RCTs.
“Our model allows advertisers to use the data they already have to predict, for each campaign, how well the campaign would do in terms of its actual causal effect,” says Zettelmeyer.
Predicting advertising impact
It turns out that just a small number of RCTs can get advertisers pretty far in estimating the return on ad dollars.
Gordon, Zettelmeyer, and Moakler reached this conclusion by studying data from nearly 1,000 RCTs run on a random subset of Facebook ads between November 2019 and March 2020, each targeting at least 1 million users. The advertised products spanned retail, financial services, consumer goods, and other sectors. The researchers focused on conversions along the sales funnel, measuring outcomes such as visiting an advertiser’s website, adding a product to a digital cart, or making a purchase.
The researchers then took this RCT data on ad effectiveness—the most reliable estimate of conversions driven by an ad—and compared it to the effectiveness estimates generated for those same ads by leading proxy metrics such as last click. For ads on a single platform, such as Meta, a last-click metric counts the purchases or other outcome events that occur within a specified window, such as a day or a week, after a user’s last exposure to an ad on that platform.
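As a rough illustration of how such a proxy metric might be computed, the sketch below counts last-click conversions under an assumed one-day attribution window; the event structure and window length are assumptions for illustration, not details from the paper.

```python
# A simplified last-click counter for a single platform: a conversion is
# credited to the campaign if it occurs within a fixed attribution window
# after the user's most recent ad exposure on that platform.
from datetime import datetime, timedelta

ATTRIBUTION_WINDOW = timedelta(days=1)  # assumed window length

def last_click_conversions(exposures: dict, conversions: dict) -> int:
    """exposures / conversions map user_id -> list of datetime events."""
    credited = 0
    for user, conversion_times in conversions.items():
        seen = sorted(exposures.get(user, []))
        for t in conversion_times:
            prior = [e for e in seen if e <= t]
            if prior and t - prior[-1] <= ATTRIBUTION_WINDOW:
                credited += 1
    return credited

# Toy data: one user exposed at noon converts the same evening.
exposures = {"u1": [datetime(2020, 3, 1, 12)]}
conversions = {"u1": [datetime(2020, 3, 1, 20)]}
print(last_click_conversions(exposures, conversions))  # 1
```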
Their aim was to calibrate the less reliable effectiveness estimates given by proxy measures against the far more reliable RCT results. If RCT results and proxy measures correlate in predictable ways, advertisers can obtain highly accurate assessments of ad effectiveness without having to run an RCT on every ad.
And that’s exactly what they discovered. The model showed that measures such as last click generally over- or underestimated effectiveness, but in a predictable way. So although these proxy measures do not have much predictive value on their own, they become very useful once calibrated against RCT results, even with simple statistical techniques.
This finding allows advertisers and platforms to measure the effectiveness of ads using only a small number of RCTs: the RCTs generate a calibration factor, which is then applied to the estimates produced by last click and other proxy measures.
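A minimal sketch of that calibration step, assuming a single multiplicative factor estimated by least squares across a handful of experimental campaigns (the paper’s actual model may be richer), with invented figures:

```python
# Calibrate a cheap proxy metric (last-click conversions) against RCT-measured
# incremental conversions, then apply the factor to campaigns without an RCT.
import numpy as np

# Five hypothetical campaigns with both measurements available.
last_click = np.array([2000.0, 3500.0, 1200.0, 5000.0, 2600.0])
rct_incremental = np.array([900.0, 1700.0, 500.0, 2300.0, 1250.0])

# Least squares through the origin yields one calibration factor.
factor = (last_click @ rct_incremental) / (last_click @ last_click)
print(f"Calibration factor: {factor:.2f}")  # here last click overstates ~2x

# Predict incrementality for a new campaign measured only by last click.
print(f"Predicted incremental conversions: {factor * 4100:.0f}")
```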
Gordon points out that many advertisers have already thought along these lines: “They might run an RCT and estimate an incremental effect, but then their last-click metric shows it’s double. … So they take all their last-click measures and divide them by two, even when they’re not running an experiment. But it’s not systematic or rigorous.” The approach formulated by the researchers provides the required rigor.
A convenient, cost-effective approach
The good news for advertisers and platforms is that most already have usable data from proxy metrics like last click. This means they may not need to spend a lot on RCTs to make impact predictions at scale.
“Last click metrics and other proxy metrics are simple and easy to use for most advertisers,” says Gordon. “They have very low data or analytics requirements, and even when privacy policies change, they will usually retain some access to those metrics.”
So how many RCTs do you need to provide good predictions of advertising impact?
It depends on whether you are an advertising platform or an advertiser. “For platforms, we would suggest running thousands of RCTs and using a machine learning model to make predictions. And if you’re an advertiser, then we suggest you run at least a handful of RCTs and then see if you can create a very simple calibration factor between the RCT result and the last click result,” says Zettelmeyer.
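As a speculative sketch of the platform-scale version, the snippet below trains a standard machine-learning regressor on synthetic data standing in for thousands of historical RCTs; the features, model choice, and numbers are all assumptions rather than the researchers’ actual setup.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
n = 2000  # stand-in for thousands of historical RCTs

# Proxy features per campaign (synthetic): last-click conversions,
# click-through rate, and targeted-audience size.
X = np.column_stack([
    rng.gamma(2.0, 1500.0, n),
    rng.uniform(0.001, 0.03, n),
    rng.uniform(1e6, 1e7, n),
])
# Synthetic ground truth: RCT lift loosely proportional to last click.
y = 0.5 * X[:, 0] + rng.normal(0.0, 100.0, n)

model = GradientBoostingRegressor(random_state=0).fit(X, y)

# Predict incremental conversions for a campaign run without an RCT.
print(model.predict(np.array([[3200.0, 0.012, 2.5e6]])))
```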
Similarly, a consortium of advertisers could pool RCT and last-click results from multiple ad campaigns to create a common predictive model that would provide calibration factors to apply to proxy measures of ad effectiveness.
As Zettelmeyer concludes, “We want advertisers to know that to improve their marketing metrics, they don’t need to run randomized experiments all the time, but they should run them at some point. It won’t provide the same accuracy as if they ran an experiment on everything, but [it gets you much of the way there] with much less effort and cost.”