
Only the first question should be answered.

1. The field experiment reduced Nocturnal’s advertising budget, but it could have been designed to maintain overall advertising levels. Design a field experiment to test return on ad spending without reducing the total ad budget. To simplify your task, design the field experiment to test only two ad channels—Google Prospecting and Facebook Prospecting.

2. Assume Nocturnal’s budget allocation was $4 million per month—$1 million for each channel. How would you allocate that $4 million across the four channels in the future?

3. Nocturnal is eager to conduct additional field experiments to learn more about its advertising. What would you advise them to test next? (You can test any aspect of its digital advertising on these four channels or other digital channels, including ad budget, ad copy, ad settings, and more.)

4. A mattress is a high-involvement purchase with a long purchase cycle. Consider a product with a shorter purchase cycle, like shoes or jewelry. Do you believe a similar field experiment on this kind of product would yield similar results?


Advertising Attribution through Experimentation Case

As head of analytics at Nocturnal[1], the bed-in-a-box brand, Scott Clark was familiar with the many logistical challenges of selling, manufacturing, and shipping large mattresses across the country. He himself had run the numbers showing that their new advertising spot, which had produced a 20% sales increase, was also producing a 50% increase in expensive returns, and thus should be stopped. Scott made it his business to ensure that analytics were used not just to track advertising and sales, but also logistics and manufacturing expenses. So he wasn’t surprised when Nocturnal’s CEO announced at the end of the Q3 board meeting that Nocturnal needed to reduce advertising expenditures to enable manufacturing to catch up with their growing sales. Nocturnal’s manufacturing typically worked at a 6-day lag to the orders it was filling. But over the last several months, that lag had grown to 12 days, and the longer delay between order placement and order fulfillment was starting to affect customer reviews. The manufacturing facility was already running at full capacity, so there was no room to accelerate manufacturing. Nocturnal’s only option was to slow sales.

[1] Nocturnal is a pseudonym for a real company. All events and people in the case are real.

The prospect of a purposeful slowdown in sales gave the marketing department nightmares, because it was anathema to their entire reason for being, but Scott Clark immediately saw an opportunity. From his attendance at a variety of academic conferences on business analytics, Scott had learned about the problems of advertising attribution from observational data. Scott knew that, despite the reams of data he could see from the many advertising channels Nocturnal used, none of that data would give him an accurate report of the true return on advertising spending (ROAS) that Nocturnal had achieved. If a company wanted an accurate measurement of ROAS, it needed to conduct a carefully crafted field experiment. Companies were reluctant to engage in such an experiment, because doing so would likely result in foregone sales. The current need to reduce Nocturnal’s pace of sales provided a perfect opportunity to conduct just such a field experiment and thereby get an accurate measurement of ROAS. But Scott needed some help to craft the experiment.

At the conclusion of the meeting, Scott set up a conference call with Jeff Dotson and Jeff Larson, marketing professors at nearby Brigham Young University. Scott had met the professors at some of the academic conferences he attended, and they had expressed an interest in conducting academic research with Nocturnal. When Scott explained the situation and his desire to leverage Nocturnal’s unique situation for maximum learning, the professors immediately got to work designing a field experiment to measure Nocturnal’s ROAS on multiple advertising channels.

Experiment Design. The digital marketing world is no stranger to field experiments. An advertiser on virtually any digital advertising platform can simply create two (or more) versions of any ad and thereby initiate a field experiment of the A/B testing variety to reliably determine which ad produces higher clicks and sales. Platforms like Optimizely and Google Optimize allow users to conduct A/B test field experiments on the design of their landing pages and web pages. Given these well-established field experiment capabilities, not to mention the entrenched culture of A/B testing among digital marketers, it seems reasonable to expect that conducting a field experiment to measure ROAS would be equally simple and straightforward. Unfortunately, the reality is more complicated.

One of the reasons A/B tests on ad copy can be conducted so easily and reliably is that the random assignment of people to condition (version A or version B of the ad) is easy and reliable. Every appearance of an ad is an isolated instance of the experiment. From that instance, the platform need only record which ad was shown, whether that ad impression resulted in a click, and whether it resulted in a purchase. A field experiment to measure ROAS is much more complex, for multiple reasons. First, assigning conditions to individuals cannot be accomplished reliably, because most individuals access the internet from multiple devices. If an individual using his desktop computer is assigned to a no-advertising condition, his mobile device might be assigned to a high-advertising condition, resulting in condition cross-contamination. Second, if an individual is exposed to ads on one device but makes a purchase on another device, the measurement of the results becomes unreliable. Third, the algorithms controlling ad exposures on an ad network are not randomized. Even if an individual is randomly assigned to a particular ad condition, his subsequent clicking on an ad or failure to click on an ad will affect whether the algorithm shows him more of that advertiser’s ads, thereby interfering with the randomized assignment. For all these reasons and more, measuring ROAS via a field experiment is not as simple as instructing an ad platform to randomly assign individuals to a high- or low-advertising condition.

Because of these difficulties in assigning individuals to field experiment conditions, many field experiments are conducted by assigning geographic regions to a condition. Nielsen, the analytics firm, has divided the United States into 210 Designated Market Areas (DMAs). The boundaries of these DMAs were chosen strategically to minimize commingling across regions. Very few people spend half of their day in one DMA and half in another, because each major metropolitan area is fully contained within a single DMA. If someone conducts a field experiment in which the DMA for Phoenix, AZ is assigned to one condition and the DMA for Tucson, AZ is assigned to a separate condition, almost no one will receive a mixed treatment, because very few people make regular commutes between these neighboring regions.

Conducting field experiments using these DMAs has two important advantages. The first was already mentioned—very little commingling across the regions means very little condition contamination. If condition assignment in a field experiment is unreliable, then the entire field experiment is unreliable. The second advantage was especially important for the field experiment that the professors wanted to conduct—the Nielsen DMAs can be utilized on multiple advertising platforms. If Facebook draws region boundaries differently than Google does, a single field experiment on both of these platforms would be unreliable, because condition assignments would not align across the two platforms. But because both platforms utilize Nielsen DMAs, a single field experiment can include both advertising platforms.

Of course, using geography rather than individuals as the basis of field experiment conditions also has disadvantages. The primary disadvantage of geography-based condition assignment is that measurement of the experiment results is much less granular. Instead of examining the individual purchase histories of millions of individuals to measure how much advertising exposure influenced them, the experiment must measure whether the higher advertising exposure in an entire region influenced the overall region’s volume of purchases. To find reliable effects in a geography-based field experiment, the volume of advertising must be extremely high. Fortunately, Nocturnal’s typical advertising volume was high enough that the professors were confident that the field experiment would yield useful results.

After consulting with Scott Clark about the learnings that would be most beneficial to Nocturnal, and after reviewing Nocturnal’s current advertising outlays across dozens of digital and non-digital ad channels, Professors Larson and Dotson designed the field experiment to focus on measuring the following advertising effects.

(1) Facebook and Google. These two ad platforms together account for the majority of online ad expenditures in the United States, both for Nocturnal and for the market in general. Because of the structure of ad and purchase tracking on these platforms, Facebook and Google increasingly claimed credit for the same purchases. Learning which platform was truly causal in producing Nocturnal’s sales would greatly benefit Nocturnal.

(2) Prospecting versus Retargeting. Prospecting ads are typically shown to a broad swath of internet users, while retargeting ads are shown only to users who have visited Nocturnal’s website on some past occasion. Because the purchase cycle for a new mattress is long, users might make several visits to the website before deciding whether to purchase a Nocturnal mattress. Nocturnal wanted to ensure that they did not lose these potential purchases to competitors, and thus advertised intensively to them. Nocturnal wanted to know whether this intensive investment in retargeting ads was worthwhile.

(3) Advertising Synergies and Saturation. If someone sees multiple Nocturnal ads on both Facebook and Google, what is the effect of this multi-source ad exposure? Do these ads have less effect because they more quickly result in over-exposure? Or does the same message on multiple platforms cause people to take notice, and thus increase their purchase interest? On another dimension, since the retargeting pool is generated from prospecting ads, does increased prospecting produce better results from retargeting?

These three considerations led the professors to design a field experiment focused on four channels: Google Prospecting, Google Retargeting, Facebook Prospecting, and Facebook Retargeting. By assigning DMAs to differing advertising intensities on these four channels, the field experiment could obtain estimates of the effectiveness of both Google and Facebook, of both prospecting and retargeting, and of any synergies between any two of those channels. The conditions of the experiment were determined by a factorial design.

In a factorial design, experimental conditions are assigned in such a way that the effect of each factor can be measured independently of the other factors. For example, consider an experiment examining only Facebook Prospecting and Google Prospecting. If Los Angeles, CA were assigned high advertising on both channels and New York, NY were assigned low advertising on both channels, we would expect to see an increase in sales in Los Angeles and a decrease in sales in New York, but we would not know which channel, Facebook or Google, was responsible for the difference. A factorial design might also assign Chicago, IL with high Google advertising and low Facebook advertising; it might then assign Philadelphia, PA with low Google advertising and high Facebook advertising. Based on the pattern of sales increases and decreases in these four regions, the experiment would indicate the ROAS produced by each channel.
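
The sketch below builds the full factorial for the two-channel illustration above. Everything in it—the DMA pairings, the two levels, the assignment order—is illustrative rather than taken from the actual study, which used 19 conditions across four channels.

```python
# A minimal sketch of a 2x2 factorial assignment for two channels.
# All names and levels are illustrative, not the actual study design.
from itertools import product

channels = ["Google Prospecting", "Facebook Prospecting"]
levels = [0.0, 1.0]  # 0% and 100% of the usual ad budget

# Full factorial: every combination of levels across the channels.
conditions = list(product(levels, repeat=len(channels)))
# -> [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]

# Pairings chosen to match the example in the text.
dmas = ["New York", "Philadelphia", "Chicago", "Los Angeles"]
for dma, condition in zip(dmas, conditions):
    print(dma, dict(zip(channels, condition)))
```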

The proposed experiment design is shown in Table 1. The numbers shown in the table represent the percentage of the region’s typical advertising budget. For example, regions assigned to condition 1 would have their advertising budgets reduced to 0 for the duration of the experiment. Regions assigned to condition 2 would conduct no Google Retargeting ads or Facebook Prospecting ads, and their ads for Google Prospecting and Facebook Retargeting would be operating at 50% of their usual levels.

Both of the retargeting channels employed four advertising levels: 0%, 50%, 100%, and 150%. The prospecting channels employed only three levels: 0%, 50%, and 100%. The original experiment design proposed utilizing all four advertising levels in the prospecting channels as well, but the marketing team believed that increasing the budget that high would degrade the quality of the advertising targets. Therefore, the 150% condition was removed from the Prospecting channels and the experiment design was redone (to the design shown in Table 1).

The 19 experiment conditions were assigned to the 210 Nielsen DMAs in tranches. The 19 most populous DMAs were each randomly assigned to one of the 19 experiment conditions. The next 19 most populous DMAs were then each randomly assigned to one of the 19 experiment conditions, and so on. This randomization ensured that each condition would be applied to a population of sufficient size to obtain reliable measurement. Twelve DMAs were excluded from the experiment because their sales relative to population were excessively high; those twelve DMAs would not provide a fair comparison with the other regions.
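
A minimal sketch of that tranche-based randomization, assuming the eligible DMAs are already sorted from most to least populous (the DMA names here are placeholders):

```python
# Tranche-based random assignment: within each successive group of 19 DMAs,
# shuffle the 19 conditions. DMA names are placeholders.
import random

N_CONDITIONS = 19
dmas_by_population = [f"DMA_{i:03d}" for i in range(198)]  # 210 minus 12 excluded

assignment = {}
for start in range(0, len(dmas_by_population), N_CONDITIONS):
    tranche = dmas_by_population[start:start + N_CONDITIONS]
    conditions = random.sample(range(1, N_CONDITIONS + 1), len(tranche))
    assignment.update(zip(tranche, conditions))
```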

Resistance. As head of analytics, Scott Clark did not have the authority to implement the experiment without support from Nocturnal’s C-Suite. While the CEO was supportive of the experiment in concept, there was concern that if he heard too many voices of dissent, he might be convinced that the experiment was too risky. Several Nocturnal stakeholders seemed opposed to the experiment.

Perhaps the loudest dissenting voice came from outside of Nocturnal. Nocturnal’s ad implementation was outsourced to an agency, and the agency had many reasons to be resistant. First, their pay was determined as a percentage of ad spend, so the dramatic decrease in advertising expenditures would result in a dramatic cut in their pay. Second, they misunderstood the nature of the experiment and were convinced that the 19 conditions would be too granular for the experiment to definitively identify the best condition. They were familiar with field experiments of the A/B test variety, so they naturally assumed the purpose of the field experiment was to find the best condition. They did not realize that the purpose of the experiment was to establish variation in advertising that could be used to determine the influence of advertising on sales.

Nocturnal also met with resistance from their client manager at Facebook. Implementing the field experiment would require creating new ad sets—the advertisement specifications that instruct where, when, and what advertisements would be shown on Facebook. Performing a hard reset on these ad specifications would cause Nocturnal to lose all of the AI enhancements their account had gained on Facebook. All advertising on Facebook is run through a number of algorithms. Some of the algorithms employ artificial intelligence, or AI, to learn the kinds of Facebook users who are more likely to respond to the advertiser’s ads. After spending years and many millions of dollars advertising on Facebook, Nocturnal was well understood by Facebook’s algorithms, so the algorithms could show ads more prominently to likely customers and less prominently to Facebook users who were less likely to be interested in Nocturnal. Running the field experiment would eliminate this benefit. This downside had no impact on the efficacy of the experiment, but it had the potential to decrease Nocturnal’s ad effectiveness. The CEO approved the experiment despite the potential loss of this benefit. (Research is still being conducted to determine the effect of these AI enhancements. Some researchers argue that the benefit is negligible, because the algorithms tend to focus on showing ads to people who would have converted even without ad exposure. See episodes 440 and 441 of the Freakonomics podcast.)

The marketing team, which was never thrilled about the idea of purposeful reduction in sales in the first place, forecasted dramatic drops in sales in response to the advertising decrease. But surprisingly, they exhibited relatively little resistance, perhaps because they knew the CEO had given his explicit endorsement for the experiment.

Implementation. The experiment was implemented on November 18, 2019 and concluded on January 10, 2020. During this period, Nocturnal spent just over $5.7 million on these four advertising channels, a 39% decrease from the spending that would have occurred in the absence of the field experiment.

Because advertising in these channels is determined stochastically, in real time, it is impossible to guarantee spending at the precise levels specified. A variety of factors, such as user activity volumes and competitors’ bidding behavior, can affect the advertising spending that occurs in any given region. For example, if the Lafayette, IN DMA were assigned $230 in Google Prospecting ad spend during a particular week, Google might display ads worth much less or much more than $230. Thus, creating a unique ad set for each DMA would not guarantee proper experiment implementation, in addition to being needlessly labor-intensive.

Instead, only two ad sets were implemented in the Prospecting channels and three ad sets were implemented in the Retargeting channels. In the Google Prospecting channel, all DMAs assigned to conditions 2, 5, 8, 11, 14, and 17 (conditions with 50% advertising levels for Google Prospecting) were pooled in the same ad set. The expected advertising in the Google Prospecting channel for all of these DMAs was calculated, and half of this number was budgeted to that ad set. In the second Google Prospecting ad set, DMAs assigned to conditions 3, 6, 9, 12, 15, 18, and 19 (conditions with 100% advertising levels for Google Prospecting) were pooled and were assigned the full expected advertising budget. (DMAs assigned to conditions 1, 4, 7, 10, 13, and 16 received no advertising on the Google Prospecting channel, and thus no ad set was implemented for these regions.) Pooling DMAs in this manner resulted in DMAs being assigned the appropriate advertising budget in expectation, but the realized advertising amounts deviated from the assigned ad budget. These deviations from the assigned budget would have resulted even with DMA-level ad sets, so this implementation method did not reduce the viability of the field experiment.
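
As a sketch of that pooling logic, the function below maps each DMA to its Google Prospecting level (using the condition-to-level mapping described above) and sums the level-weighted expected spend into one budget per ad set. The expected-spend inputs are hypothetical.

```python
# Pooled ad-set budgets for the Google Prospecting channel.
# Condition-to-level mapping is from the design described above;
# the expected-spend figures in the example are made up.
GP_LEVEL = {
    **dict.fromkeys([1, 4, 7, 10, 13, 16], 0.0),      # no Google Prospecting ads
    **dict.fromkeys([2, 5, 8, 11, 14, 17], 0.5),      # 50% ad set
    **dict.fromkeys([3, 6, 9, 12, 15, 18, 19], 1.0),  # 100% ad set
}

def pooled_budgets(expected_spend, condition_of_dma):
    """Sum level-weighted expected spend into one budget per ad set."""
    budgets = {0.5: 0.0, 1.0: 0.0}
    for dma, spend in expected_spend.items():
        level = GP_LEVEL[condition_of_dma[dma]]
        if level > 0:
            budgets[level] += level * spend
    return budgets

# Example with made-up numbers:
print(pooled_budgets({"Lafayette": 230, "Boise": 400},
                     {"Lafayette": 2, "Boise": 3}))
# -> {0.5: 115.0, 1.0: 400.0}
```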

Table 2 shows the ratio of ad spending by channel in each condition during the field experiment compared with the ad spending prior to the field experiment. The correlation between the planned and realized advertising spending is greater than .98 for all four ad channels.

Results. During the experiment, sales in the New York DMA were higher than sales in the Washington, DC DMA, but this difference has more to do with population size than with advertising amounts. To measure ROAS, the analysis must account for differences in population size as well as other cross-region differences. The simplest method of controlling for these myriad regional differences is to use past results from the same region. The overall level of sales in New York is driven by advertising, New York’s population size, Nocturnal’s brand awareness in the region, the New York population’s willingness to purchase a mattress online, and other factors. Fortunately, all of these factors except advertising expenditures were essentially unchanged between the pre-experiment period and the experiment. The ratio of during-experiment sales to pre-experiment sales therefore controls for all non-advertising influences on sales.

The analysis used a rather simple ordinary-least-squares regression (the standard regression technique, often referred to as “linear regression”). The dependent variable in the regression was the ratio of during-experiment sales to pre-experiment sales. The independent variables were the same ratios calculated on each advertising channel. That is, the Google Prospecting variable was the ratio of during-experiment spending on Google Prospecting relative to pre-experiment spending on Google Prospecting. In other words, the independent variables were the values shown in Table 2.
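
To make the specification concrete, here is a minimal sketch using statsmodels on simulated data. The column names, the simulated effect sizes, and the noise level are all placeholders; only the general setup—ratio outcomes regressed on ratio ad variables—mirrors the analysis described above.

```python
# OLS regression of sales ratios on ad-spend ratios, on simulated data.
# All column names and simulated coefficients are placeholders.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 198  # number of DMAs in the experiment

# Each column is a during-/pre-experiment ratio (as in Table 2).
df = pd.DataFrame({
    "google_prosp":    rng.choice([0.0, 0.5, 1.0], n),
    "google_retarg":   rng.choice([0.0, 0.5, 1.0, 1.5], n),
    "facebook_prosp":  rng.choice([0.0, 0.5, 1.0], n),
    "facebook_retarg": rng.choice([0.0, 0.5, 1.0, 1.5], n),
})
# Simulated sales ratio: hypothetical prospecting effects plus noise.
df["sales_ratio"] = (0.85
                     + 0.11 * df["google_prosp"]
                     + 0.06 * df["facebook_prosp"]
                     + rng.normal(0, 0.05, n))

model = smf.ols(
    "sales_ratio ~ google_prosp + google_retarg"
    " + facebook_prosp + facebook_retarg",
    data=df,
).fit()
print(model.summary())
```

The same specification, with the orders ratio or the average-order-value ratio swapped in as the dependent variable, corresponds to the additional regressions discussed below.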

The results of this regression are shown in Table 3. The regression coefficients of Google Prospecting and Facebook Prospecting are both statistically significant, while the coefficients of Google Retargeting and Facebook Retargeting are not. These results do not conclusively show that Retargeting has no causal effect on sales—they simply indicate that its effect on sales in this experiment was not strong enough to be distinguished from random noise. The increase in sales in regions with heavier levels of Prospecting ads, however, was larger than what can be attributed to random noise. In other words, the results provide strong evidence that Prospecting ads produced a measurable, causal increase in sales.

Both the dependent variable (ratio of during- and pre-experiment sales) and the independent variables (ratio of during- and pre-experiment advertising) are expressed in percentage terms. Therefore, the regression coefficients represent advertising elasticities. An elasticity expresses the percentage change in one quantity that results from a percentage change in another quantity. For example, a price elasticity of 1 means that a 10% change in price will cause a 10% change in sales. The coefficient on Google Prospecting (.113) is thus an advertising elasticity: a 10% increase in advertising on this channel is expected to produce a 1.13% increase in sales. That number may seem small, but it represents a rather large effect of advertising in both Prospecting channels. An increase of $10,000 to the Google Prospecting ad budget would be expected to boost sales by $30,000, while an increase of $10,000 to the Facebook Prospecting ad budget would be expected to boost sales by $11,400. The larger expected increase from Google Prospecting results partly from its larger elasticity and partly from its lower ad budget relative to Facebook Prospecting. (Because Facebook’s ad budget is larger than Google’s, a $10,000 increase represents a smaller percentage increase in advertising and thus a smaller percentage increase in sales.)
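
As a worked check on that arithmetic, the sketch below recovers a roughly $30,000 lift from the .113 elasticity. The elasticity comes from Table 3, but the budget and sales figures are hypothetical placeholders (the case does not report them), chosen only so the numbers come out consistent with the text.

```python
# Translating an advertising elasticity into a dollar sales effect.
# elasticity is from Table 3; ad_budget and sales are hypothetical.
elasticity = 0.113
ad_budget = 1_000_000   # assumed Google Prospecting spend for the period
sales = 26_500_000      # assumed sales for the same period

extra_spend = 10_000
pct_change_ads = extra_spend / ad_budget        # +1.0%
pct_change_sales = elasticity * pct_change_ads  # +0.113%
sales_lift = pct_change_sales * sales           # ~$30,000
print(f"${sales_lift:,.0f}")
```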

The field experiment provides strong evidence of the causal role of advertising in generating sales (at least in the Prospecting channels), but we can examine its effect in more detail. Advertising can increase sales through one or both of two pathways: (1) it can generate purchases that would not have occurred in the absence of the ads, or (2) it can increase the amount spent by those who would have purchased anyway. Either pathway is plausible for Nocturnal. By informing people about its mattresses and convincing them of their sleep benefits, people who are exposed to Nocturnal’s ads might decide to buy a mattress from them, in keeping with the first pathway. On the other hand, people who are exposed to Nocturnal’s advertising might be convinced to upgrade their planned purchase to one of the more expensive models, or to purchase additional supplemental products (e.g., pillows and sheets).

To investigate these paths, two additional regression models are needed: one to examine the number of orders placed and another to examine the average monetary value of the orders. Tables 4 and 5 present the results of these regressions. Consistent with the regression model presented in Table 3, these regressions utilized the ratio of during- and pre-experiment orders or monetary values as the dependent variables.

The regression results in Table 4 show that advertising produced sales by generating purchases that would not have materialized in the absence of those ads. The regression results in Table 5, on the other hand, show no evidence of any ad channel increasing the monetary value of those orders. An interesting feature of these results is the stronger statistical evidence for the increase in orders as compared to the increase in sales. Because advertising increased sales solely by generating additional purchases, the number of orders placed provides the most direct measure of advertising’s influence. Overall sales combines the number of orders with the order amounts, and that combination makes sales a noisier measure for detecting advertising’s influence.

An additional avenue of investigation of the field experiment was advertising synergies or saturation. Evidence of advertising synergies would be provided by a statistically significant positive interaction coefficient in one of the regression models. For example, if a regression model included an interaction term between Google Prospecting and Google Retargeting, and the coefficient of that interaction term were positive and statistically significant, it would indicate that advertising on the Google Retargeting channel produces sales more effectively when Google Prospecting spend is also high. An examination of all possible interaction terms showed no statistical significance on any of them, thus providing no evidence of advertising synergies.

Evidence of advertising saturation requires another analysis, specifically an analysis of non-linearity in the regression. If Nocturnal is advertising to the point of saturation, then the effect of changing advertising from 0% to 50% should be larger than the effect of changing advertising from 50% to 100%. Non-linearities are easily investigated through the addition of a quadratic (squared) term to the regression equation. None of the quadratic terms provided evidence of advertising saturation.
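
Reusing the DataFrame and imports from the earlier regression sketch, both checks are small changes to the model formula: in statsmodels/patsy syntax, `*` adds an interaction term along with the main effects, and `I(...)` adds a squared term.

```python
# Synergy check: the google_prosp:google_retarg coefficient tests whether
# retargeting works better when prospecting spend is high.
interaction = smf.ols(
    "sales_ratio ~ google_prosp * google_retarg"
    " + facebook_prosp + facebook_retarg",
    data=df,
).fit()

# Saturation check: a significant negative squared term would indicate
# diminishing returns to Google Prospecting spend.
saturation = smf.ols(
    "sales_ratio ~ google_prosp + I(google_prosp ** 2)"
    " + google_retarg + facebook_prosp + facebook_retarg",
    data=df,
).fit()
```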

It is important to recognize that the standard analytics reports from the advertising channels are unlikely to ever produce accurate attribution. Table 6 shows the analytics reporting for one of the regions assigned to condition 19 (which received full advertising budgets in all four channels). According to this analytics report, the advertising returns for retargeting were positive while the returns for prospecting were negative, which stands in stark contrast to the true contributions of these channels. Several factors produce this misleading report. First, many of the conversions produced by the prospecting ads are not credited to the original ad, because there is often a long delay between ad click and purchase. If the customer made the purchase more than one month after the original ad click and navigated to the website directly, the conversion would not be credited to the original ad, even if the table utilized first-click attribution. Second, if the customer clicked on a retargeting ad during the period between original ad click and purchase, the retargeting ad would be credited with the sale, even if that second visit to the website had no influence on the eventual purchase. A retargeting ad can be credited with the sale even when it doesn’t produce a click. For example, if a customer who intended to purchase a Nocturnal mattress happened to check Facebook just prior to purchasing, and a retargeting ad showed up on his feed, the retargeting ad would be credited with producing the sale (even if the customer didn’t click on the ad). Rather than persuading previous website visitors to more seriously consider a purchase of a Nocturnal mattress, retargeting ads are often just being shown to people who were already destined to make a purchase.

Inference. The advertising elasticity on Google Prospecting was higher than the elasticity on Facebook Prospecting, suggesting that Google may be the more cost-effective advertising channel. But the difference between those two regression coefficients is not statistically significant, so the experiment does not strongly indicate the superiority of Google ads over Facebook ads. The experiment does indicate that Nocturnal could stand to increase advertising on Google Prospecting, but probably not on Facebook Prospecting: increasing advertising on Google Prospecting is expected to increase sales much more than a similar increase on Facebook Prospecting would.

Another immediate change in Nocturnal’s advertising plan should likely be a dramatic reduction in Retargeting ad spend. The field experiment provided no indication that advertising on either Retargeting channel increased sales in any way. It is possible that all money spent on Retargeting is wasted and thus Nocturnal could save millions of dollars per year by cutting spending on these channels to 0. However, such a dramatic move would be risky. The field experiment failed to find a positive effect of Retargeting, but this absence of evidence should not be confused with having found positive evidence of no effect. It is possible, probable even, that Retargeting has a real but weaker effect on sales than Prospecting does. Reducing ads on the Retargeting channels to 0 would reduce sales if this is the case. Just how much should Nocturnal decrease their Retargeting ads? Answering this question requires some inference about why Retargeting was so much less effective than Prospecting.

The ineffectiveness of Retargeting runs counter to prevailing industry wisdom. In the paradigmatic display-advertising customer conversion process, a potential customer sees a display ad (static, dynamic, or video) on the side of a website he is perusing or in his Facebook feed, and his interest in the advertised product or service causes him to click on the ad. He navigates to the ad’s landing page, where he sees additional information about the product or service, whereupon he either exits the website, clicks further into the website before exiting, or clicks further into the website and makes a purchase. A large portion of those who exit the website do so not because they are uninterested in making the purchase, but because, for a variety of reasons, they are not ready to purchase at the time of the first visit. Therefore, by targeting these prior website visitors with retargeting ads, so the thinking …
