But could managers use A/B testing in the service of something very different – like designing the best way to motivate their employees?
That’s what two researchers at the Kellogg School of Management believe.
In a new study, George Georgiadis and Michael Powell, both associate professors of strategy, develop a model that shows how organizations can use A/B testing to find more effective ways to structure performance incentives. They determine that even a single A/B test can provide a surprising amount of information about how employees will respond to a range of motivation strategies. And they offer a framework for using A/B testing data to maximum advantage.
“We want to understand: If you have this kind of data, how can you use it to improve your employee incentive plans? How far can you go with a simple A/B test?” says Georgiadis. “In the first place, to get the ‘optimal’ incentive design in place, you would need an infinite number of experiments. What we’re arguing is that with an experiment, you can really go a long way.”
This is important, he explains, because employers are understandably very reluctant to experiment with incentive programs, as they don’t want to risk upsetting employees.
“If we’re talking about changing the way we pay people, we don’t want to do a lot of these experiments,” Powell says. “If you’re working on a website and you’re trying to figure out what color to make your button, it’s a lot easier to do a lot more testing.”
The right motivations
Organizations rely on a wide range of incentive programs to motivate their employees to work hard.
Some programs are pretty basic: think of an employee who receives a base salary plus a bonus if a certain sales goal is reached, or a transcriber who is paid based on the number of documents completed. Other systems are much more complex and may include tools such as profit sharing or restricted stock.
But all of these involve critical decisions — with critical trade-offs. For example, should this bonus be easily obtainable but modest? Or hard to get, but very lucrative? Some employees may find the latter option more motivating; others, just disappointing. And what about the base salary? Too high relative to the bonus, and it can encourage complacency; too low, and workers who prefer stability may balk.
Moreover, depending on the nature of the work, as well as the individual preferences of the employees, an incentive system that works well in one organization may fail miserably in another. This means that, practically speaking, one of the only ways for managers to know whether there is a better incentive plan for their organization is to modify their existing scheme for a limited time—perhaps only in one part of the organization—and then see what actually happens to performance.
So Georgiadis and Powell set out to determine how much employers could learn from a single adjustment.
The researchers created a mathematical model to analyze the interactions between an employer and its employees. The employer has an existing incentive system and collects data on how productive employees are under it. The employer then modifies the incentive system in some arbitrary way—perhaps by lowering the threshold for receiving a bonus or increasing piece-rate pay—for some or all of its employees, and collects data on how productive the employees are under that contract.
The researchers then investigated how well the data generated by such an A/B test could be used to create a new—and more effective—incentive contract.
“Let’s say we find a way to make employees work a little harder” under a new set of contract terms, Powell says. “We can see that, on average, this change in compensation increased output or productivity by a certain amount. But it turns out that there is a lot more information contained in this experiment. That is, we know what happened not only to the average production, but what happened to the probability of low production and high production. This is very informative.”
Importantly, an employer can use the data on the distribution of employee responses to predict how employee productivity—and by extension, the employer’s profits—will change given any change in the contract.
How so? By looking at the distribution of outputs under the two contracts, employers can learn whether the increase in average output comes from workers becoming less likely to slack off or from workers becoming more likely to work hard. The difference sounds subtle, but it has real consequences for contract design.
If productivity rises because workers slack off less in a particular environment, “then what that tells us is that we would want to penalize low performance,” Powell says. So, for example, employers could pay workers less per task if they completed an unusually low number of tasks. Or employers could offer a very low base salary with the possibility of earning a bonus if employees are moderately, but acceptably, productive.
On the other hand, if productivity rises because workers become more likely to work hard, this suggests that employers should “pay people more when high performance is achieved,” Powell says.
In practice, this could mean paying workers more per job if they perform an ambitiously high number of tasks, or offering an average base salary with the possibility of earning bonuses only if workers are highly productive.
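The distinction above can be sketched with a toy calculation. The numbers and the function below are hypothetical illustrations, not taken from the study: the idea is simply to compare where the probability mass in the output distribution moved between the two arms of an A/B test.

```python
def diagnose_shift(p_old, p_new):
    """Given output distributions from the two arms of an A/B test
    (probabilities over low/medium/high output), report whether the
    productivity gain came mostly from fewer low outputs or from
    more high outputs."""
    drop_in_low = p_old[0] - p_new[0]     # workers slacking off less
    rise_in_high = p_new[-1] - p_old[-1]  # workers pushing for high output
    if drop_in_low > rise_in_high:
        return "penalize low performance"
    return "reward high performance"

# Hypothetical A/B data: P(output = low, medium, high) under each contract.
baseline = [0.40, 0.45, 0.15]
variant = [0.25, 0.58, 0.17]  # average output rose mostly because P(low) fell

print(diagnose_shift(baseline, variant))  # -> penalize low performance
```

In this example the drop in low output (0.15) dominates the rise in high output (0.02), so, following Powell’s reasoning above, the data point toward contracts that penalize low performance rather than ones that lavishly reward high performance.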
Put it to the test
To test their model against productivity data from real people, the researchers turned to previously published data from an experiment in which participants completed a simple online task under six different payment schemes.
They wanted to understand how well their model could use real performance data from any two payment systems to predict the performance of participants in another, completely different scheme.
The model was able to predict performance under other incentive contracts with a high degree of accuracy. “On average, the gap between predicted and actual productivity is just under 2 percent,” says Georgiadis. “Mike and I were amazed at how accurate the predictions are.”
The researchers also used real productivity data to test their model’s ability to design a better contract. They asked: How close would this contract come to being optimal?
They found that, on average, using data from any two contracts would allow an employer to construct a third contract that earned just over two-thirds of the profits it would have earned with a truly optimal contract.
“You don’t make the ‘optimal’ contract because you don’t have all the information,” says Georgiadis. However, “in this online experiment, a single A/B test can get you two-thirds of the way to optimality.”
Benefits of A/B testing
Powell and Georgiadis’ framework has a number of advantages that make it practical for organizations to use. First, unlike much previous economic research on motivation, it does not require the employer to understand anything about their employees’ preferences in advance, such as how much they dislike working at a faster pace. It also doesn’t require them to fully understand how much effort goes into being more productive in a given work environment.
“What we’re arguing is that if you’re willing to do an A/B test, you don’t need to know that much,” Powell says. “You just watch how they react.”
Their approach can be applied to organizations of different sizes, although organizations that can run a larger experiment, which will generate more data points, will learn more from their test.
Another plus is that the researchers’ article includes all the steps an organization would need to take to actually use the data from an A/B test to create a near-optimal incentive scheme. That in itself is key, because the process is hardly obvious.
The researchers point out that this work on A/B testing was originally inspired by students in their organizational strategy class. “I used to teach basic principles of incentive theory, and we always got the question, ‘Okay, but what should I literally do? My parents own a factory and their workers are paid piece rates. How should we change the piece rate?’ And the existing tools weren’t well suited to answer that question,” says Powell.
This tool is. And it even has one last advantage: familiarity.
“Companies today use experimentation for a variety of purposes,” says Georgiadis. “We find that it can be very useful for incentive design as well.”