The film’s success was another win for Netflix’s strategy of using its recommendation system to offer a wide range of content to potentially interested viewers. At best, movies and shows like K-Pop Demon Hunters, Squid Game, or Stranger Things become must-see entertainment, driving a wave of new subscriptions.
But it’s hard to say what made these particular movies and shows click with viewers. Was it the charm of the content itself? Or was it the boost provided by Netflix’s personalization algorithm?
It’s a tough question, and one that Guy Aridor, an assistant professor of marketing at Kellogg, tackles with Netflix co-authors Kevin Zielnicki, Aurelien Bibaut, Allen Tran, Winston Chou, and Nathan Kallus. In a working paper, they present a model that decouples the influence of the platform’s recommendation system from the underlying appeal of the content itself.
In addition to helping Netflix determine how many additional viewers different shows and movies are attracting, the model also offers a data-driven perspective on which types of content benefit most from recommendation systems.
“What we’re finding is that the mid-tier titles that are moderately popular but have a very strong niche following are the ones that really benefit from recommendations,” says Aridor.
The power of recommendation
Netflix regularly boasts that it has the largest streaming catalog in the industry. But that volume isn’t worth much if subscribers can’t find what they want to watch.
“You can’t produce and acquire a lot of titles and make them stick unless you can target them effectively,” says Aridor. “A lot of titles, if they’re not recommended to the right people, probably won’t get a lot of viewers.”
So Netflix has invested heavily in its recommendation system. In 2006, the company offered a $1 million prize to the first team that could improve its recommendations by 10 percent. A team of computer scientists claimed the prize in 2009 by adapting a mathematical approach called matrix factorization.
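The idea behind matrix factorization can be shown with a small sketch. This toy example is not the Prize-winning system (which blended many models); it simply factors a user-by-title ratings matrix into low-dimensional user and title factors whose product fills in the missing ratings. The matrix, dimensions, and learning rates below are all illustrative choices.

```python
import numpy as np

# Toy ratings matrix (users x titles); 0 marks an unobserved rating.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)

def factorize(R, k=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Learn user factors U and title factors V so that U @ V.T
    approximates the observed entries of R, via gradient descent
    with a small L2 penalty."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, k))
    V = rng.normal(scale=0.1, size=(n_items, k))
    mask = R > 0
    for _ in range(steps):
        err = mask * (R - U @ V.T)       # error only on observed cells
        U += lr * (err @ V - reg * U)    # gradient step for user factors
        V += lr * (err.T @ U - reg * V)  # gradient step for title factors
    return U, V

U, V = factorize(R)
pred = U @ V.T  # predicted ratings, including the previously missing cells
```

The payoff is the filled-in cells of `pred`: each user gets a predicted rating for titles they never rated, which can then be sorted into a personalized recommendation list.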
Netflix didn’t stop there. Using data from its more than 300 million subscribers, the company has continuously improved its system so that each user sees a personalized content menu every time they open the app.
But that success complicates other parts of Netflix’s business. It is difficult to disentangle the viewership driven by this targeting from the quality of the content itself. Do people watch a show just because Netflix recommended it, or would they seek it out regardless of the algorithm?
The subscription-based model further muddies the waters. Competitors like Amazon and Apple can measure the value of a given movie based on rentals or sales at different prices. But Netflix customers pay a flat fee to access the entire catalog, so the company can’t easily judge the incremental value of each individual title.
Measuring value in the attention economy
Aridor and colleagues tackled this problem by creating a new model for content valuation. The model enables them to simulate hypothetical scenarios that answer the questions Netflix cares about.
For example, if the current recommendation system were replaced with a random selection of content, or with just the most popular shows and movies, how would user engagement change? If a viewer who would probably enjoy Emily in Paris never found it, what would they choose to watch instead? Would they just close the app and do something else?
The researchers were able to build this model because of the deliberate randomization in Netflix’s algorithm. To better learn user preferences, the company regularly runs subtle experiments that randomize different elements of the Netflix home page. The researchers used what viewers chose to do under these different conditions to train and validate their model.
“This is a general problem in the so-called attention economy,” says Aridor. “There is little or no price variation, so inducing variation in the set of titles displayed teaches us something about how people substitute between products.”
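The logic in that quote can be sketched with simulated data: by randomizing whether a title appears on the home page, a platform can observe where that title’s would-be viewers go when it is absent. Everything below, including the preference shares and the 60/40 substitution split, is invented for illustration and is not Netflix data.

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_session(show_A):
    """One viewer session. When title A is hidden, its would-be viewers
    split between watching B and leaving the platform entirely; that
    split is exactly what the randomized exposure lets us estimate."""
    r = rng.random()
    if r < 0.5:                  # would pick A if it were shown
        if show_A:
            return "A"
        return "B" if rng.random() < 0.6 else "outside"
    elif r < 0.8:                # prefers B regardless
        return "B"
    return "outside"             # would do something else anyway

def shares(show_A, n=100_000):
    """Empirical choice shares under one experimental condition."""
    picks = [simulate_session(show_A) for _ in range(n)]
    return {k: picks.count(k) / n for k in ("A", "B", "outside")}

with_A = shares(True)            # condition 1: A is on the home page
without_A = shares(False)        # condition 2: A is hidden (randomized)

# Comparing the two arms reveals the substitution pattern:
diverted = without_A["B"] - with_A["B"]        # A's viewers who switch to B
lost = without_A["outside"] - with_A["outside"]  # A's viewers who leave
```

With no prices to vary, this difference-in-shares across randomized home pages plays the role that price variation plays in ordinary demand estimation.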
Surfacing the middle tier
Once the model was built, the researchers could use it to measure incremental viewership: how much each individual show or movie increases overall Netflix engagement. They could even estimate the value of the recommender system itself compared with alternative algorithms.
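One way to make “incremental viewership” concrete is a discrete-choice sketch: if a title vanished from the catalog, how many of its viewers would choose the outside option (close the app) rather than substitute to another title? The multinomial-logit form and the utility numbers below are illustrative assumptions, not the working paper’s actual specification.

```python
import numpy as np

# Hypothetical appeal ("utility") of three titles; the outside option
# of doing something else entirely is normalized to utility 0.
utilities = {"TitleA": 2.0, "TitleB": 0.5, "TitleC": 0.4}

def choice_probs(utils):
    """Multinomial-logit choice probabilities over the titles
    plus the outside option."""
    names = list(utils) + ["outside"]
    expu = np.exp(np.array(list(utils.values()) + [0.0]))
    return dict(zip(names, expu / expu.sum()))

def incremental_viewership(utils, title):
    """Engagement that would be lost if `title` disappeared: the extra
    probability mass that flows to the outside option rather than
    being absorbed by substitute titles."""
    base = choice_probs(utils)
    without = choice_probs({k: v for k, v in utils.items() if k != title})
    return without["outside"] - base["outside"]

for t in utilities:
    print(t, round(incremental_viewership(utilities, t), 3))
```

In this framing, a title whose viewers would mostly substitute to other titles contributes little incremental engagement, however large its raw viewing numbers are.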
As expected, Netflix’s current recommender outperformed alternatives such as random suggestions, showing only the most popular content, and a matrix-factorization system similar to the 2009 Netflix Prize winner.
But the current system also performed better on another measure that Netflix values: increasing content diversity, or the overall variety of shows and movies users watch.
“Research from other streaming platforms shows that more diversified consumption is strongly associated with good long-term outcomes in terms of consumer satisfaction,” says Aridor. “So it’s important that the recommendation system doesn’t just encourage people to watch the same types of titles.”
The model also revealed that proven hits like Emily in Paris and Stranger Things don’t need much additional promotion, and that obscure shows and movies don’t connect outside of a very specific audience. Instead, it’s the shows and movies in between that benefit most from the recommendation system.
“That’s really where the bread and butter of the recommendation system and the Netflix business model go hand in hand: you can only really have that kind of deep catalog if you can effectively match viewers to those titles,” Aridor says.
Cultural influences
Beyond its utility for Netflix, the findings contribute to a broader debate about media and recommendation systems: Do they force audiences to cluster around a smaller segment of entertainment, or do they highlight art that would otherwise be overlooked?
“Maybe older recommender systems like the ones from 10 years ago that were deployed on these types of platforms had this clustering problem,” says Aridor. “But in terms of diversity, the current system is doing much better.”
And while the researchers’ new model may help Netflix make decisions about what content to add, it’s unlikely to capture the organic, off-platform factors that turn a consistent performer into the next cultural juggernaut.
“When a medium-sized title is recommended to the right set of people, then, sometimes, that audience is big enough to talk about it online and have conversations with each other, and then it snowballs,” says Aridor. “But it’s very rare and very difficult to predict.”
