At their best, self-learning algorithms can be "Win-Win-Win"

Lyft has always used an algorithm to match drivers and passengers, so the team thought they could tweak it to make their plan work for Covid. But it turned out to be much more difficult than expected. “It showed the limit of the system,” says Martin, who is now an assistant professor of operations at the Kellogg School.

The main issue, Martin explains, is that simple algorithms — like matching the nearest driver to a passenger — don’t actually work that well.

It got Martin thinking about how the matching algorithm could be improved, even after ridesharing had recovered from the pandemic. What if the algorithm could teach itself how to better allocate drivers and then make those adjustments in real time?

He and a team from Lyft have accomplished just that. It took more than a year—an eternity at a tech company, Martin says—to create an algorithm that could engage in “reinforcement learning.” And while designing the algorithm was difficult, so was getting company-wide buy-in to attempt it.

After all, with reinforcement learning, “you give away a lot of control,” says Martin. “A machine that can make decisions without telling you? Imagine if you’re making those decisions about work, that’s your bread and butter.”

But the results were worth it: The company started making more money, drivers got more work, and passengers gave more five-star reviews. In addition, the project was named one of six finalists last month for the Franz Edelman Award, the most prestigious award in the field of analytics and operations research. If you’ve taken a Lyft in the last year or two, then this algorithm has helped match you with a driver, and data from your trip has in turn helped improve the algorithm.

In the context of the growing concern about self-learning algorithms (consider ChatGPT), Lyft’s story shows that some of these tools really do improve everyone’s lives, Martin says.

“It’s not always a zero-sum game” that pits winners against losers, he says. “Passengers are happier. Drivers are busier. The platform earns more money. There is literally no downside.”

Because the closest is not always the best

For most people, especially those of us who have had to stand on a rainy corner waiting for a rideshare, it seems logical that dispatching the nearest driver makes the most sense. But this is not always the case.

The problem arises when it’s busy and drivers are scarce, Martin explains. When this happens, the driver closest to a passenger can be quite far away. If you send this driver, they will spend a lot of time “driving empty,” and the passenger is stuck waiting for a long time and may even cancel the ride while the driver is on the way. And, crucially, it means that any new passengers trying to get a ride will have to wait even longer, because available drivers spend so much of their time getting to their next fare that fewer and fewer of them are actually carrying people around.

“It’s like a death spiral for platforms,” says Martin.
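A back-of-the-envelope sketch can make the spiral concrete. It assumes every driver is always either heading to a pickup or carrying a passenger, and the fleet size, trip length, and pickup times below are hypothetical numbers chosen for illustration, not Lyft data.

```python
def rides_per_hour(num_drivers, pickup_minutes, trip_minutes):
    """Rides a fully busy fleet can complete in one hour."""
    # Each ride costs the driver an empty drive to the pickup plus the trip itself.
    cycle_minutes = pickup_minutes + trip_minutes
    return num_drivers * 60 / cycle_minutes


def share_of_time_with_passenger(pickup_minutes, trip_minutes):
    """Fraction of a driver's busy time spent actually carrying someone."""
    return trip_minutes / (pickup_minutes + trip_minutes)


# Hypothetical fleet: 100 drivers, 15-minute trips, increasingly distant pickups.
for pickup in (3, 10, 20):
    capacity = rides_per_hour(100, pickup, 15)
    occupied = share_of_time_with_passenger(pickup, 15)
    print(f"pickup {pickup:>2} min: {capacity:.0f} rides/hour, {occupied:.0%} of time with a passenger")
```

Under these assumptions, the same fleet serves fewer riders every time pickups get longer, which leaves even more riders waiting and pushes the next pickups farther away still.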

The ideal solution, then, would be a matching algorithm that could predict what the situation will be like in the next few minutes. Will a new, closer passenger appear? Will the traffic on a certain road clear up, making the drive faster? If the driver picks someone up, will there be another passenger near the destination who will make the next pickup more efficient?

Essentially, the algorithm should be able to predict what will happen next. And that’s what Martin and the Lyft team were able to teach it to do.

They did this by focusing on the “value” of available drivers at any given time, with that value being an estimate of how much money the driver will earn while working that day. They then trained their algorithm to constantly analyze what was happening in real time in order to train itself to predict what was most likely to happen next.

It’s similar to reinforcement learning algorithms that play chess, Martin says. They are trained on millions and millions of real chess games and then can use that knowledge to predict their opponent’s next move.
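To give a feel for what learning a driver’s “value” from live data might look like, here is a minimal sketch using tabular temporal-difference learning, a textbook reinforcement-learning technique. It is not Lyft’s algorithm: the zone names, fares, learning rate, discount factor, and the `match_score` rule are all assumptions made for illustration.

```python
from collections import defaultdict


class DriverValueEstimator:
    """Toy estimate of a driver's future earnings, keyed by the zone they are in."""

    def __init__(self, learning_rate=0.1, discount=0.95):
        self.learning_rate = learning_rate  # hypothetical step size
        self.discount = discount            # hypothetical weight on future earnings
        self.value = defaultdict(float)     # zone -> estimated future earnings

    def update(self, start_zone, fare, end_zone):
        """TD(0) update after observing one completed trip."""
        target = fare + self.discount * self.value[end_zone]
        error = target - self.value[start_zone]
        self.value[start_zone] += self.learning_rate * error
        return error

    def match_score(self, driver_zone, fare, dropoff_zone):
        """Fare plus the change in the driver's estimated future earnings."""
        return fare + self.discount * self.value[dropoff_zone] - self.value[driver_zone]


# Hypothetical stream of completed trips: (start zone, fare, end zone).
estimator = DriverValueEstimator()
trips = [("airport", 30.0, "downtown"), ("downtown", 10.0, "suburb"), ("suburb", 5.0, "downtown")]
for _ in range(50):
    for start_zone, fare, end_zone in trips:
        estimator.update(start_zone, fare, end_zone)

for zone in ("airport", "downtown", "suburb"):
    print(zone, round(estimator.value[zone], 2))
```

Each completed trip nudges the estimate for the zone where it started, so the values keep adjusting as conditions change. A matcher could then prefer assignments with a higher `match_score` rather than simply the shortest pickup distance.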

The team tested their algorithm by creating experimental times, when Lyft matched drivers and passengers using the reinforcement-learning algorithm, and control times, when the matching was done by Lyft’s regular algorithm.
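One simple way to split time into experimental and control periods is to alternate fixed-length windows, as sketched below. The article does not say how Lyft chose its periods, so the one-hour window, the strict alternation, and the sample metric values are all assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean

WINDOW_MINUTES = 60  # hypothetical window length


def arm_for(timestamp, window_minutes=WINDOW_MINUTES):
    """Alternate windows: even-numbered windows are experimental, odd ones are control."""
    window_index = int(timestamp.timestamp() // (window_minutes * 60))
    return "experimental" if window_index % 2 == 0 else "control"


def compare_arms(observations):
    """Average a metric per arm from (timestamp, metric_value) pairs."""
    by_arm = {"experimental": [], "control": []}
    for timestamp, value in observations:
        by_arm[arm_for(timestamp)].append(value)
    return {arm: mean(values) for arm, values in by_arm.items() if values}


# Hypothetical observations: one metric value (say, a cancellation rate) every 30 minutes.
start = datetime(2023, 1, 1, tzinfo=timezone.utc)
sample = [(start + timedelta(minutes=30 * i), 0.10 + 0.01 * (i % 4)) for i in range(8)]
print(compare_arms(sample))
```

Because the whole marketplace runs on one algorithm during each window, drivers and passengers in the two groups never compete for the same match.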

After more than a year of optimization, they came up with a new algorithm that outperformed the old one on all important measures. It produced more than $30 million a year in increased revenue for the company, along with a corresponding increase in driver earnings. Passengers were 3 percent less likely to cancel a ride request and there were 13 percent fewer ride requests that resulted in no driver being available. At the same time, five-star passenger reviews also increased.

“There weren’t more people using Lyft,” Martin says. “The improvement comes from drivers being better used.”

Beyond math

Their success is the first documented case of a rideshare company using reinforcement learning. But designing the algorithm wasn’t the only hard part.

“More important than the math is how you do this within the company,” says Martin.

Reinforcement learning means that the people involved don’t always know what’s going on. This becomes difficult for an organization in a number of ways, says Martin. For example, let’s say the team working on pricing wants to run their own experiment. They would like all other factors at that time to be held constant so they can make sense of their data. But if a matching algorithm changes things on its own at the same time, it’s hard to know how to interpret the data from the pricing experiment.

“It makes a lot of other things much more complicated,” says Martin.

It also makes it harder for the team working on the algorithm to understand how to keep innovating. “If people lose track of what’s going on, how can they continue to innovate?” Martin asks. He is advising a PhD student, Yudi Huang, who is currently working with Lyft on exactly this question.

Moreover, at Lyft, the development of this algorithm took more than a year. “One year is a long time for a technology company. Two months is a long time! It’s very rare to spend a year on something that doesn’t work for that long,” he says.

Ultimately, the team kept their spirits up and managed to convince the rest of the company to let them continue experimenting. There was no high-tech strategy for this, Martin says. “It’s the same way you do things anywhere. You talk to the right people. You gain people’s trust. You build a team that is excited and then you show that it works. It is common in research to think that the idea itself is enough. But in an organization, it’s the process that leads to something happening.”

The fact that, at least in this case, the process resulted in a “win–win–win” situation is particularly exciting for Martin.

Each time the team tested a revised algorithm, they monitored a dashboard of important metrics that would turn red if the experiment was worse than the status quo and green if it was better.

The day they landed on the winning algorithm, “the screen was just green,” he says. “That’s really what optimization in operations is all about: finding that completely green thing.”
