Intelligent order sequencing for a large, international coffeehouse chain

Executive Summary

  • A large, international coffee chain wanted to optimize how customer orders are prioritized across its mobile-app, in-store, and drive-through channels to improve sales and customer satisfaction.
  • The solution would need to balance the timing of orders based on a variety of factors, such as the customer’s estimated arrival time, the orders in queue, and barista workflow.
  • Neal Analytics addressed these elements holistically and created a custom machine learning algorithm that uses real-time data to determine each order’s priority.
  • The algorithm was first tested in both computer and lab store simulations.
  • The solution is currently in a trial process and is forecast to have a revenue impact of more than $200 million.

Introduction

For many restaurants, fast food chains, and cafés in North America and around the world, mobile ordering is opening a new channel for orders beyond the drive-through and counter. This has led to substantial growth, but the sudden influx of “take out” style orders, where customers are not physically at the store waiting for their orders, presents new challenges in ensuring product quality and customer prioritization.

If the order is ready too early, and the customer or delivery driver has not arrived to collect it, the quality of the product may deteriorate quickly. This leads to decreased satisfaction, reorders, or customer churn. Similar penalties can occur if the order is late. On the other hand, prioritizing mobile orders over the in-store experience leads to additional wait times and decreased satisfaction for in-store customers. Therefore, we can maximize sales and customer satisfaction by balancing these considerations.

However, food production lines have limited capacity, whether in labor, specialized equipment, or simply physical space. At a café, for example, baristas must coordinate the production of drinks at different stations, each with its own production times, phases, and requirements. Espresso beans must be ground and tamped, the shot is pulled while the milk is steamed, and the drink is then assembled in a cup. The workflow takes seconds, but it is easily disrupted. To maximize production, baristas must stay in their flow as much as possible. This adds another layer of complexity to the sequencing problem.

In this case study, we will review how Neal Analytics analysts and data scientists broke down this salient business problem for a large international coffee chain and built a solution for it.

Scoping the Challenge

For over a century, “First Come, First Served” has been the operating paradigm that practically every restaurant and service provider has used for production order sequencing. Sometimes an order takes less time and comes out first, but it is generally accepted that the “fairest” option is one where orders begin production in the order they are received.

However, when an order comes in that clearly does not need to be produced immediately, what then? Mobile orders placed through a first- or third-party app for pickup create that situation. Now, instead of having to balance two channels of orders, production must balance three or more.

One challenge exacerbated by this is the limited visibility into how busy the store is at any time. It’s hard to know how long the line is at the counter or the drive-through. The data is expensive and hard to collect and use in real time.

However, the key piece of information for mobile orders, the estimated time of arrival (ETA), is now more available than ever thanks to the ubiquity of smartphones and customer willingness to opt in to location sharing.

Take this example: A customer places an order that can be ready within 5 minutes. The GPS information provided by the customer’s mobile app indicates that they’re 15 minutes away. It makes sense then to delay that order by about 10 minutes so that it will be fresh when the customer arrives. Perhaps the business would want to hedge the risk by starting production at 8 minutes instead of 10, but that’s simply a parameter that can be tuned once the problem is solved.
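
The timing rule in this example can be sketched in a few lines of Python. This is an illustration only, not the production logic: the function name `production_start_delay` and the `hedge_min` parameter are hypothetical names we chose here for the tunable buffer described above.

```python
def production_start_delay(eta_min, production_min, hedge_min=0.0):
    """Return how many minutes to wait before starting an order.

    eta_min: estimated minutes until the customer arrives
    production_min: minutes needed to make the order
    hedge_min: hypothetical safety buffer, so production starts a bit early
    """
    # Never return a negative delay: an order that is already late starts now.
    return float(max(0.0, eta_min - production_min - hedge_min))


print(production_start_delay(15, 5))     # 10.0 -> start in 10 minutes
print(production_start_delay(15, 5, 2))  # 8.0  -> hedged: start in 8 minutes
print(production_start_delay(3, 5))      # 0.0  -> customer is close, start now
```

The hedge is kept as a separate argument so the business can tune it without touching the rest of the rule.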

Dimensions of Complexity

While on the surface this seems like a simple problem with a simple solution, diving into it reveals additional problems and layers of complexity. It’s like peeling an onion.

For example, how should mobile orders placed by customers already in the store be handled? These often occur when the store is busy, because savvy customers can effectively “skip the line” by placing their order on their phones and completing their transaction before those ahead of them in line. This is not exactly fair, nor does it properly adhere to the “first come, first served” (FCFS) principle. It also adds complexity, because we need to determine what should be done with orders where the ETA is effectively zero. Do you delay the order by some amount when the café is busy to increase fairness? How do you tell from the data that the café is busy?

Another topic is knowing the production time of items. It may take an average of 117 seconds to build a latte, but in reality that time varies greatly depending on who is working (how skilled or motivated they are), the environment (how inefficient production becomes with more bodies in the way), and whether certain ingredients are readily available or must be fetched from the stockroom.
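
To make the point about variability concrete, here is a minimal sketch that plans around the spread of observed times rather than the average alone. The observations are made-up numbers chosen to average 117 seconds, and `planning_time` with its `k` parameter is a hypothetical helper, not the method used in the project.

```python
import statistics

# Hypothetical stopwatch observations for one drink, in seconds (made-up data).
latte_times = [95, 110, 117, 125, 138]


def planning_time(samples, k=1.0):
    """Mean plus k sample standard deviations, as a conservative estimate."""
    spread = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return statistics.mean(samples) + k * spread


print(statistics.mean(latte_times))       # 117
print(round(planning_time(latte_times)))  # 133
```

Two items with the same average can have very different spreads, so a planning time like this would differ between them even when their averages match.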

Or what about items that can likely be produced in parallel with others, with minimal differences in time? For example, two items could be placed in a warming oven at once, so the second item’s additional production time is effectively zero. Any algorithm that’s developed to solve these problems must be robust enough to handle such variances.
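
One simple way to model parallel production is to count batches per station instead of items. The sketch below assumes a hypothetical station table (the batch times and capacities are invented for illustration) and shows that a second oven item adds no time when it fits in the same batch.

```python
import math
from collections import Counter

# Hypothetical stations: seconds per batch and how many items fit in one batch.
STATIONS = {
    "oven": {"batch_sec": 60, "capacity": 2},
    "espresso": {"batch_sec": 45, "capacity": 1},
}


def station_seconds(item_stations, stations=STATIONS):
    """Seconds of work per station for a list of items, given by station name."""
    counts = Counter(item_stations)
    return {
        name: math.ceil(n / stations[name]["capacity"]) * stations[name]["batch_sec"]
        for name, n in counts.items()
    }


print(station_seconds(["oven"]))                  # {'oven': 60}
print(station_seconds(["oven", "oven"]))          # {'oven': 60}, second item is free
print(station_seconds(["oven", "oven", "oven"]))  # {'oven': 120}
```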

Operational Solutions

Before we dive into how this solution was developed, it’s worth considering how customer behavior can impact order sequencing.

One possible way to address these problems would be to encourage certain customer behaviors in the mobile-order process. Other restaurants and cafés have tried this approach by adding a “schedule your order” feature in their app, allowing individuals to pick a time for their order to be ready.

This approach fails for a simple reason: people like to round. In practice, demand becomes lumped at 15- or 30-minute increments. This could be addressed with production slots, but at the cost of an undesirable user experience as the scheduling feature rapidly grows more complex.

That said, it is potentially beneficial to report the expected time each order will be ready, instead of a generic “Orders typically take about 15 minutes at this time of day” message, to help set customer expectations. We identified this as a potential feature to test. Going further, customers could reply with a small set of constrained responses, such as “I will be there five minutes earlier/later,” giving them a limited ability to refine the ETA with potentially more accurate timing. While this was not implemented in the initial solution, it will likely be released in a future iteration if testing shows an improvement.

There are many potential ways to optimize sequencing, so it is important to maintain focus on the fundamental goals: improving the customer experience and reducing average wait time, wasted product, and congestion, all of which are key drags on efficiency and satisfaction.

Prioritizing Optimization Targets

To “double click” on these goals, we evaluated many business considerations to determine how we could best achieve order optimization.

In some cases, a certain channel of orders is simply worth more because the customers are more loyal, spend more (and more often), or cost less to serve and the business wants to drive customers toward that channel. A higher priority could be put on these orders to give these customers a better experience. This may be a violation of FCFS, but in a world of premium memberships and fast passes to skip the line, it could be worth testing to evaluate the ROI vs. the potential drawbacks.

Not all considerations are as transparently designed to place the business’s benefit over potential fairness. In some cases, strict FCFS logic could cause more drag on efficiency and customer satisfaction. For example, if a drive-through customer places a last-minute additional order when they are picking up their original order, it makes sense to prioritize that order to the top of the list. Not only does this provide a better customer experience, but the business benefits by avoiding a stand-still at the drive-through while orders stack up.
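
The two deviations from strict FCFS described above can be expressed as a sort key. This is a sketch under our own assumptions: the `Order` fields, the `is_window_addon` flag, and the per-channel `head_start_sec` table are hypothetical names, and the head-start values are placeholders rather than tested business settings.

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: str
    channel: str
    received_at: float             # seconds since opening
    is_window_addon: bool = False  # add-on placed at the drive-through window


def sequence(orders, head_start_sec=None):
    """Sort orders: window add-ons first, then by adjusted receive time."""
    head_start_sec = head_start_sec or {}

    def key(order):
        adjusted = order.received_at - head_start_sec.get(order.channel, 0.0)
        # False sorts before True, so add-ons (where this flag is False) lead.
        return (not order.is_window_addon, adjusted)

    return sorted(orders, key=key)


orders = [
    Order("A", "counter", 100.0),
    Order("B", "mobile", 110.0),
    Order("C", "drive_through", 120.0, is_window_addon=True),
]
print([o.order_id for o in sequence(orders)])                  # ['C', 'A', 'B']
print([o.order_id for o in sequence(orders, {"mobile": 30})])  # ['C', 'B', 'A']
```

With an empty head-start table and no add-ons, the function reduces to plain FCFS, which makes the effect of each rule easy to test in isolation.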

To approach this problem, the Neal Analytics team explored many potential optimization methods, including Queuing Theory, Decision Theory, Reinforcement Learning, and even Approximate Dynamic Programming. This breadth allowed the team to understand which method was best to start with and which would be best for more complex iterations.

The obvious optimization target is minimizing wait time, but that target may not always be the most important factor for a good customer experience. At certain times of day, in certain channels, or across geographic and urban/rural dimensions, other aspects may matter more, such as face time with baristas, the quality of the product, or having everything in an order ready to serve at once.

Minimizing wait time as an initial target makes sense. But, following Neal’s recommendation, the team decided that the more advanced iteration of the algorithm would require a meta-algorithm layer. This layer would tune sequencing further around these considerations and potentially allow user (i.e., barista) input or selection.

Identifying the KPIs & Data

As mentioned initially, good data is key to a robust solution. To this end, we needed to ensure that the solution had the data it needed and that we could accurately measure improvements to performance.

We identified several datasets that were relevant. It turned out that the biggest data challenge was not location data, but production time data. Obtaining location data and computing an ETA from one of the mapping services is relatively straightforward and, while the ETA is not always accurate, we could build in a buffer to hedge against inaccuracy. However, the lab standard times for production were not at the requisite granularity, nor did they show how much each item’s production time was likely to vary. Some items are more consistent than others, but an average time is simply an average. It does not represent on its own how things happen in real life. The Neal Analytics team ended up using several methods, including surveying actual cafés incognito to collect real-world data.

To measure performance, Neal worked with the client to identify several key measures to track:

  • Number of items on the counter (fewer items improves efficiency, as having too many items on the counter makes it hard to find each order)
  • Time spent on the counter waiting for handoff (less time reduces wastage)
  • Total number of customers waiting (fewer customers means lower congestion)
  • Average wait time (a shorter wait improves overall customer experience)
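
As a rough illustration of how these four measures could be computed, the sketch below derives them from a log of three timestamps per order. The `Served` record and its field names are our assumptions about what such a log might contain, and the sample values are invented.

```python
from dataclasses import dataclass


@dataclass
class Served:
    arrived_at: float  # customer reaches the store, in seconds
    ready_at: float    # order is placed on the handoff counter
    handoff_at: float  # customer collects the order


def peak_overlap(intervals):
    """Largest number of intervals open at the same moment.

    At equal timestamps, closings sort before openings, so zero-length
    intervals never count toward the peak.
    """
    events = sorted([(start, 1) for start, _ in intervals] +
                    [(end, -1) for _, end in intervals])
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def kpis(log):
    n = len(log)
    return {
        "peak_items_on_counter": peak_overlap([(s.ready_at, s.handoff_at) for s in log]),
        "avg_counter_sec": sum(s.handoff_at - s.ready_at for s in log) / n,
        "peak_customers_waiting": peak_overlap([(s.arrived_at, s.handoff_at) for s in log]),
        "avg_wait_sec": sum(s.handoff_at - s.arrived_at for s in log) / n,
    }


log = [
    Served(arrived_at=0.0, ready_at=60.0, handoff_at=60.0),      # waited 60 s
    Served(arrived_at=300.0, ready_at=100.0, handoff_at=300.0),  # sat 200 s
    Served(arrived_at=310.0, ready_at=120.0, handoff_at=310.0),  # sat 190 s
]
print(kpis(log))
```

In this made-up log, the two early mobile orders sit on the counter together, which is exactly the situation the first two measures are meant to expose.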

Simulations to Test

Building the solution required substantial testing, which we completed with both a full digital twin simulation of a café built in Python and live simulations in café lab trials.

We tested hundreds of algorithm configurations, iterating many times to observe the movement of each KPI and reviewing with the customer’s stakeholders to find the optimum balance which met all their criteria. This included visual representations of the sequence output to build a solid story and a clear understanding of the options, each with their benefits and drawbacks.
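
The real digital twin is far richer than anything that fits here, but a toy version shows the kind of comparison such a simulation enables. The sketch below assumes a single barista, fixed production times, and perfectly accurate ETAs. The `Order` fields and both release rules are hypothetical simplifications, not the tested configurations.

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: str
    placed_at: float  # seconds
    arrive_at: float  # when the customer is at the counter
    make_sec: float   # production time


def fcfs_release(order):
    return order.placed_at


def eta_release(order):
    # Hold the order so it finishes at arrival, but never before it is placed.
    return max(order.placed_at, order.arrive_at - order.make_sec)


def simulate(orders, release):
    """One barista makes orders one at a time, in order of release time."""
    clock = 0.0
    waits, dwells = [], []
    for order in sorted(orders, key=release):
        start = max(clock, release(order))
        ready = start + order.make_sec
        clock = ready
        handoff = max(ready, order.arrive_at)
        waits.append(handoff - order.arrive_at)  # customer waiting time
        dwells.append(handoff - ready)           # time sitting on the counter
    return {"avg_wait_sec": sum(waits) / len(waits), "total_counter_sec": sum(dwells)}


orders = [
    Order("mobile", placed_at=0.0, arrive_at=900.0, make_sec=300.0),
    Order("walk_in", placed_at=10.0, arrive_at=10.0, make_sec=120.0),
]
print(simulate(orders, fcfs_release))  # {'avg_wait_sec': 205.0, 'total_counter_sec': 600.0}
print(simulate(orders, eta_release))   # {'avg_wait_sec': 60.0, 'total_counter_sec': 0.0}
```

Under FCFS the walk-in customer waits behind a mobile order that then sits on the counter for ten minutes. Holding the mobile order until it is needed improves both numbers in this contrived case.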

Operationalizing the Algorithm

Of course, a custom algorithm like this, implemented at scale while handling real-time data streams from cloud and local sources, needs to be written as production code. The final optimized algorithm therefore yielded a set of rules and parameters to be implemented in C#.

The code runs locally in stores. While the algorithm does receive a variety of data from various web APIs, the trained model could not depend on a network connection. Therefore, it needed to be able to fall back to simpler FCFS logic for several edge cases or potential errors.

This made the solution robust enough to handle the real-world variety that comes from having thousands of locations open every day.
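
The fallback behavior can be sketched as a thin wrapper around the sequencer. The production code is C#, so this Python version is only an illustration of the pattern, and `sequence_with_fallback`, `offline_sequencer`, and the order dictionaries are hypothetical.

```python
def fcfs(orders):
    """Baseline: serve in the order received."""
    return sorted(orders, key=lambda o: o["received_at"])


def sequence_with_fallback(orders, smart_sequencer):
    """Use the smart sequencer, but fall back to FCFS if it fails."""
    try:
        ranked = smart_sequencer(orders)
        # Guard against a sequencer that drops or duplicates orders.
        if sorted(o["id"] for o in ranked) != sorted(o["id"] for o in orders):
            raise ValueError("sequencer returned a different set of orders")
        return ranked
    except Exception:
        return fcfs(orders)


def offline_sequencer(orders):
    # Hypothetical sequencer whose ETA service cannot be reached.
    raise ConnectionError("ETA service unavailable")


orders = [{"id": "B", "received_at": 20}, {"id": "A", "received_at": 10}]
print([o["id"] for o in sequence_with_fallback(orders, offline_sequencer)])  # ['A', 'B']
```

Validating the sequencer’s output before trusting it means a bad result degrades to FCFS instead of silently losing an order.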

Evaluating the Impact

While the improvements driven by this solution were forecast to have an impact of more than $200 million worldwide on the client’s business, putting the solution into production without proper testing would not be a sound approach. The refined algorithm is therefore in a trial process and is being rolled out to larger parts of the business as each performance validation test is passed.

We look forward to continuing our partnership with this customer by iterating on additional improvements that enhance the customer experience and reduce wait times with intelligent order sequencing.