One challenge in running our platform is accurately tracking Merchants' operational status and their ability to receive and fulfill orders. For example, when a Merchant’s location is physically closed but marked as open on our platform, we create a bad experience for all of our users: the Dasher cannot complete their accepted delivery, the Consumer cannot receive their ordered food, and the Merchant could see lower future revenue. Similarly, a Merchant who is open but marked as closed on our platform produces equally negative outcomes: Consumers cannot place an order, the Merchant loses potential revenue, and Dashers lose potential delivery opportunities.
This post will highlight how we used machine learning to predict the operational status of a store and deliver the best possible experience for Merchants, Dashers, and Consumers.
The challenge of having accurate store operational hours
On the DoorDash marketplace, stores operate independently, which means that we do not always get the most up-to-date information on merchants’ business hours and operational status. Inaccurate operational hours are an especially acute problem when merchants are dealing with staffing and supply-chain shortages. As a result, in a small percentage of deliveries, the Dasher might find that the store is actually closed. With tens of millions of deliveries being fulfilled each week, quickly and efficiently confirming such merchant closures and reacting to them is key for these reasons:
- To allow consumers to quickly be informed of the issue and get refunded.
- To make sure that no further orders can be placed at a closed store.
However, leveraging our support team to act on these Dasher-reported closures is both costly and inefficient, given the thousands of closed-store cases that are reported each day. To help Dashers resolve closed-merchant issues quickly and efficiently, we built a “Dasher reports store closed” feature (DRSC for short) directly in the Dasher app. In the rest of the article, we will walk through how this feature works and how we augmented it with ML to improve its performance and automation.
How the DRSC works
When Dashers find themselves unable to pick up an order at a store location that appears closed, they are prompted to take a picture of the store to kick off the reporting process. When a valid picture is uploaded, the Dasher is compensated for the time and effort they spent getting to the store, and is unassigned from the delivery so that they can continue their dash and be assigned other deliveries.
When a DRSC report is received, one of two actions can be taken on the order: we can cancel the delivery and reimburse the customer or, when we have reason to doubt the report’s accuracy, we can reassign the order to a new Dasher to re-attempt the pickup.
In parallel, we need to contact the merchant to confirm that the store is indeed closed, so we can pause it on the platform. If the merchant confirms the closure or is unresponsive, we will pause the store for a set period of time. Pausing the merchant prevents consumers and Dashers from unnecessarily experiencing another similar issue when we already have a signal that the store is closed. If the merchant confirms the store is open, then the report is rejected, we find a different Dasher to fulfill this order, and the merchant can continue receiving other orders.
The existence of inaccurate reports, though infrequent, is a major challenge for DRSC that we set out to minimize using ML.
Why we went the ML route
Without carefully examining each DRSC report, we might unnecessarily cancel orders or pause merchants. We needed an ML-powered solution to automate this review process at scale. Some DRSC reports are inaccurate, either because the image does not show a closed store or because the merchant confirms they are in fact open. With a trusted means of assessing each report's validity, we are better equipped to reassign the order to another Dasher and complete it whenever the validity of the report is in question.
A validation step that can categorize reports would need to be accurate and fast, and scale up to thousands of daily reports. Humans could do it, but it would be expensive, time-consuming, and unscalable. A heuristic set of rules could handle the required speed and scale but would be only moderately accurate, and passing inaccurate reports is costly to merchants and painful for customers. Given the inherent uncertainty of the problem, a conditional probability model (a mathematical formula for assigning probabilities to outcomes given varying information) seemed especially fit for the job. Such a model can pool the available information about a store and a Dasher to output inferences about the store status that help us make optimal decisions.
How the ML model replaced the heuristic
We started by wanting to calculate the probability that a given store is closed when a DRSC report is filed:
Probability(Store is Closed | DRSC report).
While a more general model for inferring store status could be built, we constrained ourselves to only handling the DRSC use case.
Our first challenge was that the dependent variable (the status of a store) is unknown and would be prohibitively expensive to measure (i.e., we do not actually know if a store is closed or not). To construct the store status variable, we looked at past DRSC reports and checked whether the reported stores were fulfilling orders or responding to our messages about the store status in the hour following the report. For example, when a Dasher was able to complete a pickup within 15 minutes of a historical DRSC report, we would infer that the store was probably still open, despite the report.
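To make the labeling idea concrete, here is a minimal sketch of how such a proxy label could be derived. The function name, the inputs, and the rule for leaving a report unlabeled are illustrative assumptions, not the production logic; only the 15-minute pickup example comes from the description above.

```python
from datetime import datetime, timedelta
from typing import Optional

# The 15-minute window mirrors the example in the text; treating it as a
# hard cutoff is an assumption made for this sketch.
PICKUP_WINDOW = timedelta(minutes=15)


def infer_store_closed(
    report_time: datetime,
    next_pickup_time: Optional[datetime],
    merchant_said_open: Optional[bool],
) -> Optional[bool]:
    """Return True (closed), False (open), or None (no usable evidence).

    merchant_said_open is the merchant's reply in the hour after the report,
    or None if the merchant did not respond (hypothetical input).
    """
    # A completed pickup shortly after the report suggests the store was open.
    if (
        next_pickup_time is not None
        and report_time <= next_pickup_time <= report_time + PICKUP_WINDOW
    ):
        return False
    # Otherwise fall back to the merchant's own answer, if there is one.
    if merchant_said_open is not None:
        return not merchant_said_open
    # No evidence either way: leave the report unlabeled (an assumption).
    return None
```

Leaving ambiguous reports unlabeled, rather than guessing, keeps noisy examples out of the training set at the cost of a smaller dataset; a real pipeline might choose differently.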
A DRSC report provides basic features like the store ID, the Dasher ID, and the timestamp of the report, as well as an image of the storefront taken by the Dasher. Store and Dasher IDs are useful for connecting a DRSC report to the history of deliveries at the store and the history of deliveries across stores by the Dasher. This historical data allows us to build features like the number of recent reports for a given store, the last time a pickup occurred, and the number of invalid store-closed reports for a given Dasher over multiple months.
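The historical features described above could be computed along these lines. This is a sketch under assumptions: the `Report` record, the `build_features` helper, and both time windows are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass
class Report:
    """Hypothetical record of one DRSC report."""
    store_id: str
    dasher_id: str
    timestamp: datetime
    was_valid: Optional[bool] = None  # None while the report is unresolved


def build_features(
    report: Report,
    past_reports: List[Report],
    last_pickup_time: Optional[datetime],
    store_window: timedelta = timedelta(hours=24),  # assumed "recent" window
    dasher_window: timedelta = timedelta(days=90),  # assumed "multiple months"
) -> Dict[str, Optional[float]]:
    def age(r: Report) -> timedelta:
        return report.timestamp - r.timestamp

    # Number of earlier reports against the same store in the recent window.
    store_reports = sum(
        1 for r in past_reports
        if r.store_id == report.store_id
        and timedelta(0) < age(r) <= store_window
    )
    # Number of this Dasher's earlier reports that turned out to be invalid.
    dasher_invalid = sum(
        1 for r in past_reports
        if r.dasher_id == report.dasher_id
        and r.was_valid is False
        and timedelta(0) < age(r) <= dasher_window
    )
    # Time since the last successful pickup at the store, if one is known.
    minutes_since_pickup = None
    if last_pickup_time is not None:
        delta = report.timestamp - last_pickup_time
        minutes_since_pickup = delta.total_seconds() / 60

    return {
        "store_reports_recent": store_reports,
        "dasher_invalid_reports": dasher_invalid,
        "minutes_since_last_pickup": minutes_since_pickup,
    }
```

Only reports strictly older than the current one are counted, so a feature never leaks information from the report being scored.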
Additionally, an image of the storefront might contain valuable information about its status by capturing a sign on the door indicating that it’s closed or showing that lights are off. By converting images into a summary signal such as the probability that a ‘store is closed’ or a ‘store is open’ we can process and use hundreds of thousands of images quickly. We accomplished this by training an image classifier using our internal image processing platform. The image classifier compresses the image information into a single number which we then use as one of our model features.
Finally, a single LightGBM model combines the historical and image information to compute the probability that a store is closed. That probability sets us up to make decisions. Today, we use thresholds to partition the 0-to-1 probability range into intervals that map onto our actions:
- A low probability of the store being closed leads us to unassign the order and find a new Dasher.
- An intermediate probability leads us to cancel the order.
- A high probability leads us to both cancel the order and pause the store.
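The threshold mapping above can be sketched as a small function. The two cutoff values and the action names are hypothetical placeholders; the actual thresholds are not stated in this post.

```python
# Hypothetical cutoffs, chosen only to illustrate the three intervals.
LOW_THRESHOLD = 0.3
HIGH_THRESHOLD = 0.8


def choose_action(p_closed: float) -> str:
    """Map the model's probability that the store is closed to an action."""
    if not 0.0 <= p_closed <= 1.0:
        raise ValueError("p_closed must be a probability in [0, 1]")
    if p_closed < LOW_THRESHOLD:
        return "reassign"          # doubt the report, find a new Dasher
    if p_closed < HIGH_THRESHOLD:
        return "cancel"            # cancel the order only
    return "cancel_and_pause"      # cancel the order and pause the store
```

For example, `choose_action(0.1)` returns `"reassign"`, while `choose_action(0.95)` returns `"cancel_and_pause"`.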
Following the deployment of this ML model and an A/B experiment, we confirmed that the improved decision-making now saves thousands of deliveries from being canceled every week. Avoiding cancellations produces a better consumer experience, increased revenue for our partner merchants, more earning opportunities for Dashers, and a more robust business for DoorDash. This initiative is an example of how automated statistical decision-making can be used to build intelligent logistics infrastructure and provide value for all participants in the marketplace.
Next step: building a dynamic loss function
The next major iteration is making the decision thresholds dynamic. By incorporating time of day, future store volume, and potential future cancellations, we could fine-tune our decision-making when a DRSC report is received. We could then construct a loss function that assigns a cost to each combination of action and store status. Together, a loss function and a probability model can be used to compute the expected loss of each action and pick the action that minimizes it. Effectively, we will have implicit dynamic thresholds that adjust to the time and conditions for each store. Being sensitive to the cost for each Dasher and Merchant will make our decisions even more effective.
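As a rough illustration of expected-loss minimization, the sketch below picks the action with the lowest expected cost given the model's probability. The loss table, its numbers, and the function names are invented for this example; a real table would vary by store, time of day, and expected volume.

```python
from typing import Dict

# Hypothetical costs: LOSS[action][true store status]. The values are made up.
LOSS: Dict[str, Dict[str, float]] = {
    "reassign":         {"open": 0.0,  "closed": 8.0},
    "cancel":           {"open": 5.0,  "closed": 1.0},
    "cancel_and_pause": {"open": 20.0, "closed": 0.0},
}


def expected_loss(action: str, p_closed: float,
                  loss: Dict[str, Dict[str, float]] = LOSS) -> float:
    """Expected cost of an action: p * cost_if_closed + (1 - p) * cost_if_open."""
    costs = loss[action]
    return p_closed * costs["closed"] + (1.0 - p_closed) * costs["open"]


def best_action(p_closed: float,
                loss: Dict[str, Dict[str, float]] = LOSS) -> str:
    """Return the action that minimizes expected loss for this probability."""
    return min(loss, key=lambda action: expected_loss(action, p_closed, loss))
```

With this particular table, the decision switches from reassigning to canceling at p = 5/12 and from canceling to pausing at p = 15/16. Passing a different table, for instance one with a higher cost for wrongly pausing a busy store, moves those switch points without any explicit threshold being set.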
Many companies have built their operations on heuristics or simple rules. Rules are the quickest route to prove that a problem is solvable by moving from nothing to functional performance. However, there can be a large opportunity cost in not considering upgrading the heuristic solution into an optimized ML solution that will boost the performance to a near-maximum level. Using simple and low-effort ML tools like LightGBM is a great way to start your journey from simple rules to intelligent infrastructure and maximize efficiency.