Creating a better customer experience often requires flexible software that can be updated to accommodate new situations without requiring developers to be involved. But there aren’t many off-the-shelf solutions that provide such a platform. DoorDash has faced this challenge repeatedly as we have worked to create an excellent experience for consumers, Dashers and merchants. Our rapid growth has meant that multiple new issues arise regularly. It can be time-consuming to rely exclusively on engineering teams to build solutions for new problem types as they are identified. To move nimbly to address customer problems, we knew we had to find a better way.
Our solution would need to allow customer experience teams to address new challenges directly. That meant allowing our customer experience operators to build customized workflows to resolve problems without relying on engineering. Building this kind of flexible platform posed an engineering challenge; we realized we would need to build a no-code solution and develop tools for our operators to create and deploy business workflows to a live production environment. Lacking non-proprietary solutions to achieve this objective and wanting to avoid vendor lock-in, we decided to build our own no-code solution using open source technologies such as Kotlin, Postgres, and React.
Key challenges to support a rapidly changing landscape
An effective no-code workflow platform needs to be flexible enough to interact with customer experience team members or consumers, letting users make choices that drive the next step in the workflow execution; for example, consumers should be able to choose whether they want a refund, credit, or redelivery. The platform must also execute automated processes, such as those that help customer experience team members determine whether a customer’s refund request is eligible. And the platform’s user interface needs to allow operators to build a new workflow and roll it out seamlessly into a live production environment.
Deciding whether to build vs. buy
We reviewed a number of tools already on the market, but ultimately concluded that we would need to build a solution ourselves. Because DoorDash’s business operates in a dynamic environment, there are relentless uncertainties and constantly changing business priorities to be addressed. None of the existing solutions we reviewed were flexible enough to meet our requirements. Additionally, we were concerned that choosing a commercial solution would lock us in with a vendor who might not be able to adapt quickly enough to DoorDash’s fast-changing business needs.
Instead of buying an off-the-shelf solution, we decided to leverage open source technologies such as Kotlin, React, and Postgres that we could support ourselves to achieve DoorDash’s technology vision.
Creating DoorDash’s Workflow Studio and Execution Engine
We needed to build two sets of tools, each intended for different audiences:
Workflow Execution Engine allows our customer experience team members and consumers to interact with the workflow and execute steps, such as enabling consumers to choose the reason for a cancellation from a drop-down menu. The UI is built for multiple channels – web, iOS, and Android – to serve both customer experience team members and consumers. We used Kotlin to build the backend API; the workflow configuration is fetched from a Postgres database and deserialized into an object format that represents a decision tree. As shown in Figure 2, we execute automated steps repeatedly in a loop and return a response to the UI as soon as a manual step is encountered.
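The execution loop described above can be sketched as follows. This is a minimal illustration in Python for brevity, not the production Kotlin code; the JSON layout, the step kinds, the `check_refund_eligibility` action, and its order-total threshold are all hypothetical stand-ins.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical automated actions, keyed by name. Each returns an outcome label
# that selects the next branch of the decision tree.
ACTIONS: dict[str, Callable[[dict], str]] = {
    "check_refund_eligibility": lambda ctx: (
        "eligible" if ctx.get("order_total", 0) <= 50 else "ineligible"
    ),
}

@dataclass
class Step:
    id: str
    kind: str                      # "automated", "manual", or "terminal"
    action: Optional[str] = None   # key into ACTIONS, for automated steps
    prompt: Optional[str] = None   # text shown to the user, for manual steps
    transitions: dict[str, str] = field(default_factory=dict)  # outcome -> step id

# Stand-in for the configuration that would be fetched from the database.
CONFIG = """
{"steps": [
  {"id": "check", "kind": "automated", "action": "check_refund_eligibility",
   "transitions": {"eligible": "choose", "ineligible": "deny"}},
  {"id": "choose", "kind": "manual", "prompt": "Refund, credit, or redelivery?",
   "transitions": {"refund": "done", "credit": "done", "redelivery": "done"}},
  {"id": "deny", "kind": "terminal"},
  {"id": "done", "kind": "terminal"}
]}
"""

def load_workflow(config_json: str) -> dict[str, Step]:
    """Deserialize the stored configuration into Step objects keyed by id."""
    return {s["id"]: Step(**s) for s in json.loads(config_json)["steps"]}

def run_until_manual(steps: dict[str, Step], start_id: str, ctx: dict) -> Step:
    """Execute automated steps in a loop; stop at a manual or terminal step."""
    current = steps[start_id]
    while current.kind == "automated":
        outcome = ACTIONS[current.action](ctx)
        current = steps[current.transitions[outcome]]
    return current

def advance(steps: dict[str, Step], manual: Step, choice: str, ctx: dict) -> Step:
    """Resume after the user picks an option on a manual step."""
    return run_until_manual(steps, manual.transitions[choice], ctx)
```

In this sketch the UI would render the `prompt` of whatever step `run_until_manual` returns, then call `advance` with the user's selection to continue down the tree.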
Instead of trying to complete the entire project at once, we opted for an iterative approach with each iteration building on top of the existing platform. This allowed us to improve our platform incrementally without delaying its functional rollout.
The initial MVP solution focused on enabling workflows that consisted of manual steps. A manual step contains an instruction that a customer experience team member can follow to resolve a customer problem. This solution allowed our operators to convert knowledge articles – guides used by customer experience team members to determine the steps to troubleshoot a customer problem – into workflows.
Our second iteration automated the manual workflows from the first iteration, introducing automated steps, such as processing refunds, that can be chained together to create customized workflows.
In our third iteration, we added internationalization support for our workflows. Because the workflow platform’s core principle focused on developing a self-serve model, we added the capability for operators to configure steps with multiple language support.
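One way to picture a step configured for multiple languages is a prompt stored per locale and resolved at display time. The sketch below is in Python for brevity and is only an assumption about the shape of such a configuration; the `prompts` field, the locale codes, and the fallback order are hypothetical, not taken from the platform.

```python
DEFAULT_LOCALE = "en-US"  # assumed fallback locale

# Hypothetical step configuration with one operator-entered prompt per locale.
step_config = {
    "id": "choose_resolution",
    "prompts": {
        "en-US": "How would you like us to resolve this?",
        "fr-CA": "Comment souhaitez-vous que nous réglions ce problème?",
    },
}

def localized_prompt(step: dict, locale: str) -> str:
    """Pick the prompt for a locale: exact match, then same language, then default."""
    prompts = step["prompts"]
    if locale in prompts:
        return prompts[locale]
    language = locale.split("-")[0]
    for candidate, text in prompts.items():
        if candidate.split("-")[0] == language:
            return text
    return prompts[DEFAULT_LOCALE]
```

Keeping the translations inside the step configuration, rather than in application code, is what would let operators add a language without an engineering change.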
Finally, our fourth iteration focused on enabling workflows for self-help customers who use the consumer web, iOS, or Android applications as well as Dashers who use the Dasher iOS or Android applications.
Resolving workflow execution challenges
Workflow solutions present unique challenges that are not typically seen when building an API solution. An API lets two applications interface; a workflow configuration, by contrast, is represented as a decision tree that starts from a single initial state and can end in one of many terminal states.
Dealing with unexpected failures
Problems cropped up when a workflow execution failed partway through, after some steps had already run. Because it was counterproductive to reverse all the transactions that had been completed, we decided to stop workflow execution when there is a serious error but to continue execution if the error won’t significantly affect the expected outcome.
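The stop-or-continue policy can be illustrated with two error classes. This is a hedged sketch in Python rather than the production Kotlin; the exception names, the example steps, and the choice of which failures count as serious are assumptions made for illustration.

```python
class CriticalStepError(Exception):
    """Failure that makes the rest of the workflow unsafe to run."""

class NonCriticalStepError(Exception):
    """Failure that does not significantly affect the expected outcome."""

def run_steps(steps, ctx):
    """Run steps in order; halt on a critical error, skip past a non-critical one.

    Steps that already completed are left as they are: nothing is rolled back.
    """
    completed, warnings = [], []
    for name, action in steps:
        try:
            action(ctx)
        except NonCriticalStepError as err:
            warnings.append(f"{name}: {err}")
            continue
        except CriticalStepError as err:
            return {"completed": completed, "warnings": warnings,
                    "halted_at": name, "error": str(err)}
        completed.append(name)
    return {"completed": completed, "warnings": warnings,
            "halted_at": None, "error": None}

# Hypothetical steps used to exercise both paths.
def issue_refund(ctx):
    if ctx.get("payment_service_down"):
        raise CriticalStepError("payment service unavailable")
    ctx["refunded"] = True

def send_confirmation_email(ctx):
    if ctx.get("email_service_down"):
        raise NonCriticalStepError("email not sent")

def close_ticket(ctx):
    ctx["ticket_closed"] = True

STEPS = [
    ("issue_refund", issue_refund),
    ("send_confirmation_email", send_confirmation_email),
    ("close_ticket", close_ticket),
]
```

With the email service down, this sketch still refunds and closes the ticket while recording a warning; with the payment service down, it halts at the first step and runs nothing after it.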
Another issue revolved around how to maintain state so that the multiple steps executed could be grouped into a single workflow instance. The resolution involves passing a “request id” attribute back and forth between the client and the backend to correlate the steps of a single workflow execution. To reduce complexity, we decided not to maintain state in the backend. Instead, every automated step is executed independently and does not depend on the workflow execution context. This allows us to build automated steps that can be plugged into any workflow at any step.
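A rough sketch of the request id round trip, in Python for brevity: the client echoes the id it received on every later call, and each step handler works only from the inputs it is given. The payload fields, handler names, and the `EVENTS` list (a stand-in for a log sink that is written but never read during execution) are hypothetical.

```python
import uuid

# Hypothetical stateless step handlers: each depends only on its own inputs.
STEP_HANDLERS = {
    "lookup_order": lambda inputs: {"order_found": bool(inputs.get("order_id"))},
    "issue_credit": lambda inputs: {"credited": inputs["amount"]},
}

# Stand-in for a log or analytics sink used to group steps after the fact.
EVENTS = []

def execute_step(request: dict) -> dict:
    """Run one step; reuse the caller's request id or mint one on the first call."""
    request_id = request.get("request_id") or str(uuid.uuid4())
    step_id = request["step_id"]
    result = STEP_HANDLERS[step_id](request.get("inputs", {}))
    EVENTS.append({"request_id": request_id, "step_id": step_id})
    return {"request_id": request_id, "step_id": step_id, "result": result}
```

Because no handler reads anything from a previous call, any handler in this sketch could be placed at any position in any workflow; the shared request id is the only thing tying the calls together.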
Dealing with scale
We also needed to prevent workflows from getting too big or having the same set of steps repeated across workflows. To resolve this, we built a “workflow jump” feature that lets operators configure smaller workflows that can be chained together into a larger workflow.
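The idea behind a workflow jump can be sketched as a step that names another workflow instead of doing work itself. The Python below is a simplified illustration, not the platform's implementation: workflows are flattened to linear step lists rather than decision trees, the workflow and step names are made up, and the cycle check is an added safeguard that the source does not describe.

```python
# Hypothetical workflow definitions; a {"jump": name} entry defers to another workflow.
WORKFLOWS = {
    "verify_identity": [{"id": "confirm_phone"}, {"id": "confirm_address"}],
    "missing_item": [
        {"id": "ask_which_item"},
        {"jump": "verify_identity"},
        {"id": "offer_resolution"},
    ],
    "late_order": [{"jump": "verify_identity"}, {"id": "check_delivery_status"}],
}

def expand(name, workflows, _stack=()):
    """Resolve jumps recursively and return the full ordered list of step ids."""
    if name in _stack:
        raise ValueError("jump cycle: " + " -> ".join(_stack + (name,)))
    steps = []
    for step in workflows[name]:
        if "jump" in step:
            steps.extend(expand(step["jump"], workflows, _stack + (name,)))
        else:
            steps.append(step["id"])
    return steps
```

Here `verify_identity` is defined once and reused by both `missing_item` and `late_order`, which is the duplication the jump feature is meant to remove.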
Our journey toward creating a custom no-code workflow platform sheds light on open-source tools and techniques to create build-it-yourself workflow solutions without vendor components, leveraging a general technical stack that is used in day-to-day software development. Our no-code workflow platform creates a reusable pattern to solve a wide variety of problems at DoorDash. Now we are leveraging the platform to solve customer problems across multiple channels – the web, iOS, Android, Chat, and IVR phone automation. The technology can serve the needs of a variety of audiences, including operators, customer experience team members, consumers, Dashers, and merchants.
Thanks to JR Maitre, Abin Varghese, Bhaavyaa Kapoor, Pnina Eliyahu, Kevin Nguyen, Ashish Keshri, Han Yan, Dan Behar, Kumaril Dave and Han Huang for their contributions to this project.