Online companies need flexible platforms to quickly test different product features and experiences. However, app-based development and strict API responses make iterative development difficult, because each change requires a new release, which is slow, costly, and not immediately available to consumers.
At DoorDash, we wanted to improve customer experiences with iterative feature development but ran into similar problems: we could not quickly and easily test new experiences. Our homepage was statically stitched together by web and mobile clients, which called multiple backend APIs and each applied separate business logic to handle the backend responses. This setup slowed the development of new iterations and made the website and apps’ overall presentation inconsistent. In addition, we found it challenging to rank all the content for the homepage in an efficient and scalable manner within our Python-based monolith.
To address these problems, we rebuilt our homepage inside a new, scalable microservice called the feed service, which uses Display Modules: generic content blocks that power the layout and contents of the homepage. Using this backend-driven content to dynamically render our homepage provided greater flexibility in testing various hypotheses, understanding customer behavior, and quickly iterating based on experiment results.
The problem with our static legacy UI solution
At DoorDash we needed a more flexible solution to display our content which could also reliably handle our growing website traffic. Let’s break down our existing system and examine how it could not satisfy our flexibility and reliability needs.
Client-side logic requires an app release with every change
With traditional app-based development, the client app receives a backend server API response, performs any additional logic, and renders the page for the user. This was how DoorDash’s homepage worked: the client apps would call multiple backend APIs and stitch each response together while also conducting any necessary business logic.
Since we had three clients, iOS, Android, and web, each implemented its own application logic and delivered its own unique experience, leading to inconsistent user experiences across all three. For example, suppose the backend returns a field called DeliveryFee with a value of 0. An iOS client might interpret this value and present “Free Delivery” while Android might display “$0 Delivery”, which are two different user experiences. This inconsistency was not ideal since it could cause larger bugs and prevented us from having a single seamless experience across multiple clients.
In addition, merely changing “Free Delivery” to “$0 Delivery” and vice versa on clients requires an app store release, which can take up to two weeks. This makes the iterative process extremely slow, since we cannot release every day, and makes it difficult to A/B test different experiences quickly.
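To make the inconsistency concrete, here is a minimal sketch of how two clients could diverge on the same field, and how a single backend-side formatter removes the divergence. The function names are hypothetical, and treating the fee as an integer number of cents is an assumption; the original clients are not shown in this post.

```python
def format_fee(delivery_fee: int) -> str:
    """Format a fee as dollars. The unit (cents) is an assumption."""
    dollars, cents = divmod(delivery_fee, 100)
    return f"${dollars}" if cents == 0 else f"${dollars}.{cents:02d}"


def ios_label(delivery_fee: int) -> str:
    # Hypothetical iOS logic: special-cases a zero fee.
    if delivery_fee == 0:
        return "Free Delivery"
    return f"{format_fee(delivery_fee)} Delivery"


def android_label(delivery_fee: int) -> str:
    # Hypothetical Android logic: no special case for zero.
    return f"{format_fee(delivery_fee)} Delivery"


def backend_label(delivery_fee: int) -> str:
    # Backend-driven alternative: the wording is decided once on the server,
    # and every client renders the returned string verbatim.
    if delivery_fee == 0:
        return "Free Delivery"
    return f"{format_fee(delivery_fee)} Delivery"
```

With a fee of 0, `ios_label` and `android_label` disagree, while a label produced by `backend_label` reads the same on every client and can be changed without an app release.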
Ranking content is difficult with multiple static API response shapes
Another downside of having clients stitch together multiple API response shapes was the difficulty of rearranging the content on our pages. On our homepage, we had a separate API for each content style.
For example, we had one API that returned carousels and another that returned a list of restaurants. The client would always display the carousels above the list, which prevented us from flexibly rearranging the different content types or even intermixing them to see which arrangement performed best. We needed a way to organize the layout from the backend so that we could quickly test which layout provided the best experience for our customers.
Handling traffic is not scalable within a shared monolithic service
The majority of companies that start out with a monolithic backend reach a point where one monolith cannot support the growing load, so different functions start to get extracted out into individual services. DoorDash’s homepage backend response was being served by our monolith, and eventually our growth exceeded its capacity. Merely adding compute resources became less efficient and we needed to redesign the homepage inside a scalable, fault tolerant, and isolated microservice. This was not an easy task as there were still many dependencies inside the monolith, which needed a coordinated strategy to get right.
Building flexible backend-driven content at DoorDash
To address the above problems, we set out to power the homepage content entirely from the backend under a consolidated endpoint. We can think of the homepage API response as a list of content holders, which the backend can fill as needed and the clients can render according to the type of content holder. We named these content holders Display Modules. These Display Modules returned by a backend API are composed on top of each other to display the homepage, which is powered by a new microservice.
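As a rough illustration of "render according to the type of content holder", the sketch below shows a client walking the response in order and dispatching on each holder's type. The dictionary shape, the renderer functions, and the choice to skip unknown types are assumptions made for the example, not the actual client code.

```python
from typing import Callable, Dict, List


def render_carousel(module: dict) -> str:
    # Stand-in for a horizontally scrolling carousel view.
    return "[carousel] " + " | ".join(module["content"])


def render_list(module: dict) -> str:
    # Stand-in for a vertical list view.
    return "[list]\n" + "\n".join(f"- {name}" for name in module["content"])


# Hypothetical mapping from content holder type to renderer.
RENDERERS: Dict[str, Callable[[dict], str]] = {
    "store_carousel": render_carousel,
    "store_list": render_list,
}


def render_homepage(response: List[dict]) -> List[str]:
    """Render content holders in the order the backend returned them."""
    rendered = []
    for module in response:
        renderer = RENDERERS.get(module["type"])
        if renderer is None:
            # Assumption: a client ignores types it does not know yet.
            continue
        rendered.append(renderer(module))
    return rendered
```

Because the client only follows the order and types it receives, the backend can reorder or intermix content without a client change.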
What is a Display Module?
In short, a Display Module is a generic content holder that supports multiple presentation styles such as lists or carousels. A Display Module also holds critical navigational information in the form of a cursor, which contains important context that we pass from page to page.
display_module:
    id
    version
    display_module_type = store_carousel || item_carousel || store_list
    sort_order
    [content]
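The shape above could be modeled roughly as follows. This is a sketch only: the field types are assumptions, and the `cursor` field is added because the text describes it even though the schema line does not list it.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class DisplayModuleType(Enum):
    STORE_CAROUSEL = "store_carousel"
    ITEM_CAROUSEL = "item_carousel"
    STORE_LIST = "store_list"


@dataclass
class DisplayModule:
    id: str                      # assumed to be a string identifier
    version: str                 # assumed type
    display_module_type: DisplayModuleType
    sort_order: int              # position of the module on the page
    content: List[str] = field(default_factory=list)  # simplified to names
    cursor: str = ""             # opaque navigation context (assumed string)


def compose_homepage(modules: List[DisplayModule]) -> List[DisplayModule]:
    """Order modules top to bottom by sort_order."""
    return sorted(modules, key=lambda module: module.sort_order)
```

Changing a module's `sort_order` on the backend is then enough to move it up or down the page.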
After conceptualizing Display Modules, we extracted out the homepage by implementing the different types of content as specific Display Modules and powered the API response using the feed service microservice.
Extracting our homepage
The extraction of the homepage for the consumer app was a large undertaking because there was a lot of content to organize and display. Most of the underlying data is provided by another microservice, called the Search Service, but all of the decoration, the visual elements, and the response building were located in our monolith. We used this opportunity to approach the extraction from the ground up and were able to achieve the migration with these four steps:
- Enumerating the dependencies: Many parts of the homepage were still dependent on the monolith, and it became important to list these dependencies and determine a plan for each.
- Creating Display Modules: We took the existing content shapes and fit them into our Display Module concept, creating API request/response contracts.
- Implementing the feed service: We built this powerful, scalable, and reliable backend service to withstand peak loads and create feed-style response shapes using Display Modules.
- Integrating clients: Once the backend was complete, clients integrated with the new homepage API to render homepage content and layout driven by the backend.
Enumerating the dependencies
It is important to understand and list all the dependencies required to decorate the final response. Figure 1, below, shows some of the different functions required to decorate the homepage. There were no services powering these components at the time so we needed to come up with various stop-gap solutions to unblock our extraction effort.
We listed all of our dependencies and considered short-term versus long-term plans; it was important to scale out the homepage without being blocked by these dependencies. We also listed concrete plans for a long-term migration. As a short-term solution, we built APIs for the above dependencies within the monolith with loose coupling so that we could have a fallback whenever the monolith had any issues.
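One way to read "loose coupling with a fallback" is that a failed call to a monolith-backed dependency degrades the response instead of failing it. The sketch below illustrates that reading only; the `badge` decoration, the default value, and the function names are hypothetical.

```python
from typing import Callable, Dict

# Hypothetical neutral decoration used when the dependency is unavailable.
DEFAULT_DECORATION: Dict[str, str] = {"badge": ""}


def decorate_store(
    store: dict,
    fetch_decoration: Callable[[str], Dict[str, str]],
) -> dict:
    """Decorate a store, falling back to defaults if the dependency fails."""
    try:
        decoration = fetch_decoration(store["id"])
    except Exception:
        # The dependency (for example, an API still living in the monolith)
        # had an issue: serve the store undecorated rather than fail the page.
        decoration = DEFAULT_DECORATION
    return {**store, **decoration}
```

The homepage still renders when the dependency is down, just with less decoration.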
Creating Display Modules
When we were implementing this solution, there were two presentation styles used on the homepage that required adapting to Display Modules: store list and store carousel. Below is the Display Module representation of the two styles:
As we can see, both Display Modules were almost identical in terms of their shape, with each containing a list of store content to display. The main differences between the two Display Modules were type, which dictated how to display the content, and sort_order, which dictated where to display the content. The cursor was used to hold context for subsequent requests or different page requests.
As an example, the cursor for a store_carousel would hold context about the stores presented that the server could pass in a subsequent request to expand the carousel into a list. The cursor was obfuscated to the clients as an encoded string, as seen in Figure 2, below, and serialized/deserialized entirely by the backend.
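A cursor of this kind can be sketched as a small context object that the backend serializes into an encoded string and later decodes when the client sends it back. The JSON-plus-base64 encoding and the context keys below are assumptions for illustration; the post does not state the actual encoding.

```python
import base64
import json


def encode_cursor(context: dict) -> str:
    """Serialize navigation context into a string clients treat as opaque."""
    raw = json.dumps(context, sort_keys=True, separators=(",", ":"))
    return base64.urlsafe_b64encode(raw.encode("utf-8")).decode("ascii")


def decode_cursor(cursor: str) -> dict:
    """Recover the context on the backend from a cursor sent by a client."""
    raw = base64.urlsafe_b64decode(cursor.encode("ascii"))
    return json.loads(raw.decode("utf-8"))
```

For instance, a carousel's cursor might carry hypothetical keys such as `{"module_id": "carousel-1", "store_ids": ["s1", "s2"]}`, which the backend decodes to expand that carousel into a full list. Base64 only hides the structure from casual inspection; it is not a security measure.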
Before getting our hands dirty with writing code and building a service to power Display Modules, we had to define an API request and response shape that the clients and backend could agree to. We used the newly defined Display Modules for the homepage and translated them into protobufs. At DoorDash, we had standardized on gRPC as the main communication protocol for our microservices so defining protobufs for the homepage content provided us alignment across the stack.
Implementing the feed service
To power the homepage response, we needed a backend service that would be responsible for fetching the appropriate content and decorating it for display purposes.
As shown in Figure 3, the feed service serves as the entry point for a homepage request. It orchestrates the response by fetching data from different content providers, decorating this data via calls to various decorators, and building Display Modules into a feed-style response before returning it to the clients. Using a dedicated backend service to curate the homepage provides us with a single-source consolidation of business logic and a consistent user experience across all clients.
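The fetch, decorate, and build flow can be sketched as below. This is a simplified, synchronous illustration under stated assumptions: the provider and decorator signatures, the dictionary shapes, and the example decorator are all hypothetical, and the real service communicates over gRPC rather than through in-process function calls.

```python
from typing import Callable, List, Tuple

ContentProvider = Callable[[str], List[dict]]  # consumer id -> raw stores
Decorator = Callable[[dict], dict]             # raw store -> decorated store


def add_delivery_label(store: dict) -> dict:
    # Example decorator: the display string is decided on the backend.
    label = "Free Delivery" if store["delivery_fee"] == 0 else "Delivery fee applies"
    return {**store, "delivery_label": label}


def build_feed(
    consumer_id: str,
    providers: List[Tuple[str, ContentProvider]],
    decorators: List[Decorator],
) -> List[dict]:
    """Fetch content, decorate it, and wrap it in ordered display modules."""
    modules = []
    for sort_order, (module_type, provider) in enumerate(providers):
        stores = provider(consumer_id)
        for decorate in decorators:
            stores = [decorate(store) for store in stores]
        modules.append(
            {"type": module_type, "sort_order": sort_order, "content": stores}
        )
    return modules
```

Here the order of `providers` determines `sort_order`, so reordering that list on the backend changes the layout every client renders.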
How we integrated with the clients
After building the backend feed service, clients needed to be integrated to render the homepage response. We use a backend-for-frontend (BFF) for all client-to-server communication. The BFF is responsible for handling user authentication and acting as the front line for client routes. Our web application is written in Node.js, and its integration with the feed service made it the first Node service at DoorDash to communicate with a gRPC service.
In general this was a very successful project, as we were able to build a solution that was both flexible and could handle our scale. Specifically, the new system allowed us to:
- Experiment with different types of carousels
  - We enabled experimentation with algorithm-based carousels such as “Your Favorites”, “Now on DoorDash”, and “Fastest near you”.
  - A/B testing of manual and programmatic carousels allowed us to efficiently provide the best customer experience and realize incremental conversion gains.
- Improve the customer experience
  - With backend-driven content, we achieved single-source consolidation so that each client does not need to apply any business logic after the API response is returned.
  - Customers on mobile and web saw a consistent experience across their devices.
- Deliver personalized layouts with Display Modules
  - With our backend service and Display Modules, we can quickly change and experiment with the composition of the feed, giving us more power to understand which layout works best for our customers.
  - We can also use cursor navigation to carry context from screen to screen.
In addition to the new capabilities above, our extraction of the feed service helped us improve:
- Latency of the homepage, our most trafficked page, by roughly 20%.
- Reliability and scalability of the homepage, by removing its dependency on the monolith.
To sum up, we described a need to quickly experiment on different features and examined the problems within our legacy system that prevented us from doing so. By conceptualizing flexible Display Modules and building the feed service microservice to power the content from the backend, DoorDash was able to iterate on experiment results more quickly, leading to a better and more consistent customer experience.
Although we achieved a great deal of flexibility with our Display Module paradigm, we plan to improve our frontend system further. There is scope for even more flexibility and opportunity for quicker iteration, but we have taken the first step and paved the way for a greater vision. The work described above lays the foundation for heavier personalization and strong consistency across a customer’s order flow. Next, we can use the feed service and Display Modules concept to power other pages such as the store page and the checkout page!
Thank you to Josh Zhu, Ashwin Kachhara, Satish Saley, Phil Kwon, Steven Liu, Nico Cserepy, Becca Stanger, and Eric Gu for their involvement and contribution to the project.