Implementing REST APIs with Embedded Privacy

Here at DoorDash, we’re tackling the problem of real-time delivery by integrating all players involved into our logistics platform. By controlling the entire stack of the delivery process, we believe we’re positioned to provide the most consistent logistics. To provide integrated logistics for our restaurant food delivery service, we’ve had to build out three different user-facing products: one for customers (to order food), one for drivers (to deliver the food), and one for restaurants (to make the food).

What does that have to do with APIs? Well, having to build a back-end that communicates well with three different kinds of users simultaneously requires a disciplined approach to implementing our internal APIs.

I wanted to share a few lessons I’ve learned about how our engineering team at DoorDash has handled these challenges. We’re using Django as our web framework and Django REST framework to build out our internal APIs (mainly because its browsable web API makes testing so easy!). However, the principles laid out here generalize beyond our specific language and framework choices.

Not tryna read the whole thing? We got you covered: tl;dr

Our First Approach

Since the actions our APIs enable are so closely related to our data models, we initially created a single resource for each model (Django REST framework provides a ModelSerializer class that made it easy for us to do this):

customers/resources.py

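from rest_framework import serializers

# Order is our Django model for an order (its import is omitted here).

class OrderResource(serializers.ModelSerializer):
    class Meta:
        model = Order
        # Every field of the model is exposed to every app.
        fields = (
            'id',
            'subtotal',
            'commission',
            'tip_amount',
            'items',
            # ... etc. ...
        )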

This was the obvious way to connect our data to our internal APIs, but we quickly ran into a serious issue: each of our apps needed different information from the same model. For instance, we didn’t want our delivery drivers to see the tip_amount, because we want drivers to treat all customers equally. Likewise, we didn’t want to expose to customers the commission that we get from restaurants. At first, we accounted for these nuances by including all the information in the serialized order and leaving each app to decide what to show. But that made every app responsible for knowing which data it should be accessing, while the sensitive fields still went out to clients that had no business receiving them. This created a serious privacy concern for us, so we quickly abandoned this one-model-to-one-resource approach.
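To make the problem concrete, here is a minimal, framework-free sketch of the one-resource approach. It is not our production code: the plain `Order` dataclass, the `serialize_order` and `driver_app_display` helpers, and the sample amounts are all hypothetical stand-ins for the Django model and client code.

```python
from dataclasses import asdict, dataclass


@dataclass
class Order:
    """Hypothetical stand-in for the Order model (amounts in cents)."""
    id: int
    subtotal: int
    commission: int
    tip_amount: int


def serialize_order(order: Order) -> dict:
    """One resource for every app: every field goes out on the wire."""
    return asdict(order)


def driver_app_display(payload: dict) -> dict:
    """The driver app has to remember to hide the tip on its own."""
    return {key: value for key, value in payload.items() if key != "tip_amount"}


order = Order(id=1, subtotal=2500, commission=300, tip_amount=400)
payload = serialize_order(order)

# The driver app hides the tip in its UI, but the API response it received
# still contains it, so the privacy rule lives in the client, not the server.
shown_to_driver = driver_app_display(payload)
```

The filtering happens after the data has already left the back-end, which is exactly what we wanted to avoid.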

The Right Approach

We learned that in order for DoorDash to be able to continue to safely and effectively share data across different apps, we needed to enforce modularity of the API-accessible data between applications. Thus, for each model, we created a separate resource for each relevant application:

customers/resources.py

from rest_framework import serializers

class CustomerOrderResource(serializers.ModelSerializer):
    class Meta:
        model = Order
        # Customers see the tip but never the commission.
        fields = (
            'id',
            'subtotal',
            'tip_amount',
            # ...
        )

drivers/resources.py

from rest_framework import serializers

class DriverOrderResource(serializers.ModelSerializer):
    class Meta:
        model = Order
        # No tip_amount here: drivers never receive it.
        fields = (
            'id',
            'subtotal',
            # ...
        )

restaurants/resources.py

from rest_framework import serializers

class RestaurantOrderResource(serializers.ModelSerializer):
    class Meta:
        model = Order
        # Restaurants see the commission they pay.
        fields = (
            'id',
            'subtotal',
            'commission',
            # ...
        )

It seemed a bit cumbersome at first to have three separate resource classes for the same model, but it paid off big time. Prefixing each resource with the application it’s meant to be used with (reinforced by Django’s inherent application directory structure) made it easy for our engineering team to relate each resource to the correct app.
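Here is a rough, framework-free sketch of what the three resources above buy us. The `Order` dataclass, the `OrderResource` base class with its `serialize` method, and the sample amounts are hypothetical; the field lists cover only the fields named in this post, whereas the real ones are longer.

```python
from dataclasses import dataclass


@dataclass
class Order:
    """Hypothetical stand-in for the Order model (amounts in cents)."""
    id: int
    subtotal: int
    commission: int
    tip_amount: int


class OrderResource:
    """Base resource: only the fields listed in `fields` are serialized."""
    fields = ()

    @classmethod
    def serialize(cls, order):
        return {name: getattr(order, name) for name in cls.fields}


class CustomerOrderResource(OrderResource):
    fields = ("id", "subtotal", "tip_amount")


class DriverOrderResource(OrderResource):
    fields = ("id", "subtotal")


class RestaurantOrderResource(OrderResource):
    fields = ("id", "subtotal", "commission")


order = Order(id=1, subtotal=2500, commission=300, tip_amount=400)
driver_view = DriverOrderResource.serialize(order)
customer_view = CustomerOrderResource.serialize(order)
restaurant_view = RestaurantOrderResource.serialize(order)
```

The same order now produces three different payloads, and a field that is not listed for an app never leaves the back-end for that app.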

We also namespaced our API URLs to reinforce modularity between applications even further. For instance, accessing an order from the customer-facing API would use api.doordash.com/customer/order/<id>/ whereas the driver-facing API would use api.doordash.com/driver/order/<id>/.
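The idea behind the namespaced URLs can be sketched without Django at all. In the toy dispatcher below, the first path segment picks which field list applies. Everything in it is an assumption made for illustration: the `FIELDS_BY_NAMESPACE` table, the in-memory `ORDERS` store, and the `get_order` function are hypothetical, and a real Django project would express this through its URL configuration instead.

```python
import re

# Hypothetical field lists, limited to the fields named in this post.
FIELDS_BY_NAMESPACE = {
    "customer": ("id", "subtotal", "tip_amount"),
    "driver": ("id", "subtotal"),
}

# Hypothetical in-memory stand-in for the orders table (amounts in cents).
ORDERS = {42: {"id": 42, "subtotal": 2500, "commission": 300, "tip_amount": 400}}

ORDER_URL = re.compile(r"^/(?P<app>[a-z]+)/order/(?P<order_id>\d+)/$")


def get_order(path):
    """Resolve /<app>/order/<id>/ and serialize with that app's fields only."""
    match = ORDER_URL.match(path)
    if match is None or match["app"] not in FIELDS_BY_NAMESPACE:
        raise LookupError(f"no route for {path}")
    order = ORDERS[int(match["order_id"])]
    return {name: order[name] for name in FIELDS_BY_NAMESPACE[match["app"]]}


customer_response = get_order("/customer/order/42/")
driver_response = get_order("/driver/order/42/")
```

Because the namespace is part of the URL, a glance at a request tells you which app it belongs to and therefore which fields it can return.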

With the separate API resources and the namespaced URLs, each app can access only the data it’s supposed to see. Plus, by embedding the data-access policies within the resource layer, privacy is automatically enforced by the API infrastructure.
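Since the policy now lives in the field lists, it can also be checked mechanically. The sketch below is one way that might look, not something from our codebase: `SENSITIVE_FIELDS`, `EXPOSED_FIELDS`, and `find_leaks` are hypothetical names, and the two rules encoded are the ones mentioned earlier (no commission for customers, no tip_amount for drivers).

```python
# Fields each app must never receive, per the rules described in this post.
SENSITIVE_FIELDS = {
    "customer": {"commission"},
    "driver": {"tip_amount"},
}

# Hypothetical copy of each resource's field list (only fields named here).
EXPOSED_FIELDS = {
    "customer": ("id", "subtotal", "tip_amount"),
    "driver": ("id", "subtotal"),
    "restaurant": ("id", "subtotal", "commission"),
}


def find_leaks(exposed, sensitive):
    """Return {app: [fields]} for every sensitive field an app exposes."""
    leaks = {}
    for app, banned in sensitive.items():
        overlap = sorted(set(exposed.get(app, ())) & banned)
        if overlap:
            leaks[app] = overlap
    return leaks


current_leaks = find_leaks(EXPOSED_FIELDS, SENSITIVE_FIELDS)

# Simulate someone accidentally adding the tip to the driver resource.
broken = dict(EXPOSED_FIELDS, driver=("id", "subtotal", "tip_amount"))
broken_leaks = find_leaks(broken, SENSITIVE_FIELDS)
```

A check like this could run in a test suite so that an accidental field addition is caught before it ships.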

We could have used fine-grained permissions logic to filter out the information that we didn’t want to expose, but it would have created unnecessary complexity for our entire back-end. We’ve found that treating each application as its own module made our lives as developers a lot easier.

tl;dr

  • Don’t tie your API representation directly to your data models
  • Different applications have different information needs
  • Use a layer of abstraction (API resources) to hide your internal data representation from the users of your API
  • Use multiple API resources for the same model (or database table if you’re not using an ORM) to help enforce modularity between applications
  • Namespaced URLs make the modularity between applications explicit
  • Making data-access policy choices at the API resource layer allows privacy to be automatically enforced by the API infrastructure
  • Similarly, enforcing modularity eliminates the need for complex permissions logic
  • Treat your internal APIs as if they were external-facing — it will force you to stick to good RESTful design principles

We by no means know all the answers when it comes to RESTful API design and implementation, so please contact us if you have any feedback on this blog post.

Interested in helping build out RESTful APIs for the world’s first on-demand delivery company? We’re hiring!