The case against async relationships in Ember Data

by Ryan Toronto

November 17, 2017

If you've spent any time around Sam and me, you've probably heard us talk about our frustrations with Ember Data's async relationships. In this post I'll outline why we think using { async: true } is so problematic. I'll also discuss the patterns we use to fetch data for related resources in our own applications.

// app/models/post.js
import DS from 'ember-data';

export default DS.Model.extend({

  // By default, all relationships are async
  comments: DS.hasMany('comment')

});

First, "async relationships" is a loaded term.

Many people hear async relationships and think of the UI pattern of lazily loading or displaying data. For example, if I have a post model with many comments, I may load the post first and then later load the comments when the user scrolls down to the bottom of the page. This is an important pattern for any UI developer to know. It keeps API payloads small, and loads data only when users need it.

In Ember Data, however, async relationships mean something very specific. When defining a relationship as async, you're telling Ember Data that whenever this relationship is accessed, it should always be accessed asynchronously. While asynchronous access was intended to help developers who were employing the UI pattern of lazy loading, the API changes end up affecting most of your app – and this is where the problems arise.

The primary API difference between a sync and an async relationship is what happens when accessing that relationship.

let comments = post.get('comments');

In a world of sync relationships, comments in the snippet above resolves to whatever Ember Data knows about in its local store. It could be an empty array, or it could be an array of comment models.

In a world of async relationships, comments resolves to a promise proxy object, which is an object that represents a remote model or collection. The promise part means that these objects are thenable - they have a .then method, and resolve asynchronously at some future point in time. The proxy part means that, under some circumstances, they behave the same way as the underlying model or collection they represent. In addition to changing the return value of the relationship, async relationships may also trigger network requests.
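
To make the difference concrete, here is a minimal plain-JavaScript sketch of the two kinds of return value. It does not use Ember Data: `fetchComments` and `makePromiseProxy` are hypothetical names invented for this illustration, and Ember Data's real promise proxies do considerably more than this.

```javascript
// Hypothetical stand-in for a network call; resolves on a later tick.
function fetchComments() {
  return new Promise(resolve => {
    setTimeout(() => resolve(['First!', 'Nice post']), 0);
  });
}

// Illustrative only: an object that is thenable (the "promise" part)
// and exposes the resolved value once it arrives (the "proxy" part).
function makePromiseProxy(promise) {
  let proxy = {
    content: null,
    isPending: true,
    get length() {
      return this.content ? this.content.length : 0;
    },
    then(onFulfilled, onRejected) {
      return promise.then(onFulfilled, onRejected);
    }
  };
  promise.then(value => {
    proxy.content = value;
    proxy.isPending = false;
  });
  return proxy;
}

// Sync-style access: a plain array, usable immediately.
let syncComments = ['First!', 'Nice post'];

// Async-style access: a proxy that stays empty until the promise resolves.
let asyncComments = makePromiseProxy(fetchComments());
```

Right after the call, `asyncComments.length` is 0 and `isPending` is true; the data only shows up once the promise has resolved. Code written against `syncComments` needs no such care.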

Our main gripe with this API is that it combines the concerns of local data access with remote data fetching into a single method call. This makes it harder for developers to write expressive, intention-revealing code, especially because data loading is such a non-trivial part of developing Ember applications.

What does this code do?

If you're working on an app that uses async relationships, and you come across this code

let comments = post.get('comments');

what do you think it's doing? What was the developer who wrote this code trying to accomplish?

The code could be making an AJAX request to fetch the post's comments for the first time, returning a promise. It could also resolve immediately with an array of the post's comments that have already been loaded, and then make a background request to update that array. Or, perhaps the developer who wrote this line was just wanting to get the post's comments that were already loaded a few seconds ago by the router, not intending to trigger a background refresh at all.

Whatever the situation, the code is the same. It's not clear to the reader what the original developer's intentions were.

Ember's get method is used all over the place in our Ember apps, and the vast majority of the time it's used to retrieve local properties off of objects.

post.get('title').toUpperCase();

With async relationships, calls to .get can now introduce asynchrony into our applications in subtle ways. Asynchronous code is one of the things that makes JavaScript UI development so challenging, and async relationships will inevitably lead to surprises. Asynchrony should be dealt with explicitly and deliberately in our applications.

Here are some of the common traps people fall into when using or trying to work around async relationships:

Calls to get inadvertently kicking off numerous AJAX requests
Computed properties making AJAX requests, turning them into async computed properties with additional potential states
Forgetting to follow a get of an async relationship with a .then when accessing data that's already been loaded
Reaching for get('relationship.content') as an escape hatch to ignore the promise behavior of an async relationship
Triggering the "n+1" query bug by accessing relationships in a template loop, for example with {{#each author.posts as |post|}}...{{#each post.comments}}
FOUC due to a template being coded without the developer considering the states in which an async relationship like {{post.comments.length}} can exist
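
The n+1 trap in particular is easy to see in a small simulation. The sketch below uses a hypothetical in-memory backend and a made-up `request` function that only counts calls; the paths are illustrative, and no real network or Ember Data code is involved.

```javascript
// Hypothetical in-memory backend. Each call to request() stands in for
// one AJAX round trip.
const backend = {
  'authors/1/posts': [{ id: 1 }, { id: 2 }, { id: 3 }],
  'authors/1/posts?include=comments': [{ id: 1 }, { id: 2 }, { id: 3 }],
  'posts/1/comments': ['a', 'b'],
  'posts/2/comments': ['c'],
  'posts/3/comments': []
};

let requestCount = 0;
function request(path) {
  requestCount++;
  return backend[path];
}

// Lazy style: every relationship access fetches on demand, the way a
// nested template loop over async relationships would.
function renderLazily() {
  requestCount = 0;
  let posts = request('authors/1/posts');
  posts.forEach(post => request(`posts/${post.id}/comments`));
  return requestCount;
}

// Explicit style: ask for everything up front in a single request.
function renderAfterExplicitLoad() {
  requestCount = 0;
  request('authors/1/posts?include=comments');
  return requestCount;
}

const lazyRequests = renderLazily();                // one for posts, plus one per post
const explicitRequests = renderAfterExplicitLoad(); // a single request
```

With three posts the lazy version makes four requests and the explicit version makes one; the gap grows with every post the author has.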

We can avoid all of these problems by using sync relationships.

Loading related data with sync relationships

If we use sync relationships everywhere, get will only ever return what's in Ember Data's local store.

// app/models/post.js
export default DS.Model.extend({
  comments: DS.hasMany('comment', { async: false })
});

// elsewhere
post.get('comments'); // local data access, always

Having a guarantee that writing post.get('comments') in JavaScript or {{post.comments.length}} in Handlebars won't trigger any side effects is a big benefit for developers – synchronous templates and computed properties are straightforward and predictable. And if they do forget to load something, instead of having the template's rendering pass try to do it for them, the developer would just see missing data and go load it explicitly themselves somewhere, perhaps in a route or a component hook using an Ember Concurrency task. Note that this mirrors the pattern we're all familiar with when loading a route's primary data: if the {{model}} property is unexpectedly empty, the developer simply goes back to the model hook and fixes the data-loading issue there.

Now, if post.get('comments') always returns what's in the store, but a post's comments haven't been loaded yet, how does the developer load them the first time around? Fortunately, Ember Data has plenty of existing APIs to help us out.

We can write a new method whose sole purpose is to load related data.

// app/models/post.js

export default DS.Model.extend({
  comments: DS.hasMany('comment', { async: false }),

  loadComments() {
    return this.store.query('comment', {
      filter: { post_id: this.get('id') }
    });
  }
});

This pattern keeps data loading explicit and distinct from local data access, and gives us full control over how the related data is loaded.

The loadComments method above assumes our backend responds to URLs like /comments?filter[post_id]=1. But what if our backend uses the links feature of JSON:API? We can use the references API and let Ember Data do the hard work for us:

loadComments() {
  return this.hasMany('comments').load();
}

Or, we can reload the post and use JSON:API's include feature:

loadComments() {
  return this.store.findRecord('post', this.get('id'), {
    include: 'comments',
    reload: true
  });
}

In all of these cases, loadComments returns a promise that developers can use to control how their UI renders while data is loading. And because post.get('comments') is a bound array, it will automatically update when loadComments finishes running. All template code and computed properties that rely on this relationship will be kept in sync without the developer ever having to worry about things like promise proxies or network requests.
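
As a rough plain-JavaScript sketch of that flow (not Ember code): the `post` object, the `ui` state and `showComments` below are hypothetical, the `setTimeout` stands in for network latency, and pushing into the same array imitates the way a bound relationship array updates in place.

```javascript
// Hypothetical post object. `comments` stands in for the bound array a
// sync relationship returns; loadComments() stands in for any of the
// loading variants described in the article.
const post = {
  comments: [],
  loadComments() {
    return new Promise(resolve => {
      setTimeout(() => {
        // The same array instance is updated, as a bound array would be.
        this.comments.push('First!', 'Nice post');
        resolve(this.comments);
      }, 0);
    });
  }
};

// UI state that the developer controls explicitly.
const ui = { isLoading: false };

function showComments() {
  ui.isLoading = true;
  return post.loadComments().finally(() => {
    ui.isLoading = false;
  });
}

const commentsBeforeLoad = post.comments; // local access: empty for now
const loading = showComments();           // explicit, visible data loading
```

Anything holding a reference to `post.comments` before the load sees the new items afterwards, and the `isLoading` flag is driven by the promise, not by the act of reading the relationship.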

Now, writing a loadRelationship function for every relationship in your model layer might seem a little verbose, and we agree. That's why we added a load() method to our Ember Data Storefront addon. This gives us a single conventional API for asynchronously loading relationships.

With Storefront, we can load a model's relationships like this:

post.load('comments');

We can also load related resources using JSON:API's dot-separated paths.

post.load('comments.author');

Explicitly loading data like this keeps our templates synchronous and easy to understand. For example, the following template

{{#each post.comments as |comment|}}
  {{comment.author.name}} says {{comment.text}}
{{/each}}

will never trigger any surprise AJAX requests, either from the initial render of {{#each post.comments}} or from the code in the loop that accesses {{comment.author}}.

Computed properties also remain synchronous and easy to work with under this pattern. Let's say we had a property that found the person who left the most comments on a post:

// app/models/post.js

export default DS.Model.extend({

  topCommenter: Ember.computed('comments.@each.author', function() {
    // Build a data structure with each author's comment count
    let countsForAuthors = this.get('comments')
      .reduce((counts, comment) => {
        let author = comment.get('author');
        let key = Ember.guidFor(author);
        counts[key] = counts[key] || { author, count: 0 };
        counts[key].count++;

        return counts;
      }, {});

    // Find the entry with the highest count, then return its author
    // (null when the post has no comments)
    return Object.keys(countsForAuthors)
      .map(key => countsForAuthors[key])
      .reduce((a, b) => a.count > b.count ? a : b, { author: null, count: 0 })
      .author;
  })

});

There are a couple of data transforms happening here, and if we had been using { async: true } relationships we would've needed to account for calls to this.get('comments') and comment.get('author') returning promise proxies. With sync relationships, this computed property works on whatever data is already in the store.
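
To show the extra work that promise-valued relationships introduce, here is a plain-JavaScript sketch of the same calculation in both styles. It is not Ember code: the function names and sample data are hypothetical, and plain promises stand in for promise proxies.

```javascript
// Sync data: authors are plain objects, so this is an ordinary reduction.
function topCommenter(comments) {
  let counts = new Map();
  comments.forEach(({ author }) => {
    counts.set(author, (counts.get(author) || 0) + 1);
  });

  let top = null;
  counts.forEach((count, author) => {
    if (top === null || count > top.count) {
      top = { author, count };
    }
  });
  return top && top.author;
}

// Async data: every author has to be resolved before counting can
// start, and the result is itself a promise.
function topCommenterAsync(comments) {
  return Promise.all(comments.map(comment => comment.author))
    .then(authors => topCommenter(authors.map(author => ({ author }))));
}

// Hypothetical sample data.
const ann = { name: 'Ann' };
const bob = { name: 'Bob' };
const syncComments = [{ author: ann }, { author: bob }, { author: ann }];
const asyncComments = syncComments.map(comment => ({
  author: Promise.resolve(comment.author)
}));

const syncResult = topCommenter(syncComments);        // available immediately
const asyncResult = topCommenterAsync(asyncComments); // a promise
```

The sync version can be used directly in a computed property or template. The async version forces every caller to handle a pending state before it can read the answer.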

Storefront is still a work in progress. There are some more goodies in there like run-time development assertions, so be sure to check back on it regularly as we continue to develop it.

All or nothing?

We've found that sync relationships work best when you're able to use them for every relationship in your project (Storefront actually has an option to enforce this). Mixing sync and async can surprise developers in exactly the same ways we were trying to avoid by using only sync relationships in the first place.

We have successfully refactored existing projects from async to sync without much trouble. In fact, experienced developers who work in Ember apps tend to code their data loading and data access separately anyway, even if they're using async relationships. For example, instead of letting a rendering pass fetch their data for them, they might use includes in the route's model hook. By the time their templates render, their data loading has already been done up-front. If most data access is already done explicitly, refactoring to sync relationships is quite straightforward - and is really just enforcing the pattern of explicit data-loading that's already being used throughout the app.

The fact that most experienced developers avoid the side effects of async relationships goes to show that there's probably a better API we can come up with to encode these patterns. Async relationships end up being a footgun for beginners: they make it easy to write n + 1 bugs, and they make data loading seem like a smaller piece of Ember app development than it actually is.

We don't push back against defaults lightly. Our understanding of best practices is always progressing, and our goal here is to share how our thinking has evolved over the past two years on this subject. Whenever we see two ways of doing the same thing, we see an opportunity for the community to come together, coalesce and simplify.

By using { async: false } and Storefront's load() method, we're able to keep all of our data access synchronous, but still load relationships asynchronously as the data is needed. Having two separate APIs for data access and data loading lets us write UI code that's easier to understand. Storefront is our place to experiment with what Ember Data could look like in the future, and we welcome any feedback you might have on it.
