MODX getResources: Tips & Optimizations

Apr 19, 2013

With Great Power...

getResources is the Swiss Army Knife of PHP Snippets. Arguably it's the most flexible and powerful component available in the MODX CMS, and maybe (if you can even make such a comparison) in any CMS platform out there. Here are just a few of the things you can do with getResources:

Basically, anytime you want to list, sort, filter, or otherwise manipulate data stored in MODX Resources (the modx_site_content database table), getResources is your go-to Snippet. It's made even more powerful by the venerable MODX templating system: YOU control the output of getResources, whether it's your own HTML, XML, JavaScript, or any other front-end content. You can even join in the MODX Template Variable tables, which means that no matter how customized your content model is, getResources can list, sort, filter and manipulate it.

All this, and you don't have to write a single line of PHP!

...Comes Great Responsibility

With all this power and flexibility at our fingertips, it's easy to get ahead of ourselves. MODX, and by extension getResources, is “infinitely configurable”, and somewhere in all that infinity are a LOT of less-than-optimal configurations.

In this article, we're going to model a hypothetical site's content using Template Variables (TVs) and we'll use getResources to display that content in a variety of ways. We'll start with the most obvious way to do it, then add layers of optimization to make our site more robust and scalable. Finally, we'll look at the limitations of getResources and discuss the conditions under which using custom queries (and maybe custom database tables) becomes far more appropriate.

Your Content, Your Way (One of my favorite MODX slogans :)

Let's say we're building a site for a local arts and culture publication. They cover music, arts, food, and music, as well as current news.
[Image: Site Tree]
Each of those main sections includes reviews and listings, except for the news section which is just news articles. Their articles and reviews have essentially the same types of data:

  • Section (one of the five main topics covered)
  • Prominence (is it a cover story?)
  • Headline
  • Byline (author info)
  • Dateline (date of the event covered)
  • Publication Date
  • Kicker (a summary)
  • Lede (the 1st paragraph)
  • Canonical URL (or "alias" or "permalink" - no URL parameters)
  • Image
  • Image Caption
  • Image Credit
  • Article Body
  • Tags
  • Related News
  • Related Media (video clips, audio, etc.)

WOW.

This is a lot of data - and compared to most real-world situations, it's actually kind of stripped-down already. Thank goodness we have MODX on our side, because it will handle this easily. Will this model scale to epic proportions when the publication becomes a global media powerhouse? Doubtful. But we'll address that later in this tutorial... For now, let's see what MODX can do for us out-of-the-box:

  • Section » By virtue of being a child Resource of one of the section containers
  • Prominence » Checkbox TV (for those who don't know, TVs, or Template Variables, are custom fields)
  • Headline » Default pagetitle field
  • Byline » Default createdby attribute
  • Dateline » Datepicker TV
  • Publication Date » Default publishedon attribute
  • Kicker » Default description field
  • Lede » Default introtext field
  • Canonical URL » Default alias field
  • Image » Image TV
  • Image Caption » Text TV
  • Image Credit » Text TV
  • Article Body » Default content field
  • Tags » Autotag TV
  • Related News » Listbox or Resource List TV
  • Related Media » Textarea TV

MODX Template Variables

As you can see, every MODX Resource will have custom input fields to store the necessary metadata for each article. We can then syndicate that article and display the data for use in a limitless number of ways, because we have a "data model" of sorts. For example:

  • On the homepage, we'll have a slider that rotates through all the articles, from all the sections, that have the "Prominence" checkbox checked. They will display with a large Image, Headline, Kicker and a link to the full article.
  • In the sidebar on every page, we'll have a "widget" that displays the most recent article from each section, so long as it's NOT a cover story and NOT the article currently being viewed. A thumbnail Image, Headline, Kicker, and link will be displayed for each.
  • In each section, we'll have a listing of the 15 most recent articles in that section. They'll display with a thumbnail Image, Headline, Publication Date, Byline, Lede and a "Read More" link.
  • The site will have an RSS Feed, but not just any RSS Feed. Only music reviews that contain the word "band" will be listed.
  • Finally the site will publish an API, so all the content from every article will be converted to an object in a JSON file.

Believe it or not, you can do all of this with just the MODX core installation and the getResources snippet. I'm not going to detail every template used, all the markup, etc. That's a whole series of tutorials. What I will do is provide an example of a getResources call that could be used to produce each of the above, plus a second, more optimized example. I'll also go into some of the reasoning behind the optimizations so you can bring home the bacon to your own projects.

Homepage Slider

Unoptimized:

[[!getResources?
    &parents=`2,4,5,8,9,11,12,13,14`
    &limit=`15`
    &tvFilters=`Prominence==1`
    &includeTVs=`1`
    &processTVs=`1`
    &tpl=`homepage_slide_tpl`
]]

Break it down:

Let's start with a run-down of what the unoptimized snippet call does. This is based on bits of knowledge I've gleaned from Jason Coward, the Chief Architect of MODX and SQL/PHP guru. I myself do not understand the inner workings of these things, but here's what my comparatively feeble brain has been able to retain on the subject.

First off the snippet is called uncached, so on every single request of the homepage, MODX will process that query anew. On each query, it traverses the site_content table looking for the parent Resources, then checks if they have children, loops through those, then checks if the children have children, loops through those, etc. By default the depth to which getResources will check for children is 10 levels. Then it queries the TV tables and returns a set of Resources that is filtered on the Prominence TV having a value of "1". It then retrieves the values of ALL the TVs that the Resources in the result set have access to, processes all the values, and finally the output is rendered using the supplied template chunk.

Now let's look at the optimized version:

[[getResources?
    &parents=`2,4,5,8,9,11,12,13,14`
    &limit=`15`
    &depth=`0`
    &tvFilters=`Prominence==1`
    &includeTVs=`1`
    &includeTVList=`Prominence,Image`
    &tpl=`homepage_slide_tpl`
]]
  1. There's no reason to call this snippet uncached, because every time a new article is published, the MODX Resource cache will be cleared anyway. Imagine that 500 visitors per day request the homepage. Instead of running the query 500 times, it runs it once and caches the results. That's optimized by a factor of about 500:1. The official documentation is filled with examples of getResources being called uncached, but they do it to illustrate that you can; the vast majority of the time this is totally unnecessary.
  2. By adding the &depth parameter, getResources will not check all those articles to see if they have children. If there are 1000 articles on the site, that's 1000 entries in the database that don't have to be checked for children.
  3. By adding the &includeTVList parameter, we're excluding all the irrelevant TVs. We have 8 TVs just for this template, but here on the homepage we only need 2: Prominence and Image. If the result set contains 15 Resources, that's 15 x 6 = 90 TV values we don't have to retrieve and process.
  4. While we're at it, let's just get rid of that &processTVs parameter. We only used that to output the <img> tag for the Image TV - we can easily just wrap the TV value in an <img> tag ourselves.
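
To make point 4 concrete, here is a minimal sketch of what the homepage_slide_tpl chunk could look like with the Image TV wrapped by hand. The markup and class names are made up for illustration. It assumes the default getResources TV prefix ("tv.") and that the raw value stored in the Image TV is a path the browser can resolve as-is; if your media source relies on a base URL, you'd need to prepend it yourself.

```html
<!-- homepage_slide_tpl: illustrative markup only; class names are hypothetical -->
<div class="slide">
    <a href="[[~[[+id]]]]">
        <!-- Raw Image TV value wrapped by hand, so processTVs is not needed -->
        <img src="[[+tv.Image]]" alt="[[+pagetitle]]" />
        <h2>[[+pagetitle]]</h2>
        <!-- The Kicker is stored in the default description field -->
        <p>[[+description]]</p>
    </a>
</div>
```

The Headline and Kicker come straight from default Resource fields, so the only TV value the chunk actually outputs is Image.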

With four little modifications, we've greatly increased the performance of the homepage. But that's only one page. Watch what happens when we use getResources in a more pervasive Template...

Sidebar Widget

Unoptimized:

[[!getResources?    
    &parents=`4`    
    &limit=`1`    
    &tvFilters=`Prominence!=1`    
    &resources=`-[[*id]]`    
    &includeTVs=`1`    
    &includeTVList=`Prominence,Image`     
    &tpl=`sidebar_article_tpl`  
]]

Optimized:

[[getResources?    
    &parents=`4`    
    &limit=`1`    
    &tvFilters=`Prominence!=1`    
    &depth=`0`    
    &resources=`-[[*id]]`    
    &includeTVs=`1`    
    &includeTVList=`Prominence,Image`     
    &tpl=`sidebar_article_tpl`  
]]

Break it down:

In this case, one might easily think that the snippet must be called uncached, because the [[*id]] will be different depending on which page this snippet is being called on. Imagine though, that you have 1000 pages on your site. This sidebar widget has five getResources calls like this - one for each section. That's 5000 queries! If you call it uncached, the queries happen on every page request. Can you see how quickly this adds up? Luckily, MODX caches the output of the snippet on a per-resource basis. In other words, the [[*id]] will always be the relevant one. In this case we can safely call getResources cached.
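
As a rough sketch, the sidebar chunk might stack the cached calls like this, one per section. Only parent ID 4 comes from the example above; the ID 5 in the second call is a placeholder for whichever container holds the next section on your site, and the remaining sections would follow the same pattern.

```html
<!-- Sidebar widget chunk: one cached getResources call per section -->
<!-- Parent ID 4 is from the example above; 5 is a hypothetical section container -->
[[getResources?
    &parents=`4`
    &limit=`1`
    &tvFilters=`Prominence!=1`
    &depth=`0`
    &resources=`-[[*id]]`
    &includeTVs=`1`
    &includeTVList=`Prominence,Image`
    &tpl=`sidebar_article_tpl`
]]
[[getResources?
    &parents=`5`
    &limit=`1`
    &tvFilters=`Prominence!=1`
    &depth=`0`
    &resources=`-[[*id]]`
    &includeTVs=`1`
    &includeTVList=`Prominence,Image`
    &tpl=`sidebar_article_tpl`
]]
```

Because none of the calls carry the "!" flag, their output is stored with each Resource's cached page, and [[*id]] is resolved for the page being cached.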

Until recently, I thought the &limit property made the query less intensive but this is only partially true. Even if you only want one Resource in the result set, it still needs the full data set - all the Resources - to filter and sort through. The subsequent processing is limited to the result set, so that's good. But if you have tens of thousands of Resources, you can see the &tvFilters part of the query is doing a LOT of work, and there's no real way to "optimize" that. In this case a custom query might be more appropriate.

Most Recent Articles

We've gone over most of the concepts in this one, but let's review the "optimized" version:

Optimized:

[[getResources?    
    &parents=`4`    
    &limit=`15`    
    &depth=`0`    
    &includeTVs=`1`    
    &includeTVList=`Image`     
    &tpl=`article_row_tpl`  
]]

Break it down:

The only important new thing to note here is that we DON'T use the &includeContent property. Often a listing of articles or posts might include the content field in the output template, like this: [[+content:ellipsis=`300`]] The content column has a data type of "mediumtext" in the database, and including it in your query tends to slow down the query, sometimes significantly. Luckily, MODX comes with a built-in Resource field that's perfect for excerpts: the introtext field. Its column uses the smaller "text" data type and normally holds just a short summary, which in most cases results in a faster query.
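
Here's a minimal sketch of an article_row_tpl chunk that leans on introtext for the excerpt. The markup and class names are invented for illustration, and it assumes the default "tv." placeholder prefix. The Byline is left out because the createdby field holds a user ID, not a display name, so it would need extra handling.

```html
<!-- article_row_tpl: illustrative markup only; class names are hypothetical -->
<div class="article-row">
    <img src="[[+tv.Image]]" alt="[[+pagetitle]]" />
    <h3><a href="[[~[[+id]]]]">[[+pagetitle]]</a></h3>
    <!-- publishedon arrives as a date string, so convert it before formatting -->
    <p class="date">[[+publishedon:strtotime:date=`%B %d, %Y`]]</p>
    <!-- The Lede comes from introtext, so the content column is never queried -->
    <p>[[+introtext]]</p>
    <a href="[[~[[+id]]]]">Read More</a>
</div>
```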

RSS Feed

Here's an interesting one:

[[getResources?    
    &limit=`0`    
    &where=`{"content:LIKE":"%band%", "AND:parent:=":4}`    
    &depth=`0`    
    &includeTVs=`1`    
    &tpl=`rss_item_tpl`  
]]

Break it down:

This actually isn't the best example, because you would normally use the separate &parents property, but I'm trying to illustrate a point that Jason brought up recently: "ALWAYS filter by the most exclusive data first". Let's say you have potentially 1,000 articles with the word "band" in the content, but only 100 of those have the parent ID "4". Then the "optimized" version of the above would be this: &where=`{"parent:=":4, "AND:content:LIKE":"%band%"}` In Jason's words, "the more you filter out before the hard part of the query, the fewer rows are scanned." Yes, the order in which you write the conditions into the &where property actually makes a huge difference!
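
For comparison, the more conventional form mentioned at the start of this breakdown would move the parent filter into the &parents property and leave only the text match in &where. Treat this as a sketch, not a benchmarked recommendation:

```html
[[getResources?
    &parents=`4`
    &limit=`0`
    &depth=`0`
    &where=`{"content:LIKE":"%band%"}`
    &includeTVs=`1`
    &tpl=`rss_item_tpl`
]]
```

With &depth set to 0, only the direct children of Resource 4 are considered, so the LIKE comparison is applied to that section's articles only.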

Big JSON

This query is kind of all-inclusive. It's similar to what you'd do for a more standard RSS Feed:

[[getResources?     
    &parents=`2,4,5,8,9,11,12,13,14`    
    &limit=`0`    
    &depth=`1`    
    &hideContainers=`1`     
    &includeTVs=`1`    
    &includeContent=`1`    
    &tpl=`json_item_tpl`  
]]

Break it down:

We want to include all of our articles in the result set, so we have to disable limiting. That means the query is gonna retrieve values from all the TVs and the content field of every article on the site. If the apps that are using this JSON can have slightly less than up-to-the-minute data, then a custom cache partition that refreshes on an interval can help, especially on a site with frequent publishing. A custom query might also be useful here. Setting the &depth property to the smallest value that will return the Resources we need is the best we can do to optimize the snippet call itself.

When To Go Custom

Again, paraphrasing Jason here: "A good rule of thumb is that whenever you have hundreds of something, consider a custom data model and/or custom queries instead of the default tools." So if you publish a new article once or twice a week, using the methods described herein will be just fine, even on a busy site. But if you publish 5 articles a day, you'll likely run into scaling issues sooner rather than later.

Luckily, MODX makes it relatively easy to integrate custom database tables and queries...but that's the subject of another post...written by somebody else :)