Performance -- a struggle

The Open Atrium application on which Wiscommunity is based is a big, complex distribution of Drupal. It has a whole LOT of permission checking going on in the database, and page generation is sometimes slow and painful.

I'm trying to work around the performance issues on the site, but it's a big hairy pile of things to deal with. Open Atrium puts a lot of load on all of the web infrastructure - piles of complex database queries, very heavy use of PHP to generate the pages, and a lot of other potential issues.

The worst side effect here is not really the slow performance of the site itself (which only shows up when logged in - I'll get to that later) but, ironically, a problem caused by one of the projects to improve performance. One of the ways around slow Drupal performance is to add caching. Lots of caching. The continual issue with this is that although it's easy to make a CMS perform well by caching, it mostly only works for people who are not logged in to the site. Why? Well, because if you are not logged in, every page looks the same to every visitor, so we can generate a page once and serve the cached copy from then on. But if you are logged in, potentially every page can differ from how it is displayed to others. So caching breaks down there.

The site uses multiple caching mechanisms. Much of the site caching occurs using Memcache, so much of the site is held in a memory cache on the server. We're also using Varnish as an external cache for the site, though in our case I think this is largely made redundant by the final caching layer: Cloudflare. The entire site lives behind a Cloudflare reverse proxy server, which mostly serves to cache the static resources of the site (images, CSS files, etc.) and to serve them out from servers all around the world (though in our case, for most readers, from Minneapolis). This is fairly complex so I will not go into it in detail.

The Cloudflare reverse proxy does a number of good things, including caching and providing a secure certificate for https connections. This is all great, but it brings with it one small problem: Cloudflare expects a fairly quick response from the web server. If the web server doesn't respond quickly enough, you get the dreaded 524 error page from Cloudflare, which is really annoying. You'll probably see that error page occasionally when working on the site logged in. This is the main problem I am currently trying to work around.
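As a rough sketch of how slow page generation relates to that error, the Python below times a piece of work and compares it with a proxy timeout. The 100-second figure is an assumption based on what I understand to be Cloudflare's default wait before it returns a 524, so check it against Cloudflare's documentation for your plan. The names `time_call`, `classify`, and `ASSUMED_PROXY_TIMEOUT_SECONDS` are hypothetical.

```python
import time
from typing import Callable

# Assumption: the proxy gives up and shows a 524 after about 100 seconds.
ASSUMED_PROXY_TIMEOUT_SECONDS = 100.0


def time_call(fn: Callable[[], object]) -> float:
    """Run `fn` once and return how many seconds it took."""
    start = time.perf_counter()
    fn()
    return time.perf_counter() - start


def classify(
    elapsed: float,
    timeout: float = ASSUMED_PROXY_TIMEOUT_SECONDS,
    warn_fraction: float = 0.5,
) -> str:
    """Label a response time relative to the assumed proxy timeout."""
    if elapsed >= timeout:
        return "timeout"  # the proxy would have given up already
    if elapsed >= timeout * warn_fraction:
        return "at risk"  # uncomfortably close to the limit
    return "ok"
```

Passing in a function that requests a page, for example `classify(time_call(fetch_page))` with a `fetch_page` of your own, gives a quick first look at which pages are drifting toward the timeout.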

I'm doing some site analysis to see "where it hurts". I think the next step will be to get the site hooked up with New Relic to try to get more insight into where the performance bottlenecks are.
