Friday, January 30, 2009

Cooling A Hot Drupal Install

We have had problems with the company site for a month now. It's an ongoing siege. What's worse: this is the only Drupal install I have that is doing this. Here are some of Drupal's deadly sins playing out:
File stat frenzy: When in doubt, Drupal checks for modules and themes. It does a file system check by scanning certain directories. We were really getting hung up by this because I was being smart and using a theme with a stylesheet not named "style.css", which is the file Drupal looks for to decide that a theme is a theme. Also, I stored stuff in subdirectories in each of these themes (images and some other CSS files). That's a no-no: whenever Drupal crawled the themes it would also crawl those subdirectories. We have an architectural problem (34 active themes) that is compounded by the two subdirectories per theme and the absence of a style.css. So I nuked one of the subdirectories per theme, and I put in an empty style.css to placate the file check.
The modules directory is where we're putting all our modules. A number of developers recommend putting non-core modules into sites/all/modules. I don't hold with that, so much so that I disabled the sites/all/modules crawl in file.inc. By default, it's one of the places Drupal looks for files. The file directory scan ignores ".", "..", and "CVS". Swell, but if you store stuff in Subversion, you may have .svn directories in your production copy. So I added ".svn" as a directory to ignore when doing the crawl.
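
To make the ignore-list idea concrete, here is a rough sketch in Python rather than Drupal's own PHP. It is not the code in file.inc; the function name scan_directory and the IGNORED_DIRS set are made up for illustration. It just shows the technique: prune the ignored directory names before descending, so the crawl never stats anything under them.

```python
import os
import re

# Directory names skipped during the crawl. ".svn" is the addition described
# in the post; the other three are what the scan already ignored.
IGNORED_DIRS = {".", "..", "CVS", ".svn"}


def scan_directory(root, pattern, ignored=IGNORED_DIRS):
    """Return sorted paths under root whose file name matches pattern.

    Hypothetical helper for illustration only, not Drupal's implementation.
    """
    regex = re.compile(pattern)
    matches = []
    for current, dirnames, filenames in os.walk(root):
        # Editing dirnames in place tells os.walk not to descend into them.
        dirnames[:] = [d for d in dirnames if d not in ignored]
        for name in filenames:
            if regex.search(name):
                matches.append(os.path.join(current, name))
    return sorted(matches)
```

Calling scan_directory("themes", r"^style\.css$") would list one style.css per theme directory and never open a .svn or CVS folder. The fewer directories left to walk, the fewer file stats per crawl, which is also why dropping a subdirectory from each of 34 themes helps.
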
Group sessions: Sessions for us have gone totally mental. When Google, Yahoo or other sites crawl our site, each page view spawns a new session. The session expiry functionality is faulty, so these sessions pile up and the sessions table grows and grows. Session tables in good installs look to have 2000-4000 sessions. Ours has 200,000 records on a good day: most are old, almost all of them are for anonymous users, and most of them are shared by six or seven IP addresses. I have tried to prune these when I find the table has grown out of control. Good luck. Randy Brown has a good piece on how to change the settings.php file to shorten session lifespans. I do not know if this will have a bearing. It hasn't appeared to work, which may point to some faulty session expiry functionality.
Content Types Gone Wild: When in doubt, we add fields and content types. We have over 500 fields in play. At first I thought I was being all smart: keep the number of fields under control for the sake of consistency. Hah. It turns out that the multi-table joins needed to pull in data elements are a killer: a single query can tie up two dozen tables. I put the question to people in the Drupal Groups and I have received a lot of great and productive feedback. Short answer: lots of individual fields is good. The exception: data you're going to pool (event dates, for example) should have a common data field.
Tidying: I have been going through our themes and modules with an eye on two things: do we need the functionality, and do we need the module or the theme? When it's not required, I take it out. I know that with the hundreds of modules available you can get into a pack-rat mindset of gathering modules, but I have to resist that. I've even tossed Devel when it's not in active use: the idea is that I can re-install it when I need it.
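
For what the pruning looks like, here is a minimal sketch in Python against a SQLite stand-in, not our production database. It assumes a sessions table with uid, sid, hostname and timestamp columns, where uid 0 means an anonymous visitor; check the real schema before running anything like it. The function names are made up for illustration.

```python
import sqlite3
import time


def make_demo_table(conn):
    """Create a stand-in sessions table (assumed column layout)."""
    conn.execute(
        "CREATE TABLE sessions ("
        "uid INTEGER, sid TEXT PRIMARY KEY, hostname TEXT, timestamp INTEGER)"
    )


def sessions_by_host(conn, limit=10):
    """Show which IP addresses own the most sessions (crawlers float to the top)."""
    return conn.execute(
        "SELECT hostname, COUNT(*) AS n FROM sessions "
        "GROUP BY hostname ORDER BY n DESC, hostname LIMIT ?",
        (limit,),
    ).fetchall()


def prune_anonymous_sessions(conn, max_age_seconds, now=None):
    """Delete anonymous (uid 0) sessions older than max_age_seconds.

    Returns the number of rows removed. Logged-in sessions are left alone.
    """
    now = int(time.time()) if now is None else now
    cutoff = now - max_age_seconds
    cur = conn.execute(
        "DELETE FROM sessions WHERE uid = 0 AND timestamp < ?", (cutoff,)
    )
    conn.commit()
    return cur.rowcount
```

Running sessions_by_host first is a cheap way to confirm that a handful of crawler addresses account for most of the rows before deleting anything. Deleting only uid 0 rows keeps logged-in users from being kicked out by the cleanup.
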

The net result: the site is still driving into a wall. This means I get chiding comments about how Drupal is no good at running large sites. I counter that with Popsci.com: it's 10x busier than our site. I also counter it with Joyent's capacity to host VERY active sites. The problem: Joyent may be a little flaky. Now we're doing Consulting full-time, all the time.
