PHP

Some Quick Pointers on Improving SugarCRM Performance

Getting web applications performing optimally is a never-ending full-time job for a lot of people, including me! I manage the OnDemand environment for thousands of customers at SugarCRM. This post will walk through a few pretty simple steps that can be taken to improve the performance of SugarCRM on a Linux server with MySQL, but they mostly apply to any PHP application running on a LAMP stack.

Tuning a web application is a type of engineering, and as Rico Mariani at Microsoft puts it:

If you are not measuring, you are not engineering.
Monitoring is the key in all of the tips that follow because if you don't know how bad things are, you won't know which of these fixes help you out. Depending on the details of your situation, some can actually do more harm than good so it's important to go through these one by one, understand the implications of them, and compare you data from befor and after you make any changes.

MySQL Performance

Getting things right in MySQL is another full-time job, but there are a few quick things you can do. First of all, make sure that all of your tables are the same character set. If they are not, this can greatly slow down queries that join between tables of different character sets. Take a look in the output of

SHOW TABLE STATUS FROM sugarcrm;
and make sure that everything is the same in your Collation column. If you are using utf8_general_ci in most places, and the table my_custom is a different type, just run:
ALTER TABLE my_custom CONVERT TO CHARACTER SET utf8;

Next up is making sure that you are properly utilizing the MySQL Query Cache. This cache saves the restlts of select queries so that if the same query comes in again before changes to the table, the results can be returned much more quickly. To see the configuration for this, run:

SHOW VARIABLES LIKE "query_cache%";
If query_cache_type is set to something other than "ON" or query_cache_size is set to 0, you will want to change these! 32MB is a fine start, but depending on your workload you may need (a lot) more to get the full benifit from this:
SET GLOBAL query_cache_type = 'ON';
SET GLOBAL query_cache_size = 32000000;
This one change can make a significant difference in performance. To tune your numbers, check the output of:
SHOW STATUS LIKE "Qcache%";
After some time running with the Query Cache on, you should see hits go up and lowmem_prunes should stay pretty low with free_memory not reaching 0. If free_memory seems to run out, or lowmem_prunes continue to climb, increase the size of the query_cache_size. There will continue to be lots of inserts and not_cached due to constantly changing tables and queries other than SELECT queries, so don't worry about those numbers.

Generally, using the InnoDB storage engine instead of the default MyISAM storage engine will help out as well, but any tables that you want to be able to do full-text search on cannot be InnoDB, and there is a lot more to take into consideration here than any quick fix. If you are using all MYISAM tables, this would be a good thing to investigate. Switching to InnoDB and tuning your MySQL storage engine based on your workload are a lot more than I can fit here today, but there is a lot of information out there about these. Head over to the MySQL Performance Blog for more information than you ever wanted to know about MySQL performance (including an excelent article on the MySQL Query Cache).

All of your MySQL STATUS values should be included in your monitoring system, as graphs of these are very helpful over time for diagnosing problems and tuning cache sizes! Also, remember that the Query Cache uses up RAM so keep an eye on that as well. In this case, it will use 32MB and as you increase the query_cache_size, you are directly increasing RAM usage.

PHP Performance

Perhaps the single biggest perfomance enhancement that you can make for SugarCRM and many other PHP applications is OpCode Caching. I've written about some of the complexity involving this before at SugarCRM and Caching with APC, but the simple version is that OpCode caching will improve the performance of your PHP applications. At SugarCRM, we use APC which is pretty straightforward to install. On CentOS, and other rpm/yum Linux distributions simply:

yum install php-pecl-apc
or
pecl install apc
Then, enable apc by adding it's configuration to /etc/php.ini or /etc/php.d/apc.ini, and make sure that apc.enabled is set to 1 and apc.shm_size is set to something reasonable. For one instance of SugarCRM, the default of 32 (for 32MB) is generally plenty but if you have other PHP things running on the server, a bigger number might help.

Copy /usr/share/pear/apc.php to somewhere in your webroot and visit it in your browser, and after a bit of usage of SugarCRM, you should see the "Hits" percentage climb while the "Misses" percentage gets smaller and smaller. Your hit ratio should be higher than 98%. If the "Cache full count" number is more than a very very small percentage of your hit counter, you should increase the size of memory available to APC. On some Linux systems, shm segment size is limited to 32M so instead of increasing shm_size, you will need to increase shm_segments.

Like MySQL, these numbers should be part of your monitoring system so that you know when it's time to change things. Also like MySQL, this directly uses RAM. This example will use 32MB, and increasing it will use more.

Client-Side Performance

Why bother serving files to clients that never (or seldomly) change? A common industy best practice is to set a Cache-Control header for static content to 10 days in the future. Tools like YSlow will let you know if you are not doing this. (They will also give you a lot of other pointers on perfromance! We have internal bugs open at SugarCRM for most of the client side performance issues that tools like YSlow have identified.)

We use apache on our application servers, and to enable these cache-control headers we added this:

<FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|js|css|swf|xml|txt)$">  
Header set Cache-Control max-age=36000,public
</FilesMatch>
For SugarCRM OnDemand, this cut our number of requests-per-second in half and made a several-hundred-millisecond improvement in the time it takes to load pages from SugarCRM OnDemand.

For this one, monitor your req/s and check your PHP Applications after any changes using a tool like YSlow.

SugarCRM Specific

Lastly, a few specific changes just for SugarCRM:

  • Add 'disable_count_query' => true to config.php. This replaces the "1-5 of 200" with a "1-5 of 6+" and makes a big difference when you have a lot of records. This configuration setting should be a default in a future release of SugarCRM.
  • Disable explicit caching in the app by setting 'external_cache_disabled' => true in config.php. While SugarCRM can use external caches like APC's user cache and Memcached for some things, this shouldn't be enabled unless you have set up the infrastructure to handle these. This configuration setting be a default in a future release of SugarCRM.
  • If developer mode is enabled and you are not using it, turn this off! In SugarCRM 6.0, this is under Adminstration -> System Settings -> Advanced. This is off by default, and if it gets turned on and accidentlally left on, it causes severe performance degradation.
  • Turn loging down to the minimum that you need. If your SugarCRM installation is on a NFS based document root with multiple servers, this becomes even more important. This is under Administration -> System Settings -> Logger Settings. "Error" or "Fatal" are good options here.
  • Turn off tracker features you are not using. This is in Admin -> Tracker. Each enabled piece of functionality causes things to be written to the database, and will slow down page loads. Leave on only what you need for reports that you have written.

Thats a start!

With those changes, you should be off to a pretty good start at improving the performance of your installation of SugarCRM! These things help us keep SugarCRM performing quickly in the SugarCRM OnDemand environment with a pretty high ratio of active users to servers, and a recent application of these tips that I did to a site that handles 200k+ unique users per month lead to a 65% decrease in system utilization and much improved response times.

[Cross posted on developers.sugarcrm.com.]

PHP Security, Round 2

As I've noticed from watching hits on my site here, many of you have read my page on PHP security using mod_fastcgi and suexec. The logic on that page still holds, but Gentoo decided to make the switch from mod_fastcgi to mod_fcgid and it broke all sorts of things for me. I got things scratched back together without any security on my old server, and with the installation of my new server a few weeks ago, I set things up more securely again. I still think this way is the way to go for a server where many of the virtual hosts will seldomly see traffic, but if you're running lots of high traffic sites and have a little bit of RAM overhead, you might want to check out this article on mpm-peruser.

For this setup, I decided to stick to some standards. This means no more changing the suxec directory, using /data/, or anything like that. Other than that, the key differences from last time:

  • All configuration is now done with with a setup script instead of using a mysql database. There was not really any point for the host names to be in a database, and it makes setup/teardown scripts easier to write as just a bash script.
  • Some hosts have PHP, some don't, so no point in setting up all the overhead if a host isn't going to use PHP.
  • Most hosts won't have any interest in having their own logs. Statistics can be done using client side things such as Google Analytics, and Apache is happier writing all the logs to 1 place instead of hundreds. I also have split-logs running when logs are rotated, so logs can easily be gathered per-site as needed, just not real time by one of my hosting customers. I've never known of one of my customers using live access to their logs.
  • php.ini files are now stored with the wrapper script in the site's cgi-bin directory and file system extended attributes are used to protect it. This means no separate home for php.ini files, and it's easier for users to see what their PHP confguration is.

The script isn't quite ready for sharing yet, but here's what you can do to get a setup like this:

  1. on Gentoo, make sure your USE contains: suexec, apache2, cgi, fastcgi, session.
  2. on Gentoo, "emerge apache php mod_fcgid". On other platforms, consult your docs (or just download mod_fcgid and use apxs to install it. it should be pretty seamless)
  3. Set up your global configuration. On Gentoo, this is done for you, but make sure this gets loaded into your global apache configuration:
    LoadModule fcgid_module modules/mod_fcgid.so
    SocketPath /var/run/fcgidsock
    SharememPath /var/run/fcgid_shm
    
    <Location /fcgid>
    SetHandler fcgid-script
    Options ExecCGI
    allow from all
    </Location>
    
  4. Add a user and group for your first virtual host, test.example.com. call em "example" if you like
  5. Set up the directory tree for the virtual host:
    /var/www/test.example.com/
    /var/www/test.example.com/tmp
    /var/www/test.example.com/htdocs
    /var/www/test.example.com/htdocs/cgi-bin
    
  6. Make some files:
    
    hello HTML world!
    
    <? 
    /* /var/www/test.example.com/test.php  */
    print("hello PHP world!");
    ?>
    
    #!/bin/sh
    # /var/www/test.example.com/htdocs/fcgi
    PHPRC=/var/www/test.example.com/htdocs/cgi-bin/
    export PHPRC
    PHP_FCGI_CHILDREN=2
    export PHP_FCGI_CHILDREN
    PHP_FCGI_MAX_REQUESTS=25000
    export PHP_FCGI_MAX_REQUESTS
    exec /usr/bin/php-cgi
    
  7. Copy your php.ini to /var/www/test.example.com/htdocs/fcgi and edit it so that directories are right. All you'll likely need to change is upload.tmp_dir and session.save_path, but you may want to change others.
  8. Set fcgi to be executable, and make sure permissions are set on it so that it is owned by your test user/group and other users can't mess with it. If things don't work later, this is a frequent culprit
  9. Set the immutable bit on php.ini and fcgi (you'll need to be using extended file system attributes on your filesystem to do this, check your OS documentation for details) by running 'chattr +i /var/www/test.example.com/htdocs/*'. You'll need to undo this with chattr -i if you want to change these files in the future.
  10. Set up this host's configuration:
    <VirtualHost *:80>
            DocumentRoot /var/www/test.example.com/htdocs/
            ServerName test.example.com
            SuexecUserGroup example example
            <Directory /var/www/test.example.com/htdocs/>
                    Options +SymLinksIfOwnerMatch
                    AllowOverride All
                    Order allow,deny
                    Allow from all
                    DirectoryIndex index.html index.php
                    AddType application/x-httpd-fastphp .php
                    Action application/x-httpd-fastphp /cgi-bin/fphp
            </Directory>
    
            <Directory /var/www/test.example.com/htdocs/cgi-bin/>
                    SetHandler fcgid-script
                    FCGIWrapper /var/www/test.example.com/htdocs/cgi-bin/fphp .php
                    Options +ExecCGI -Includes
                    allow from all
            </Directory>
    
    </VirtualHost>
    
    <VirtualHost *:80>
            ServerName aerospace.com
            Redirect Permanent / http://test.example.com/
    </VirtualHost>
    
  11. Give apache a restart and that should be it!

Check out the processes running on your server, and after you hit test.php you should see a php-cgi process running as the example user. If you have problems, error_log and suexec_log in /var/log/apache2/ (or /var/log/httpd/) tend to tell you everything you need to know.

An oh yeah, want to use APC to speed up your PHP applications significantly under this setup? Just install APC, then add the configuration for it to the bottom of the php.ini for any hosts that you want to enable this on. Given that APC isn't 100% perfect and crashes sometimes, the beauty of the fcgid setup is that it will take out the php-cgi process and the fcgid manager will just start a new one like nothing happened.

APC

No, not American Power Conversion, but the Alternative PHP Cache. It's not real obvious as to what it does from the website, but if you're doing any serious PHP application stuff, you should take a look at them (and Zend and eAccelerator). I was helping benchmark some things for work and it's amazing the difference in performance that these make compared to a standard PHP installation.

Each of them is or has a PHP OpCode cache. This means that instead of compiling the PHP from the source code on every page request, things get cached and the web server doesn't need to talk to disk as much. Just installing APC on my two larger web servers has made an amazing difference. CPU utilization is down, memory usage is down, and average response times are up. On Pudge which hosts a lot of sites running a lot of applications (including this site), there is over 150Mb of things in the cache. Aurora just hosts Faster Mustache which is only running Gallery and Drupal, and it's cache is around 45Mb.

Supposedly Gallery doesn't work quite right with APC, but I haven't had any problems with it, and the web server process on Aurora crashed once over the last few weeks after APC was installed, but I don't know if APC was the culprit or not. I'm looking forward to setting up fastcgi with suexec and APC to see how well it does with lots of virtual hosts running, and hopefully there will be a new server in the mail in the next month or two for me to do that on..

PHP on IIS -> fast and supported

Jesse and I went to the Microsoft Web Developer Summit a few weeks ago representing Gallery. Microsoft promised us that they were trying to be better citizens in the PHP community and make PHP work better on Windows. For those of you not in the know, the common way to run PHP on IIS was through CGI, which means that every single visit to a PHP page requires loading the entire PHP stack. Think starting up your web browser from scratch every time you open a new URL, and multiply that slowness by a few thousand users... it's baaaad. To get around this for PHP (and other CGI scripts) FastCGI came along and made everyones servers perform better. It daemonizes PHP (or anything else supported) so that it doesn't need to start up for every request, but it's never worked quiiiite right on IIS and has never been supported. Until now.

So today, after working with Zend for a while, Microsoft has released FastCGI today for the current version of IIS, with full Microsoft Support. Have a PHP app that you are trying to get to work in IIS? They'll help you. Cool! You can download FastCGI for free from www.iis.net/php. (And from what I understand, it really is FastCGI so you can use it for any CGI based application server in IIS.)

Have a Windows server running IIS? Want to use Gallery? Check out the howto that Microsoft wrote on Installing Gallery 2 on IIS. In addition to writing some docs, over the next few months Microsoft will supposedly be getting us resources like virtual machines, licenses, hosted test environments, etc so that we can make Gallery work on a wider range of Microsoft products.

A little Kool-Aid is definitely good every now and then.

Microsoft Web Developer Summit 07

Sunday to Tuesday of this week, Jesse and I represented Gallery at Microsoft's invite only 3rd annual Web Developer Summit. This year their focus was on PHP, and 24 "important" people from the PHP community were invited. The two authors of the book "Pro Drupal Development" that are also Drupal core developers, several people form the PHP (and PEAR) core team, an engineer from SugarCRM, and the guy at Facebook that wrote their developer platform were among the other attendees. On the Microsoft side were the important "higher ups" that work with Open Source technologies (see microsoft.com/opensource).

Sunday was my flight out and a few hours of catching up with an old friend I haven't seen since middle school, followed by meeting up with Jesse for some beer samplers at a local brewery, followed by some snacks and drinks with the rest of the summit attendees.

Monday was a day full of sessions followed by dinner, and Tuesday was more sessions. You can read someone else's presentation notes here so I'll keep this to highlights:

  • Pictures! Jesse's blog where he talks a lot more about some details
  • Microsoft is seriously interested in Open Source now. They've realized that their value is as a platform, and PHP applications need to work well on Windows for people to be willing to use Windows in may server environments. Sure, it's just business because thats what their customers demand, but they're ready to work with us to do what needs to be done.
  • Monday after dinner (and after Jesse ordered a round of tequila shots for the entire conference on Microsoft's tab), I spoke with Sam Ramji, Microsoft's Director of Platform Strategy, for an hour or so. He runs the Open Source Software Lab at Microsoft and was very interested in any ideas that I and the other attendees have for Microsoft.
  • Surprisingly, many of the ideas we had were news to them. It seems that there are a lot of things in the community that we all think Microsoft should do, but nobody ever goes to the trouble of telling them! Things like experienced Linux admins wish their BASH and Apache configuration skills could transfer more directly to PowerShell and IIS, etc. I don't remember all of these because they seemed so obvious, but they took notes and hopefully will get around to doing some of these.
  • Jesse and I were convinced at the last minute to do a talk on Gallery. We quickly put together a presentation and some people at least seemed pretty interested. We got a healthy number of questions and I looked at some number I haven't looked at in a while (Gallery gets ~150k downloads a month!). As a result of this, we may have 2 people contributing some code and one person starting on some more documentation for us. Hoorah.
  • While some of the presentations were pretty useful, much of it wasn't really targeted to the audience. Sure, learning about Silverlight and Expression were neat, but were they really the best use of our time? Probably not. However, internet worked well and it was easy to get other things done during the less interesting parts. (And most people in the room were on IRC so we could discuss things at the event as they happened.)
  • We were not given suitcases full of cash to use ASP.net, but each of us walked out with a full MSDN Subscription and Microsoft is going to be working with us to provide whatever products and licenses we need to be able to effectively develop for and test on Microsoft platforms including IIS7, Windows Server 2003 and 2008, Microsoft SQL Server, etc. No complaints there. (One attendee just hasn't gotten around to publishing something she's been working on yet, and it sounded like Microsoft will be shipping her an XBox 360 to encourage her to get around to it :) )

All in all, it was a pretty useful couple of days. I think the most important part was networking with other PHP developers and the Open Source people at Microsoft, as this should encourage future email conversations with everyone to be timelier and more effective. Hopefully, everything will come through and Microsoft will be able to provide Gallery with what we need to be able to test and develop on Windows, and hopefully Microsoft will be able to implement some of our suggestions for the way they work with Open Source. They do seem very interested in making this happen! If you want to read more from them (which you should, especially if you think I've just been drinking their KoolAid all week), check out: port25.technet.com and microsoft.com/opensource.

Transparent TCP Stream Encryption

School this semester has been very busy, but things are getting done and I'll post the neat things here as I get finished. The first project I completed this semester was my group project for CS6250: Advanced Computer Networks aka Router Architectures and Algorithms.

My group, consisting of Me, Zack A, and Chris L (a usual in my CS groups) designed and implemented ZCC: a set of very easy to use and manage tools for encrypting TCP traffic. The end result didn't turn out quite as we originally planned, but the key functionality is there and we were all very excited about it. (For example, it only works with IPv4 currently) Chris L and I are actually hoping to build on this next semester as a project in CS 7260: Internet Architecture & Protocols.

The neatest part of this project for me was becoming very familar with the ip_queue kernel module and libipq which seems to be something that almost noone is familar with. It's manual page here hasn't been updated in a while, and the successor to it: "libnetfilter_queue," (which we found out too late to use) is only available in very new Linux kernels (2.6.14 or newer). If you Google around for libipq, you'll see that we didn't have a lot to work with or examples to go by. (We were trying to modify the packet data before pushing it back into the queue while most examples just show how to accept or drop packets)

Read on for the details..