Clarence Eldefors blog Mostly about the web and technology

8 Sep 2013

Transform a dying LAMP stack to a powerhouse

Many LAMP stacks of a few to a dozen servers have serious trouble with performance, troubleshooting and manageability.

To give a clear example, think of a setup with 10 Apache servers serving static files and PHP, a MySQL database server and an HAProxy load balancer in front. Assume that there are 5 sites with different domains load balanced across the servers. Some common problems in a setup like this, and possible solutions, are:

Problem

Apache performance is mediocre at best.

Solution
Most probably 2-4 of the Apache servers can be removed or reused for something else by switching to a well configured lightweight web server. The two most common choices are nginx and lighttpd, in that order. In my opinion they are both stable and pretty equivalent in performance - but I do like the flexibility of the nginx configuration a bit better.

Problem

Without a good cache system, each and every PHP request comes with a big overhead.

Solution
Changing the plain load balancer into a caching one can bring great performance gains. Varnish is now probably the first choice for most people; it is both very stable and performs well. By setting an expiration time from your backend (web) servers you can control exactly how long each file should be cached. Put some time and effort into achieving high hit rates; it is probably the most important change you can make.
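As a rough sketch of what "setting an expiration time from your backend" can look like, the snippet below picks a Cache-Control header value per request path. It is written in Python only to keep it short and runnable; in a PHP backend the result would be sent with a single header() call. The path prefixes, TTL values and the function name are assumptions made up for illustration, and how Varnish treats each header depends on its version and your VCL.

```python
# Hypothetical TTLs in seconds per path prefix; tune them to your own content.
TTL_BY_PREFIX = {
    "/static/": 86400,  # images, CSS, JS: one day
    "/news/": 300,      # listing pages: five minutes
}
DEFAULT_TTL = 60  # assumption: everything else may be cached for a minute


def cache_control_for(path, logged_in=False):
    """Return the Cache-Control header value the backend should send."""
    if logged_in:
        # Personalised pages should not be stored in the shared cache.
        return "private, max-age=0"
    for prefix, ttl in TTL_BY_PREFIX.items():
        if path.startswith(prefix):
            # s-maxage addresses shared caches such as Varnish.
            return f"public, s-maxage={ttl}"
    return f"public, s-maxage={DEFAULT_TTL}"
```

Keeping the TTL rules in one place like this makes it easier to see why a given URL has a low hit rate.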

Problem

No good way to identify bottlenecks or performance troubles in the PHP code. With many recent releases over several sites in the web cluster it's very hard to know the reason for overloaded servers.

Solution
Having a live profiler for PHP on your web servers is a great win in this situation. If you want something free there is XHProf, which was open sourced by Facebook. Unfortunately they did not release the web interfaces that made it useful in production. There are however two open source solutions for that: Xhgui, which logs into a MongoDB database, and XhProf Analytics, which logs into MySQL. Just be careful about where and how you use it, as it quickly fills up your disk with profiler results ;) A good setup could be to log only a percentage of the requests and then to purge old data (with a TTL index for MongoDB and a cronjob for MySQL).
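The sampling and clean-up idea can be sketched as below. This is a Python illustration of the logic only, not XHProf's or Xhgui's API: the 1% sample rate, the 14 day retention and both function names are assumptions. In PHP the sampling check would wrap the call that starts the profiler, and the cutoff would feed the DELETE run by the cronjob.

```python
import random
import time

SAMPLE_RATE = 0.01    # assumption: profile roughly 1% of requests
RETENTION_DAYS = 14   # assumption: keep two weeks of profiler runs


def should_profile(rate=SAMPLE_RATE, rng=random.random):
    """Decide per request whether to profile; rng is injectable for testing."""
    return rng() < rate


def expiry_cutoff(now=None, days=RETENTION_DAYS):
    """Return the Unix timestamp before which stored runs can be deleted."""
    if now is None:
        now = time.time()
    return now - days * 86400
```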

If you don't mind paying a bit there is also a hosted solution called New Relic with a nice intuitive interface. They do more than just PHP as well.

Problem

No good way to get an overview of where errors happen. To read the PHP error logs you need to log into each of the machines.

Solution
Use a centralized logging service.

If you want to roll your own there's a very good solution in Graylog2 together with Logstash for collection. Graylog2 provides a user interface with search, log streams and alerts and Logstash provides an easy to use client for parsing and forwarding log file messages to the Graylog2 server.

For hosted solutions the most mentioned service is Loggly. Beware that it might become costly depending on your logging volume.

Problem

MySQL uses MyISAM and it keeps causing locks.

Solution
With row level locks instead of table level locks, InnoDB mostly becomes a must when you get a lot of writes to some hot tables. It will also in general be faster for an average setup thanks to its buffer pool caching of data, not to mention its clustered primary key index. Be sure that you have a large enough buffer pool (and sufficient memory for it) and it will do miracles for your database reliability.

Other things to consider

Are you using the right database system?
There are many cases where a lot of your database load can be transferred to a more lightweight system. A common situation is that many simple data structures can be moved from a somewhat heavy RDBMS (say MySQL) to a more lightweight NoSQL database such as Redis. This can increase performance significantly if used in hotspots with simple data structures.

Revisit your database load
Database load is a very common bottleneck, and the database is most often the hardest part of your application to scale. By periodically checking what is causing the load you will usually find many easy optimizations. pt-query-digest is an excellent tool for profiling your database load and finding the heavy queries.

Caching efficiency
Measure your performance. See what causes the most load on the servers with XHProf/New Relic, the Varnish CLI tools, pt-query-digest etc. You will surely find things among the top requests and queries that can be cached more efficiently.

This list is far from complete but I think and hope I managed to get the most common and important points down.

25 Oct 2012

PHP: Structure, cooperation and good practices – part 1

PHP land is very often a messy place without standards and structure. Well, it does not have to be any more. Over the years a lot of tools, standards and practices have emerged to make it a more structured place - for the good of all of us PHP developers.

Use a standardized autoloading: PSR-0

The benefit of standardized autoloading is in one word interoperability. It makes it easier to add source code and libraries from different sources to your project, as the autoloader in your code and in the 3rd party code will be compatible.

An overview of PSR-0 with links to follow for sample code and information is available at: https://github.com/php-fig/fig-standards/blob/master/accepted/PSR-0.md
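The core of PSR-0 is a fixed mapping from class name to file path. Below is a small sketch of that mapping, written in Python just to make the rules easy to try out; a real project would use a PHP autoloader such as the one Composer generates. The function name and the base directory values are assumptions for illustration.

```python
def psr0_path(class_name, base_dir="src"):
    """Map a fully qualified PHP class name to a file path following PSR-0.

    Paths are joined with "/" here for readability; PHP code would use
    DIRECTORY_SEPARATOR.
    """
    class_name = class_name.lstrip("\\")
    namespace, _, short_name = class_name.rpartition("\\")
    parts = [base_dir]
    if namespace:
        # Namespace separators become directory separators. Underscores in
        # the namespace part have no special meaning.
        parts.extend(namespace.split("\\"))
    # Underscores in the class name itself become directory separators.
    parts.extend(short_name.split("_"))
    return "/".join(parts) + ".php"


print(psr0_path("\\Monolog\\Handler\\StreamHandler", "vendor"))
print(psr0_path("Twig_Environment", "lib"))
```

The second call shows why older PEAR-style class names with underscores still fit the standard.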

Handle libraries with composer

Composer is a great tool for managing libraries. The package information is flexible enough that it's easy to use with most libraries already around, and including libraries that have added a small and simple composer.json description file to their repository is completely without fuss.

If you start a new project and want to include Monolog and Twig in your application, you just write up the requirements in your local composer.json file like this:

{
    "require": {
        "monolog/monolog": "1.2.*",
        "twig/twig": "v1.9.2"
    }
}

The value for each package is, as you can guess, a version constraint that sets which versions you are OK with. When other packages also need a specific version of the same library, this is very useful for checking that both your project and the other package are compatible with the same dependency version.
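To make the two constraint styles used above concrete, here is a deliberately simplified sketch of how an exact tag and a trailing wildcard could be matched. It is an illustration in Python, not Composer's actual resolver, and it ignores the other constraint forms Composer supports; the function name is hypothetical.

```python
def matches(constraint, version):
    """Check a version against an exact tag or a trailing-wildcard constraint."""
    # A leading "v" (as in "v1.9.2") is treated as optional.
    want = constraint.lstrip("v").split(".")
    have = version.lstrip("v").split(".")
    if want[-1] == "*":
        # "1.2.*" accepts any version whose leading parts are 1 and 2.
        prefix = want[:-1]
        return have[:len(prefix)] == prefix
    return want == have
```

With this, "1.2.*" accepts 1.2.0 and 1.2.5 but rejects 1.3.0, while "v1.9.2" accepts only that one release.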

To install the packages you first download the composer PHAR file and then you just run "composer.phar install".

The package information comes from the public package repository at http://www.packagist.org. It is also easy to set up your own repository with Satis (Packagist is also open source, but a bit heavyweight for simple in-house repositories) and there's support for PEAR as a repository as well. If you want to add zip files or VCS repositories without a composer.json - that's also easily manageable with a package repository in your own composer.json (see http://getcomposer.org/doc/05-repositories.md#package-2)

To get started, head over to http://getcomposer.org/

Use a coding standard

When working together with other people or companies it's important that everyone writes code the same way. The variable and function names should have the same casing, the spacing and indentation should be the same, opening braces should be at the same place.

The biggest reason is not that one way of writing is the best, but that code should be easy to write and easy to read. If you learn to read and write code one way and have to switch every so often during the same workday, your productivity will diminish.

There is a horde of standards out there: PSR-2, Zend, VG etc.

My suggestion is to follow the PSR-2 standard, as it has been formed by a wider group of people and companies and is therefore easier to adopt in different types of projects.

23 Jul 2011

Send data to include files with dwoo

Sometimes you want to send some data, like a title or an id, to an included template with Dwoo. Even though it is described in the manual I did not look there first, because I mostly don't find what I'm looking for there. So in case anyone else has the same bother, this will solve it:

{include(file='elements/google-like-button.tpl' url='http://www.eldefors.com')}

This will allow you to use $url in the included template just as if it was a local variable.

More features that one might miss or look too long for in the docs:
Scope: $__.var will always fetch the 'var' variable from the template root. This is useful in loops where the scope is changed.
Loop names: Adding name="loopname" lets you access its variables inside a nested construct through $.foreach.loopname.var (replace foreach with your loop element).
Default view for empty foreach: By adding an {else} before your end tag you can output data for empty variables without an extra element.

What is Dwoo, some might wonder. It is a template system with a syntax similar to Smarty. It has however been rewritten a lot and in my experience works great both performance-wise and feature-wise. It is very easy to extend, and by far the most indispensable feature to me is template inheritance. This is lacking in most PHP templating systems, but with Dwoo you can apply the same thinking as with normal class inheritance. Since many elements are the same on all pages, and even more within the same section, you can define blocks which you override (think of them as class methods) and for instance create a general section template that inherits the base layout template and then let every section page template inherit this.
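The class inheritance analogy can be spelled out with plain classes. The sketch below is Python and has nothing to do with Dwoo's own syntax or API; the class names, block names and markup are made up to show the base layout, section and page levels described above.

```python
class BaseLayout:
    """Base layout template: defines the blocks and how they are assembled."""

    def title(self):
        return "My site"

    def content(self):
        return ""

    def render(self):
        return f"<title>{self.title()}</title><main>{self.content()}</main>"


class NewsSection(BaseLayout):
    """Section template: overrides only the title block."""

    def title(self):
        return "News - " + super().title()


class ArticlePage(NewsSection):
    """Page template: overrides only the content block."""

    def content(self):
        return "Article body"


print(ArticlePage().render())
```

Each level only states what differs from its parent, which is the same benefit the block overrides give you in the templates.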

25 Feb 2010

A first glance at hiphop

Facebook recently released their PHP on steroids, named HipHop, as open source. I listened to their presentation a while before at the FOSDEM conference in Brussels and was, like many others, impressed - but not as enthusiastic as many others seem to be now.

Some say it's nothing new because there have been a small number of PHP compilers before, or because there are already op-code caches. What HipHop does however is not only compile the code base to C++ but also process the code in several stages - to use as specific a data type as possible, for instance. Facebook engineers say that they see a 30-50% performance improvement over PHP that is already boosted by APC. Indeed that is a huge deal given the number of application servers they use.

On the other hand, a lot of the attention it has gotten is almost the same as that given to APC: it's seen as a general purpose performance booster. But as with APC, the results will be disappointing for most people, because on most websites most of the application time is not spent in the PHP code.

Facebook is indeed special compared to most websites. For instance they generally do no joins of data at the database level. That results in a lot more data in the application as well as more basic application logic.

There are several reasons why they often choose to exclude joins entirely. Among others, joins are performance draining for the database servers, which are generally harder to scale. Joins are also very hard to do when you need to query a whole lot of servers (which can also be sharded by different factors) for each and every type of data you want to join.

HipHop is made for the giants by a giant: the huge sites with a lot of traffic that have large amounts of data to query. Smaller sites will benefit much less from its performance boost, as most of their time is spent in databases, caches, reading from disk etc. The results will also vary greatly depending on how much of the website's code base is mundane (basic constructs like loops, processing with non-dynamic variables etc). Basically, everything that can be rewritten with static functions and variables is the most welcome target for the HipHop optimizations.

Even with some 10 commodity application servers I would say that HipHop gives too little of a performance boost to justify the time that needs to be spent to learn, test and maintain the framework. For Facebook though, I can certainly see how it's very welcome even with several man-years of development costs, as they have thousands (?) of application servers and have good reasons not to drop PHP in most of the heavy parts.
