Quick guide to Linux daemons in PHP

This guide was originally written for a PHPNW meetup – more information on the group is available here

Whenever we want to carry out lengthy tasks or process large batches of data, we should consider executing that process in the background. The most typical way of doing this is through “cron jobs”: tasks which are scheduled to be executed at regular intervals, such as every 2 minutes or at 6am every Sunday. Alternatively, you can have a web request instigate the task by using something like shell_exec(). This guide presents a third alternative: system-level daemons.

Introduction

Put simply, a daemon is a very long-running process. Even when it has completed all its tasks, it stays resident, waiting for more work to do. Daemons may spend a lot of their time sleeping but, if designed correctly, they spring to life when required. Alternatively, they may listen constantly for events (by reading a named pipe, for example) rather than periodically checking whether work is available.

Why is a daemon better than a cron job?

For the same reason that web pages are not served by a cron job running every minute. Daemons are designed to be much more responsive. Cron jobs are very reliable, but sometimes they are less “elegant”.

Why is a daemon better than shell_exec()?

Irrespective of whether your process is triggered by an HTTP request or some other event, it might not be appropriate to create one process per request. If your process requires considerable resources, then you may end up swamping the server.

Broadly speaking, daemons can come in two flavours:

  • Single background process
  • Multiple background processes managed by a parent

Single background process

The parent creates a copy of itself (a process known as “forking”) and then dies. This leaves the single process running in the background. This type of daemon is very easy to manage and develop.
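
As a rough illustration of the fork-and-detach step described above, here is a minimal sketch using the pcntl and posix extensions directly. The function name run_in_background() is made up for this example, and the parent returns the child's PID instead of exiting so that the behaviour is easy to observe; a real launcher would normally just call exit(0) at that point.

```php
<?php
// Minimal sketch of the "single background process" pattern.
// Assumes the pcntl and posix extensions are loaded (CLI only).
// run_in_background() is a hypothetical helper, not part of any library.

function run_in_background(callable $work): int
{
    $pid = pcntl_fork();

    if ($pid === -1) {
        throw new RuntimeException('Could not fork');
    }

    if ($pid > 0) {
        // Parent: a real daemon launcher would simply exit(0) here,
        // leaving the child running on its own.
        return $pid;
    }

    // Child: become a session leader so we are detached from the terminal.
    posix_setsid();

    $work();   // the daemon's main loop would live here
    exit(0);
}
```

Calling run_in_background() with a closure containing your main loop gives you one process working in the background, while the caller gets the new PID back (useful for writing a PID file).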

Multiple background processes managed by a parent

The parent creates a child, and then dies. That new child then creates multiple children of its own. Each of these children is a “worker” process, responsible for the heavy lifting. The parent of those workers is responsible for distributing the workload amongst them. Apache HTTPd is a prime example of this – in the configuration you can choose how many children it can create, the minimum number of children to have available, etc.
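
The manager/worker arrangement can be sketched in a few lines. This is only an illustration under simplifying assumptions: run_workers() is a hypothetical name, the jobs are dealt out round-robin up front, and each worker reports how many jobs it handled through its exit code (which limits the count to 255). A real manager would usually hand out work continuously and respawn workers that die.

```php
<?php
// Sketch of a parent distributing work amongst forked workers.
// Assumes the pcntl extension (CLI only). run_workers() is hypothetical.

function run_workers(array $jobs, int $workerCount): array
{
    // Deal the jobs out round-robin, one bucket per worker.
    $buckets = array_fill(0, $workerCount, []);
    foreach (array_values($jobs) as $i => $job) {
        $buckets[$i % $workerCount][] = $job;
    }

    $pids = [];
    foreach ($buckets as $index => $bucket) {
        $pid = pcntl_fork();
        if ($pid === -1) {
            throw new RuntimeException('Could not fork');
        }
        if ($pid === 0) {
            // Worker: do the heavy lifting (simulated with a short pause),
            // then report the number of jobs handled via the exit code.
            foreach ($bucket as $job) {
                usleep(1000);
            }
            exit(count($bucket));
        }
        $pids[$pid] = $index;   // manager remembers which worker is which
    }

    // Manager: wait for every worker and collect its result.
    $handled = [];
    while (count($pids) > 0) {
        $pid = pcntl_wait($status);
        if ($pid <= 0) {
            break;              // no children left to wait for
        }
        if (isset($pids[$pid]) && pcntl_wifexited($status)) {
            $handled[$pids[$pid]] = pcntl_wexitstatus($status);
            unset($pids[$pid]);
        }
    }

    ksort($handled);
    return $handled;            // worker index => jobs handled
}
```

For example, run_workers(range(1, 7), 3) splits seven jobs across three workers and returns the per-worker counts once they have all finished.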

Daemons strike a nice balance: they keep the managed rigidity of a cron job while still being responsive.

Sample Application

I have recently found a handy PEAR library for creating PHP daemons – System_Daemon. It greatly simplifies the task of creating a daemon, although the library only supports “Single background process” type daemons.

This daemon does pretty much nothing, other than stay in the background, slowly adding to a log file. However, it demonstrates that you can have a very “lightweight” process listening for system events in only a few lines of code. It’s also important to have some element of a “forever” loop in there to keep the process going.

Signal Handlers

Signals are a very basic form of inter-process communication. In the most common usage, you would send a SIGTERM signal to a process by using the kill command. We can trap SIGTERM signals explicitly and shut down “nicely”. This means any work being carried out can be finished off cleanly, and then the process will terminate. We can trap many other signals, all of which can be sent via “kill” on the command line or, more interestingly, the PHP function posix_kill().
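
To make the idea concrete, here is a small sketch of a “forever” loop that traps SIGTERM and finishes its current iteration before stopping. It uses plain pcntl_signal() and pcntl_signal_dispatch() and is not taken from System_Daemon. The function name and the $stopAfter parameter are invented for the demo: the process sends SIGTERM to itself with posix_kill() purely so that the example terminates, standing in for someone running kill from a shell.

```php
<?php
// Sketch: trap SIGTERM and shut down nicely.
// Assumes the pcntl and posix extensions (CLI only).
// run_until_sigterm() and $stopAfter are hypothetical, for demonstration.

function run_until_sigterm(int $stopAfter): int
{
    $running = true;

    // The handler only flips a flag; the loop decides when to stop.
    pcntl_signal(SIGTERM, function (int $signo) use (&$running) {
        $running = false;
    });

    $iterations = 0;
    while ($running) {
        $iterations++;

        // ... one unit of real work would go here ...

        if ($iterations === $stopAfter) {
            // Stand-in for `kill <pid>` issued from another terminal.
            posix_kill(posix_getpid(), SIGTERM);
        }

        // Run the handlers for any signals that arrived during the work.
        pcntl_signal_dispatch();
    }

    pcntl_signal(SIGTERM, SIG_DFL);   // restore the default behaviour
    return $iterations;
}
```

Because the handler only sets a flag, the unit of work in progress always runs to completion before the loop condition is checked again.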

Important tip for signals

I discovered this recently: PHP’s sleep() function is interrupted by signals. This means you can have a fairly lazy daemon (e.g. one that sleeps for 5 seconds every iteration) and when you want to wake it up, you send it a SIGTERM signal. When interrupted this way, sleep() returns how many seconds were left at the point of interruption.
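
A quick way to see this for yourself is sketched below. To keep the example self-contained it uses pcntl_alarm() so that the process signals itself with SIGALRM; that choice, and the function name nap_until_signalled(), are assumptions made for the demo. In a real daemon the signal would come from outside. The behaviour shown applies to Linux and other Unix-like systems.

```php
<?php
// Sketch: a signal cuts sleep() short and sleep() reports the time left.
// Assumes the pcntl extension (CLI only, Unix-like systems).
// nap_until_signalled() is a hypothetical helper for this demo.

function nap_until_signalled(int $napSeconds, int $wakeAfter): int
{
    // An empty handler is enough: the signal's arrival wakes sleep().
    pcntl_signal(SIGALRM, function (int $signo) {
    });

    pcntl_alarm($wakeAfter);        // kernel sends SIGALRM after $wakeAfter seconds

    $left = sleep($napSeconds);     // returns early when the signal arrives

    pcntl_signal_dispatch();        // run the (empty) handler
    pcntl_signal(SIGALRM, SIG_DFL); // restore the default behaviour

    return $left;                   // seconds that were still left to sleep
}

$left = nap_until_signalled(5, 1);
echo "Woken early with roughly {$left} second(s) left\n";
```

A daemon can use the return value to decide whether it woke up because its nap finished (0) or because something wants its attention (non-zero).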

System_Daemon wraps up pcntl_signal() for us into

Init.d scripts

The System_Daemon library can generate init.d scripts for us, which means (for me at least) I can stop and start a daemon with:

Because we studiously assigned a SIGTERM handler, when we ask the daemon to stop using our init.d script, it will shut down cleanly.

Real Life Examples

Reporting System

Reports can take anywhere between 10 seconds and 3 hours to generate. Requests for report generation were stored in a database and the Reporting Daemon picked them up and processed them. The report was written to disk and made available in a reports area. The system remained responsive irrespective of how many reports were requested.

SMS Processing

A large, multinational mobile promotions company received hundreds of SMSes a second. In order to be an effective product, each SMS request had to receive a response as quickly as possible. During promotional periods SMS traffic would skyrocket. Each inbound SMS was written to a database and then awaited processing by the daemon. A similar system was written for outbound SMS messages. The entire SMS platform used about 5 PHP daemons and was the bread-and-butter of a £40m+ company.

Co-ordinating remote service requests

On a much smaller scale than the previous two examples, I wrote a simple caching daemon. The daemon would listen out for SIGINT signals (sent by a web page using posix_kill()). During quiet periods, the daemon would forward requests to the remote service as they came in. During busy periods, the daemon would batch them together. Because the daemon acted as a centralised point for communications, it was the best place to judge workload. For those interested, the web page communicated with the daemon using System V queues, with more information here.
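
For readers who have not met System V message queues, the sketch below shows the general shape of that kind of communication using PHP's sysvmsg functions. It is not the code from the caching daemon: the queue key, the helper names and the JSON payload are all assumptions made for illustration, and both ends run in one script here only so the example is self-contained.

```php
<?php
// Sketch: passing requests through a System V message queue.
// Assumes the sysvmsg extension is available.
// QUEUE_KEY, enqueue_request() and dequeue_request() are hypothetical.

const QUEUE_KEY = 0x0DAE4011;   // arbitrary key both sides agree on

// Web page side: drop a request on the queue without blocking.
function enqueue_request($queue, array $request): bool
{
    return msg_send($queue, 1, json_encode($request), false, false, $errorCode);
}

// Daemon side: fetch the next request, or null if the queue is empty.
function dequeue_request($queue): ?array
{
    $ok = msg_receive(
        $queue,
        1,               // only read messages of type 1
        $receivedType,
        8192,            // maximum message size in bytes
        $payload,
        false,           // payload is a plain string, not serialize()d
        MSG_IPC_NOWAIT,  // return immediately when nothing is waiting
        $errorCode
    );

    return $ok ? json_decode($payload, true) : null;
}

$queue = msg_get_queue(QUEUE_KEY, 0600);

enqueue_request($queue, ['id' => 42, 'action' => 'refresh']);
$request = dequeue_request($queue);
echo 'Daemon received request ', $request['id'], "\n";

msg_remove_queue($queue);       // tidy up; a long-lived daemon would keep it
```

In a setup like the one described, the web page would call the enqueue side and then signal the daemon, which would drain the queue with the dequeue side until it returned null.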

Things to watch out for

  • The PCNTL library (required for this exercise) is only available when running PHP on the command line (cli mode)
  • You will soon notice that calls to echo or print are quite distracting when the daemon is running.
  • Going from personal experience, fatal errors can be hard to track down. Sometimes the process will die with nothing having been written to /var/log/php_error.
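  • Any resources opened before forking may not be available after the fork (e.g. file pointers, db connections). Because pcntl_fork() creates a copy of the current process, the parent and child end up sharing the same underlying resource. When the parent is killed off, it closes its resources on the way out; for something like a database connection, that shuts the connection down for the child too. The child, now holding a reference to a closed resource, will trigger an error when that resource is accessed.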
  • Certain resources close after a period of inactivity (I’m looking at you, MySQL) – check that your resources are available before using them.
  • Because the script never shuts down (or shuts down rarely) memory leaks can be problematic.
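
The point about forking first and opening resources afterwards can be sketched as follows. The function name fork_then_open() and the log file are invented for this example, and the parent waits for the child only so the result can be inspected; a daemonising parent would exit instead.

```php
<?php
// Sketch: fork first, open resources afterwards.
// Assumes the pcntl and posix extensions (CLI only).
// fork_then_open() is a hypothetical helper for this demo.

function fork_then_open(string $logPath): int
{
    // Nothing is opened before this point: only plain values
    // (such as the path string) cross the fork.
    $pid = pcntl_fork();

    if ($pid === -1) {
        throw new RuntimeException('Could not fork');
    }

    if ($pid === 0) {
        // Child: acquire resources now, so they belong to this process alone.
        $log = fopen($logPath, 'ab');
        fwrite($log, 'worker ' . posix_getpid() . " started\n");
        fclose($log);
        exit(0);
    }

    // Parent (demo only): wait for the child and report its exit code.
    pcntl_waitpid($pid, $status);
    return pcntl_wexitstatus($status);
}
```

The same ordering applies to database connections: pass the connection settings across the fork, not the connection itself.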
4 comments on “Quick guide to Linux daemons in PHP”
  1. Dave says:

    Couple of additional notes:

    With PHP5.3+ declare(ticks=1) is deprecated, that means that the signal handlers are NOT registered to be called. You have to manually trigger them by adding to your primary loop the following:

    Before:
    while ( $var === true ) {
        // do something cool
    }

    After:
    while ( $var === true && pcntl_signal_dispatch() ) {
        // do something cool
    }

    pcntl_signal_dispatch() will then fire all signal handlers. See http://bugs.php.net/bug.php?id=47198 and http://www.php.net/manual/en/function.pcntl-signal-dispatch.php for further details.

    Inside your main process loop it is extremely important to check for signals before AND after your main process. That will allow you to cleanly shutdown, restart etc your daemon process.

    If processing very large XML files i.e. >100MB use xmlreader() and run that process separately NOT as a daemon; use a daemon though to control when the files are processed.

    Avoid at all costs reciprocal references in long running scripts e.g. passing $this into sub-objects so you have references between objects. You will run out of memory as the objects will never be destroyed (this is possibly fixed in PHP 5.3). Similarly avoid using simplexml foreach loops – they leak memory too. Use for loops instead (which don’t).

    As has been mentioned in the article already, and really needs re-iterating: NEVER open a resource before forking the process. Do only basic initialisation and then only when your process has forked should you create db connections, file pointers etc. This will avoid any nasty hard to track bugs with already opened resources dropping out of scope or being prematurely killed.

  2. Alex says:

    Very interesting presentation! Thanks.

  3. Tony says:

    Is there a way to manually trigger ticks in 5.2? Even Jaunty isn’t using 5.3 yet, and I don’t want to try to compile a PHP version and later have problems when I upgrade my OS. I have a forked daemon working just fine, but in one loop waiting for a newly forked child to finish, the daemon never responds to signals it’s sent. Here’s the loop that waits for the child to finish:

    while (pcntl_waitpid($pid2, $cstatus, WNOHANG) == 0) {
        sleep(1);
    }

    I’ve confirmed the loop runs fine by adding an echo in the function, yet during this process the daemon never responds to any signals sent to it.

    To clarify:
    Application – forks daemon
    Daemon – forks child process, waits for it to return. Child process assumes a different uid/gid which is why the fork.

    Daemon responds to all signals up until the child fork, at which point it won’t respond anymore.

  4. adrian says:

    Tony,

    You’re saying that the parent never responds to signals during that loop? It sounds silly, but what happens if you put some other operations in that while loop, like a for-loop? I’m not convinced sleep(1) is a “tickable” statement…

    Also, try substituting your while loop, for a while(true) loop, then call waitpid without no-hang…. see what happens then.
