Reference for proper handling of PID file on Unix

Locking, Daemon, Unix

Locking Problem Overview


Where can I find a well-respected reference that details the proper handling of PID files on Unix?

On Unix operating systems, it is common practice to “lock” a program (often a daemon) by use of a special lock file: the PID file.

This is a file in a predictable location, often ‘/var/run/foo.pid’. The program is supposed to check when it starts up whether the PID file exists and, if the file does exist, exit with an error. So it's a kind of advisory, collaborative locking mechanism.

The file contains a single line of text, being the numeric process ID (hence the name “PID file”) of the process that currently holds the lock; this allows an easy way to automate sending a signal to the process that holds the lock.
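As a small illustration of that last point, here is a minimal sketch of reading a PID file and signalling the process it names. The function names `read_pid` and `signal_lock_holder` are hypothetical, and the sketch assumes the file holds nothing but a decimal PID and a newline.

```python
import os
import signal


def read_pid(pidfile_path):
    """Return the PID stored in the file, or None if it is missing or malformed."""
    try:
        with open(pidfile_path) as f:
            text = f.read().strip()
    except FileNotFoundError:
        return None
    # Accept only a plain positive decimal number.
    if not (text.isascii() and text.isdigit()):
        return None
    pid = int(text)
    return pid if pid > 0 else None


def signal_lock_holder(pidfile_path, sig=signal.SIGTERM):
    """Send sig to the process named in the PID file; return True if it was sent."""
    pid = read_pid(pidfile_path)
    if pid is None:
        return False
    try:
        os.kill(pid, sig)
    except (ProcessLookupError, OverflowError):
        return False  # no such process: the PID file is stale
    return True
```

For example, `signal_lock_holder('/var/run/foo.pid', signal.SIGHUP)` would ask the hypothetical daemon "foo" to reload, provided it treats SIGHUP that way.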

What I can't find is a good reference on expected or “best practice” behaviour for handling PID files. There are various nuances: how to actually lock the file (don't bother? use the kernel? what about platform incompatibilities?), handling stale locks (silently delete them? when to check?), when exactly to acquire and release the lock, and so forth.

Where can I find a respected, most-authoritative reference (ideally on the level of W. Richard Stevens) for this small topic?

Locking Solutions


Solution 1 - Locking

First off, on most modern Unix systems /var/run does not persist across reboots (on current Linux distributions it is typically a symlink to /run, which is a tmpfs).

The general method of handling the PID file is to create it during initialization and delete it on every exit path, whether that is a normal exit or a signal handler.
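A minimal sketch of that create-then-clean-up pattern follows. The names `write_pidfile`, `remove_pidfile` and `install_cleanup` are hypothetical, the choice of SIGTERM and SIGINT is an assumption, and the check that the file still holds our own PID before deleting it is an extra precaution rather than part of the method described above. The sketch does not attempt atomic creation.

```python
import atexit
import os
import signal
import sys


def write_pidfile(pidfile_path):
    """Record our PID as a single line of text."""
    with open(pidfile_path, "w") as f:
        f.write(f"{os.getpid()}\n")


def remove_pidfile(pidfile_path):
    """Delete the PID file, but only if it still records our own PID."""
    try:
        with open(pidfile_path) as f:
            if f.read().strip() != str(os.getpid()):
                return  # someone else owns it now; leave it alone
        os.unlink(pidfile_path)
    except FileNotFoundError:
        pass  # already gone


def install_cleanup(pidfile_path):
    """Arrange for the PID file to be removed on normal exit and on signals."""
    atexit.register(remove_pidfile, pidfile_path)

    def on_signal(signum, frame):
        # sys.exit raises SystemExit, so the atexit handler above still runs.
        sys.exit(128 + signum)

    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, on_signal)
```

A daemon would call `write_pidfile(path)` followed by `install_cleanup(path)` once, early in its startup. Nothing can clean up after SIGKILL or a power failure, which is why stale files still have to be handled somewhere.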

There are two canonical ways to atomically create/check for the file. The main one these days is to open it with the O_EXCL flag (together with O_CREAT): if the file already exists, the call fails. The old way (mandatory on systems without O_EXCL) is to create a file with a random name and then link() it to the PID file name. The link will fail if that name already exists.
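Both approaches can be sketched in a few lines. The function names `create_exclusive` and `create_by_link`, the `.pid-` temp-file prefix and the 0644 mode are assumptions made for illustration.

```python
import os
import tempfile


def create_exclusive(pidfile_path):
    """Create the PID file with O_EXCL; raises FileExistsError if it already exists."""
    fd = os.open(pidfile_path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    with os.fdopen(fd, "w") as f:
        f.write(f"{os.getpid()}\n")


def create_by_link(pidfile_path):
    """Older approach: write a uniquely named temp file, then link() it into place."""
    # The temp file must be on the same filesystem, so use the same directory.
    directory = os.path.dirname(pidfile_path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".pid-")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(f"{os.getpid()}\n")
        # link() fails with FileExistsError if the PID file name is already taken.
        os.link(tmp_path, pidfile_path)
    finally:
        os.unlink(tmp_path)  # the random name is no longer needed either way
```

In both cases the caller treats `FileExistsError` as "another instance holds the lock". One side effect of the link variant is that the PID file already has its contents at the moment it becomes visible under its final name.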

Solution 2 - Locking

As far as I know, PID files are a convention rather than something that you can find a respected, mostly authoritative source for. The closest I could find is this section of the Filesystem Hierarchy Standard.

This Perl library might be helpful, since it looks like the author has at least given thought to some issues that can arise.

I believe that files under /var/run are often handled by the distro maintainers rather than daemons' authors, since it's the distro maintainers' responsibility to make sure that all of the init scripts play nice together. I checked Debian's and Fedora's developer documentation and couldn't find any detailed guidelines, but you might be able to get more info on their developers' mailing lists.

Solution 3 - Locking

See Kerrisk's The Linux Programming Interface, section 55.6 "Running Just One Instance of a Program" which is based on the pidfile implementation in Stevens' Unix Network Programming, v2.

Note also that the location of the pidfile is usually something handled by the distro (via an init script), so a well-written daemon will take a command-line argument to specify the pidfile and not allow this to be accidentally overridden by a configuration file. It should also gracefully handle a stale pid file by itself, which is why O_EXCL should not be used: a file left behind by a crash would make every later start fail. fcntl() file locking should be used instead; you may assume that a daemon's pidfile is located on a local (non-NFS) filesystem.
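A minimal sketch of the fcntl()-based approach, in Python rather than the C used by the books cited above, might look like this. The function name `acquire_pidfile_lock` is hypothetical; `fcntl.lockf` is Python's wrapper around fcntl() record locks. Because the kernel drops the lock when the holder dies, a leftover file is simply reopened, locked and overwritten.

```python
import errno
import fcntl
import os


def acquire_pidfile_lock(pidfile_path):
    """Lock the PID file and write our PID into it.

    Returns the open file descriptor on success, or None if another
    process already holds the lock.
    """
    # No O_EXCL: a file left over from a crashed run is reused.
    fd = os.open(pidfile_path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError as exc:
        os.close(fd)
        if exc.errno in (errno.EACCES, errno.EAGAIN):
            return None  # a live process holds the lock
        raise
    # We own the lock, so any existing contents are stale: replace them.
    os.ftruncate(fd, 0)
    os.write(fd, f"{os.getpid()}\n".encode())
    return fd
```

The returned descriptor has to stay open for the life of the daemon. With fcntl() record locks, the lock is released when the process exits or when the process closes any descriptor referring to that file, so the daemon should avoid reopening and closing its own pidfile. The lock is also not inherited across fork(), so it should be taken after the daemon has finished forking.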

Solution 4 - Locking

Depending on the distribution, it's actually the init script that handles the pidfile: it checks for its existence when starting, removes it when stopping, and so on. I don't like doing it that way. I write my own init scripts and don't typically use the standard init functions.

A well-written program (daemon) will have some kind of configuration file saying where this pidfile (if any) should be written. It will also take care to establish signal handlers so that the PID file is cleaned up on normal or abnormal exit, whenever a signal can be handled. The PID file then gives the init script the correct PID so the daemon can be stopped.

Therefore, if the pidfile already exists when starting, it's a very good indicator to the program that it previously crashed and should make some kind of recovery effort (if applicable). You kind of shoot that logic in the foot if you have the init script itself checking for the existence of the PID file, or unlinking it.
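A rough sketch of such a startup check is below. The names `pid_is_running` and `pidfile_state` are hypothetical, and treating an unparseable file as stale is an assumption. The check is also only a heuristic: PIDs get reused, so a "running" result does not prove that the process is the same daemon, and there is a race between checking and acting on the result.

```python
import os


def pid_is_running(pid):
    """Return True if a process with this PID currently exists."""
    try:
        os.kill(pid, 0)  # signal 0 only performs the existence and permission checks
    except (ProcessLookupError, OverflowError):
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def pidfile_state(pidfile_path):
    """Classify the PID file as 'absent', 'running', or 'stale'."""
    try:
        with open(pidfile_path) as f:
            text = f.read().strip()
    except FileNotFoundError:
        return "absent"
    # Assumption: anything that is not a positive decimal PID counts as stale.
    if not (text.isascii() and text.isdigit()) or int(text) <= 0:
        return "stale"
    return "running" if pid_is_running(int(text)) else "stale"
```

At startup the daemon could refuse to run on "running", do its crash recovery on "stale", and proceed normally on "absent".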

As for naming, the file should follow the program name. If you are starting 'foo-daemon', it would be foo-daemon.pid.

You should also explore /var/lock/subsys, although that's used mostly on Red Hat flavors.

Solution 5 - Locking

The systemd package on Red Hat 7 provides a man page daemon(7) with the header line "Writing and packaging system daemons."

This man page discusses both "old style" (SysV) and "new style" (systemd) daemonization. In new style, systemd itself handles the PID files for you (if configured to do so). However, in old style, the man page has this to say:

> 12. In the daemon process, write the daemon PID (as returned by getpid()) to a PID file, for example /run/foobar.pid (for a hypothetical daemon "foobar") to ensure that the daemon cannot be started more than once. This must be implemented in race-free fashion so that the PID file is only updated when it is verified at the same time that the PID previously stored in the PID file no longer exists or belongs to a foreign process.

You can also read this man page online.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | bignose | View Question on Stackoverflow
Solution 1 - Locking | Joshua | View Answer on Stackoverflow
Solution 2 - Locking | Josh Kelley | View Answer on Stackoverflow
Solution 3 - Locking | John Hammond | View Answer on Stackoverflow
Solution 4 - Locking | Tim Post | View Answer on Stackoverflow
Solution 5 - Locking | Wildcard | View Answer on Stackoverflow