 The s6-supervise program 
s6-supervise monitors a long-lived process (or service), making sure it
stays alive, sending notifications to registered processes when it dies, and
providing an interface to control its state. s6-supervise is designed to be the
last non-leaf branch of a supervision tree, the supervised process
being a leaf.
 Interface 
     s6-supervise servicedir
 s6-supervise's behaviour is approximately the following:
 -  s6-supervise changes its current directory to servicedir. 
-  It exits 100 if another s6-supervise process is already monitoring this service. 
-  It forks and executes the ./run file in the service directory.
 
-  ./run should be a long-lived process: it can chain load (i.e. exec into
other binaries), but should not die. It's the daemon that s6-supervise monitors
and manages. 
-  When ./run dies, s6-supervise spawns ./finish, if it exists.
This script should be short-lived: it's meant to clean up application state, if
necessary, that has not been cleaned up by ./run itself before dying. 
-  When ./finish dies, s6-supervise spawns ./run again. 
-  s6-supervise operation can be controlled by the s6-svc
program. It can be sent commands like "restart the service", "bring the service down", etc. 
-  s6-supervise normally runs forever. If told to exit by s6-svc,
it waits for the service to go down one last time, then exits 0. 
 For a precise description of s6-supervise's behaviour, check the
Detailed operation section below, as well as
the service directory page:
s6-supervise operation can be extensively configured by the presence
of certain files in the service directory.
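As an illustration of the steps above, here is a minimal sketch that builds a service directory containing a ./run and a ./finish script. The daemon name mydaemon, its --foreground flag, and the socket path are hypothetical placeholders, not part of s6.

```sh
#!/bin/sh
# Build a minimal service directory in a temporary location.
# "mydaemon" and its "--foreground" flag are hypothetical.
svcdir=$(mktemp -d)/mydaemon
mkdir -p "$svcdir"

cat > "$svcdir/run" <<'EOF'
#!/bin/sh
# Long-lived: exec replaces the shell, so the supervised process
# is the daemon itself rather than an intermediate shell.
exec mydaemon --foreground
EOF

cat > "$svcdir/finish" <<'EOF'
#!/bin/sh
# Short-lived cleanup, spawned after ./run dies.
# The socket path is a hypothetical piece of leftover state.
rm -f /tmp/mydaemon.sock
EOF

chmod +x "$svcdir/run" "$svcdir/finish"
ls "$svcdir"
```

Running s6-supervise on "$svcdir" would then supervise this service, although in practice the directory would normally be placed in a scan directory and picked up by s6-svscan.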
 Options 
 s6-supervise does not support options, because it is normally not run
manually via a command line; it is usually launched by its own
supervisor, s6-svscan. The way to
tune s6-supervise's behaviour is via files in the
service directory.
 Readiness notification support 
 If the service directory contains a valid
notification-fd file when the service is started, or restarted,
s6-supervise creates and listens to an additional pipe from the service
for readiness notification. When the
notification occurs, s6-supervise updates the ./supervise/status
file accordingly, then sends
a 'U' event to ./event.
 If the service is logged, i.e. if the service directory has a
log subdirectory that is also a service directory, and the
s6-supervise process has been launched by
s6-svscan, then by default
the service's stdout goes into the logging pipe. If you set
notification-fd to 1, the logging pipe will be overwritten
by the notification pipe, which is probably not what you want. Instead,
if your daemon writes a notification message to its stdout, you should
set notification-fd to (for instance) 3, and redirect outputs
in your run script. For instance, to redirect stderr to the logger and
stdout to a notification-fd set to 3, you would start your
daemon as fdmove -c 2 1 fdmove 1 3 prog... (in execline), or
exec 2>&1 1>&3 3<&- prog... (in shell).
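The effect of the shell redirection above can be checked without s6, in a small simulation. In this sketch, two temporary files stand in for the logging pipe (fd 1) and the notification pipe (fd 3), and fake_daemon is a hypothetical stand-in for a daemon that is assumed to signal readiness by writing a newline to its stdout.

```sh
#!/bin/sh
# Temporary files play the role of the pipes s6-supervise would provide.
tmp=$(mktemp -d)

# Hypothetical daemon: diagnostics on stderr, readiness newline on stdout.
fake_daemon() {
  echo "starting up" >&2
  echo
}

# Outer redirections: fd 1 is the "logger", fd 3 the "notification pipe".
# Inner exec applies the same redirections as the run script would:
# stderr to the logger, stdout to fd 3, then fd 3 closed.
# (A shell function cannot be exec'ed, so the redirection and the call
# are two separate steps here.)
( exec 2>&1 1>&3 3<&-; fake_daemon ) >"$tmp/log" 3>"$tmp/notify"

echo "log file:"
cat "$tmp/log"
echo "notification bytes: $(wc -c < "$tmp/notify" | tr -d ' ')"
```

The diagnostic line ends up in the log file, while the notification file receives only the readiness newline.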
 Signals 
 s6-supervise reacts to the following signals:
 -  SIGTERM: bring down the service and exit, as if an
s6-svc -xd command had been received 
-  SIGHUP: close its own stdin and stdout, and exit as soon as the
service stops, as if an s6-svc -x command
had been received 
-  SIGQUIT: exit immediately without touching the service in any
way. 
-  SIGINT: send a SIGINT to the process group of the service, then
exit immediately. (The point here is to correctly forward SIGINT
in the case where s6-supervise is running in a terminal and the user
sent ^C to interrupt it.) 
 Detailed operation 
 -  s6-supervise switches to the servicedir
service directory. 
-  It creates a supervise/ subdirectory (if it doesn't exist yet) to
store its internal data. 
-  It exits 100 if another s6-supervise process is already monitoring this service. 
-  If the ./event fifodir does not exist,
s6-supervise creates it and allows subscriptions to it from processes having the same
effective group id as the s6-supervise process.
If it already exists, it uses it as is, without modifying the subscription rights. 
-  It sends an 's' event to ./event. 
-  If the default service state is up (i.e. there is no ./down file),
s6-supervise spawns ./run. One argument is given to the ./run
program: servicedir, the name of the directory s6-supervise is being
run on. It is given exactly as given to s6-supervise, without recanonicalization.
In particular, if s6-supervise is being managed by s6-svscan,
servicedir is always of the form foo or foo/log,
and foo contains no slashes. 
-  s6-supervise sends a 'u' event to ./event whenever it
successfully spawns ./run. 
-  If there is a ./notification-fd file in the service directory and,
at some point after the service has been spawned, s6-supervise is told that the
service is ready, it sends a 'U' event to ./event. There are
several ways to tell s6-supervise that the service is ready:
   
-  When ./run dies, s6-supervise sends a 'd' event to ./event.
It then spawns ./finish if it exists.
./finish will have ./run's exit code as first argument, or 256 if
./run was signaled; it will have the number of the signal that killed ./run
as second argument, or an undefined number if ./run was not signaled;
and it will have servicedir as third argument. 
-  By default, ./finish must exit in less than 5 seconds. If it takes more than that,
s6-supervise kills it with a SIGKILL. This can be configured via the
./timeout-finish file, see the description in the
service directory page. 
-  When ./finish dies (or is killed),
s6-supervise sends a 'D' event to ./event. Then
it restarts ./run unless it has been told not to. 
-  If ./finish exits 125, then s6-supervise sends an 'O' event
to ./event before the 'D', and it
does not restart the service, as if s6-svc -O had
been called. This can be used to signify permanent failure to start the service. 
-  There is a minimum 1-second delay between two ./run spawns, to avoid busylooping
if ./run exits too quickly. If the service has been ready for more
than one second, it will restart immediately, but if it is not ready when
it dies, s6-supervise will always pause for 1 second before spawning it again. 
-  When killed or asked to exit, s6-supervise waits for the service to go down one last time, then
sends an 'x' event to ./event before exiting 0. 
 Make sure to also check the service directory
documentation page, for the full list of files that can be present in a service
directory and impact s6-supervise's behaviour in any way.
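The arguments passed to ./finish, and the special meaning of exit code 125, can be sketched as follows. The function names report_exit and finish_main are hypothetical, and treating exit code 78 as a non-retryable failure is an assumed policy chosen only for illustration.

```sh
#!/bin/sh
# Sketch of a ./finish body, wrapped in functions so it can be tried directly.
#   $1: exit code of ./run, or 256 if ./run was killed by a signal
#   $2: signal number (only meaningful when $1 is 256)
#   $3: servicedir, as given to s6-supervise
report_exit() {
  code=$1
  sig=$2
  dir=$3
  if [ "$code" -eq 256 ]; then
    echo "$dir: run was killed by signal $sig"
  else
    echo "$dir: run exited with code $code"
  fi
}

finish_main() {
  report_exit "$@"
  # Assumed policy: exit code 78 from ./run means a configuration error
  # that retrying will not fix, so ask s6-supervise not to restart.
  if [ "$1" -eq 78 ]; then
    return 125
  fi
  return 0
}
```

An actual ./finish script would end by calling finish_main "$@" and exiting with its return status, so that a return value of 125 reaches s6-supervise.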
 Usage notes 
 -  s6-supervise is a long-lived process. It normally runs forever, from the system's
boot scripts, until shutdown time; it should not be killed or told to exit. If you have
no use for a service, just turn it off; the s6-supervise process does not hurt. 
-  Even in boot scripts, s6-supervise should normally not be run directly. It's
better to have a collection of service directories in a
single scan directory, and just run
s6-svscan on that scan directory. s6-svscan will spawn
the necessary s6-supervise processes, and will also take care of logged services. 
-  s6-supervise always spawns its child in a new session, as a session leader.
The goal is to protect the supervision tree from misbehaved services that would
send signals to their whole process group. Nevertheless, s6-supervise's handling of
SIGINT ensures that its service is killed if you happen to run it in a terminal and
send it a ^C. 
-  You can use s6-svc to send commands to the s6-supervise
process; mostly to change the service state and send signals to the monitored
process. 
-  You can use s6-svok to check whether s6-supervise
is successfully running. 
-  You can use s6-svstat to check the status of a
service. 
-  s6-supervise maintains internal information inside the ./supervise
subdirectory of servicedir. servicedir itself can be read-only,
but both servicedir/supervise and servicedir/event
need to be read-write. 
-  If servicedir isn't writable by s6-supervise, for any reason, then the
s6-svc -D and -U commands will not work
properly since s6-supervise will be unable to create or delete a
servicedir/down file; in this case s6-supervise will print a warning
on stderr, and perform the equivalent of -d or -u instead — it
will just be unable to change the permanent service configuration. 
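The companion tools mentioned above can be combined in a small helper. This is a sketch only: the function name svc_bring_up is hypothetical, and it assumes that s6-svok exits nonzero when no s6-supervise process is running on the directory.

```sh
#!/bin/sh
# svc_bring_up DIR: bring a service up, but only if a supervisor is
# already running on DIR, then print its status.
svc_bring_up() {
  dir=$1
  if ! command -v s6-svok >/dev/null 2>&1; then
    echo "s6 tools not found in PATH" >&2
    return 2
  fi
  # Assumption: s6-svok fails when DIR is not being supervised.
  if ! s6-svok "$dir"; then
    echo "no s6-supervise running on $dir" >&2
    return 1
  fi
  # -u asks s6-supervise to bring the service up; s6-svstat reports the result.
  s6-svc -u "$dir" && s6-svstat "$dir"
}
```

Checking with s6-svok first avoids sending a command to a service directory that nothing is listening on.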
 Implementation notes 
 -  s6-supervise tries its best to stay alive and running despite possible
system call failures. It will write to its standard error every time it encounters a
problem. However, unlike s6-svscan, it will not go out
of its way to stay alive; if it encounters an unsolvable situation, it will just
die. 
-  Unlike other "supervise" implementations, s6-supervise is a fully asynchronous
state machine. That means that it can read and process commands at any time, even
when the machine is in trouble (full process table, for instance). 
-  s6-supervise does not use malloc(). That means it will never leak
memory. However, s6-supervise uses opendir(), and most opendir()
implementations internally use heap memory - so unfortunately, it's impossible to
guarantee that s6-supervise does not use heap memory at all. 
-  s6-supervise has been carefully designed so that every instance maintains as little
data as possible and therefore uses a very small
amount of non-sharable memory. It is not a problem to have several
dozen s6-supervise processes, even on constrained systems: resource consumption
will be negligible.