systemd by example
Part 2: Dependencies
Introduction
This is the second article in a series where I try to understand systemd by creating small containerized examples. In Part 1, we created a minimal systemd setup in a container. We are now using this setup to continue our investigation of systemd, starting with taking a closer look at dependencies.
Dependencies are fundamental to systemd. We already saw some of this in action in the first part: to shut down the system when we reach the halt target, we had to create a service which does the actual shutdown and add it as a dependency to the target. Similarly, we added journald as a dependency to the default target to ensure that it is started when the container starts. But there is a lot more to dependency management.
As I mentioned in the first part, the best way for me to get comfortable with new and potentially complex topics is through experimentation and examples. In this article, we’ll approach the topic of dependencies in this way, again by creating small examples in containers.
New: All the examples in this post are now available on systemd-by-example.com! (Learn more about this in systemd by example - The Playground)
The states of a unit
In order to understand how dependencies work, we first have to understand the different states of a unit. There are five states that a unit can be in: inactive, activating, active, deactivating, or failed. By default, units are inactive. When systemd starts a unit, the state changes to activating, and once startup finishes it is marked as active. It then stays in this state until it stops, either by itself (for example, if the executable of a service terminates) or because it is instructed by systemd to stop. The state then changes to deactivating, followed by inactive once shutdown finishes. And if anything goes wrong, the state is set to failed.
These states exist for every unit, and they play an important role when systemd applies dependencies. For our experiments, we need a way to control these states. One way to do this is through service units, like the following sleep.service.
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=sleep 1
ExecStop=sleep 1
Let’s go through it line by line. With Type=oneshot, the service only transitions from activating to active once the command in ExecStart= terminates; without this, the service would be marked as active as soon as the ExecStart= command is forked off. RemainAfterExit=true specifies that the service stays marked as active after the command terminates; if the value was false or missing, the unit would immediately transition to deactivating. Finally, the sleep 1 commands in ExecStart= and ExecStop= ensure that the unit stays in activating or deactivating for one second; by changing the sleep time, we can control how long the unit remains in either state.
Let’s see the states in action. We use the container image from Part 1 as a basis, add sleep.service to it, and start the container (see Part 1 for details on how to do this).
Note: Throughout the article, I will use the shell prompt $ to indicate a command executed on the host system
$ echo 'shell on the host'
and # to indicate a command executed on the container
# echo 'shell on the container'
To execute a command on the container, we can either start a shell on the container using
podman exec -it systemd /bin/bash
(where systemd is the name of the container) and execute the command in this shell, or we can prefix the command with podman exec systemd and execute it on the host system, as in
podman exec systemd echo 'shell on the container'
We can query the status of a unit with the systemctl command. If we execute
systemctl status sleep.service
immediately after the container started, we get
● sleep.service
Loaded: loaded (/lib/systemd/system/sleep.service; static; vendor preset: enabled)
Active: inactive (dead)
The information we are looking for is in the Active: row. As expected, the unit is inactive. systemd doesn’t activate a unit unless it is told to do so. This can happen either by explicitly activating the unit, or because the unit is a dependency (direct or indirect) of a unit that is activated. For now, let’s explicitly activate the unit and query the status immediately afterwards.
systemctl start sleep.service --no-block &&
systemctl status sleep.service
(--no-block instructs systemctl to return immediately and not wait for the unit to be activated); this produces
Active: activating (start) since Mon 2021-12-06 16:51:42 UTC; 207ms ago
(I’m omitting all lines except the Active: one). So the unit transitioned from the inactive state into the activating state. If we query the status again, we get
Active: active (exited) since Mon 2021-12-06 16:51:43 UTC; 21s ago
Going the other direction, we can stop the unit with the systemctl stop command. If we query the status immediately after stopping and then again at least one second later, we first get
Active: deactivating (stop) since Mon 2021-12-06 16:52:25 UTC; 243ms ago
and then
Active: inactive (dead)
So we went once through the circle of states and are back where we started. We can also see these transitions in journald.
journalctl --unit=sleep.service
shows
-- Logs begin at Mon 2021-12-06 16:51:11 UTC, end at Mon 2021-12-06 16:52:26 UTC. --
Dec 06 16:51:42 5d9adf02eef3 systemd[1]: Starting sleep.service...
Dec 06 16:51:43 5d9adf02eef3 systemd[1]: Finished sleep.service.
Dec 06 16:52:25 5d9adf02eef3 systemd[1]: Stopping sleep.service...
Dec 06 16:52:26 5d9adf02eef3 systemd[1]: sleep.service: Succeeded.
Dec 06 16:52:26 5d9adf02eef3 systemd[1]: Stopped sleep.service.
We started the service, and one second later it was finished (active). Then we stopped it, and one second later it was actually marked as stopped (inactive).
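The state cycle above can also be watched with a small polling helper. This is only a sketch: the function names active_state and watch_state are made up for illustration, and it assumes a systemctl that supports the --value option of systemctl show.

```shell
#!/usr/bin/env bash

# Print the ActiveState of a unit
# (inactive, activating, active, deactivating, or failed).
active_state() {
    systemctl show --property=ActiveState --value "$1"
}

# Poll the state of a unit COUNT times, half a second apart,
# and print a line whenever the state differs from the previous poll.
watch_state() {
    local unit=$1 count=${2:-6} prev="" cur
    for _ in $(seq "$count"); do
        cur=$(active_state "$unit")
        if [ "$cur" != "$prev" ]; then
            echo "$unit: $cur"
        fi
        prev=$cur
        sleep 0.5
    done
}
```

On the container, running systemctl start sleep.service --no-block followed by watch_state sleep.service should print the activating state and then the active state, in line with the status output above.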
This service allowed us to see four of the unit states. This leaves the failed state. We can get a unit into this state by extending our service slightly. A service of Type=oneshot allows us to specify multiple ExecStart= directives, which are executed one after the other. We keep the sleep 1 command to have a visible activating state, but then add a false command to follow it (false is a command from GNU coreutils; all it does is exit with status code 1).
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=sleep 1
ExecStart=false
ExecStop=sleep 1
If we start this service and query its status, we again first see the activating state, but after one second we’ll get the output
Active: failed (Result: exit-code) since Mon 2021-12-06 16:53:33 UTC; 4s ago
Similarly, if we add ExecStop=false instead, we get a unit that fails when it is stopped.
Now we have a way to control the different states of a unit, and we will use services like these throughout the rest of the article for our dependency experiments. Keep in mind though that dependencies can be defined between any unit types. We are just using services as a proxy; later we can translate the knowledge that we gain from these experiments to the general case.
Dependency types
systemd has two types of dependencies: requirement dependencies and ordering dependencies. Roughly speaking, the former specifies which other units should be started (or stopped) when activating a unit, and the latter specifies in which order to start them. When a unit is requested to be activated, systemd will use the requirement dependencies to compile a list of all dependencies that need to be started or stopped, and uses the ordering dependencies to determine the order in which the dependencies are started.
Requirement dependencies and ordering dependencies can be specified independently: a unit can require that another unit is activated alongside it without specifying an order of the startup, and vice versa. (The systemd man pages also say that the two types of dependencies are orthogonal. I think that’s slightly misleading, since ordering dependencies can change the behavior of requirement dependencies. But we’ll see that in more detail later.)
Ordering dependencies
Whenever systemd is instructed to activate two or more units, it uses ordering dependencies to decide which units to run first. By default, there is no ordering defined between two units, so the two units are started in parallel. This is actually one of the key features of systemd, as described in Rethinking PID 1: by parallelizing unit activation whenever possible, we can achieve a faster bootup.
Let’s start with two services, following the template we created above.
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=sleep 1
ExecStop=sleep 1
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=sleep 1
ExecStop=sleep 1
If we start them with
systemctl start a.service b.service --no-block
and then look at the logs, we see
Dec 06 16:54:32 c650d2a4ddc2 systemd[1]: Starting a.service...
Dec 06 16:54:32 c650d2a4ddc2 systemd[1]: Starting b.service...
Dec 06 16:54:33 c650d2a4ddc2 systemd[1]: Finished a.service.
Dec 06 16:54:33 c650d2a4ddc2 systemd[1]: Finished b.service.
Both services were started in parallel, and then finished simultaneously one second later.
Next, let’s define an ordering dependency between the two units. There are two directives, Before= and After=, and they are complementary: defining Before=b.service in a.service is the same as defining After=a.service in b.service; in fact, systemd adds the second directive automatically (you can confirm this by executing systemctl show b.service, which lists all directives of a unit, including the ones automatically generated by systemd).
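Since systemctl show prints a lot of properties, it can help to ask for the After= property only. The following is a minimal sketch; the helper name ordered_after is hypothetical, and it assumes that systemctl show supports --property= and --value and prints the unit names separated by spaces.

```shell
#!/usr/bin/env bash

# Succeeds if unit $1 lists unit $2 in its After= property,
# no matter whether the entry was written by hand or added by systemd.
ordered_after() {
    systemctl show --property=After --value "$1" \
        | tr ' ' '\n' \
        | grep -qxF "$2"
}
```

With Before=b.service defined only in a.service, calling ordered_after b.service a.service on the container should succeed, even though b.service itself contains no After= line.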
If we change a.service to look like
[Unit]
Before=b.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=sleep 1
ExecStop=sleep 1
and start both of them again, the logs show
Dec 06 16:55:24 0b5a08c4aea3 systemd[1]: Starting a.service...
Dec 06 16:55:25 0b5a08c4aea3 systemd[1]: Finished a.service.
Dec 06 16:55:25 0b5a08c4aea3 systemd[1]: Starting b.service...
Dec 06 16:55:26 0b5a08c4aea3 systemd[1]: Finished b.service.
So this time, b.service only started after a.service left the state activating. Note that it is not necessary that a.service transitions to active. Even if we remove the RemainAfterExit= directive (so that the service transitions directly from activating to deactivating) or if we add an ExecStart=false directive (so that the service transitions from activating to failed), b.service will be started.
When units are stopped, the order is reversed. Executing
systemctl stop b.service a.service --no-block
results in the logs
Dec 06 16:55:46 0b5a08c4aea3 systemd[1]: Stopping b.service...
Dec 06 16:55:47 0b5a08c4aea3 systemd[1]: b.service: Succeeded.
Dec 06 16:55:47 0b5a08c4aea3 systemd[1]: Stopped b.service.
Dec 06 16:55:47 0b5a08c4aea3 systemd[1]: Stopping a.service...
Dec 06 16:55:48 0b5a08c4aea3 systemd[1]: a.service: Succeeded.
Dec 06 16:55:48 0b5a08c4aea3 systemd[1]: Stopped a.service.
systemd first stops b.service, and once it has left the state deactivating, it stops a.service. (Note that we also reversed the order in the systemctl stop call: we specify b.service before a.service. This is necessary, but the reason is a bit subtle and has to do with transactions and the run queue. The systemctl stop command first creates a transaction to stop b.service, which subsequently inserts a job to stop b.service in the run queue. It then creates a second transaction to stop a.service, which creates a job to stop a.service. This job is queued after the stop job of b.service. If we change the order of the parameters, then the transactions are reversed. By the time the stop job for b.service is inserted into the run queue, the stop job for a.service is already running; there is no way to put anything before it, so the two stop jobs are run in parallel.)
Ordering dependencies define a DAG on the units. When systemd activates the units, it sorts the DAG topologically to define the startup order. In our example, we have a very simple DAG with only two nodes.
But we can also construct more involved examples, like in the following DAG (each arrow indicates a Before= relationship between the two nodes).
(You can also find the full example on GitHub.) The six services all follow the template of sleep.service above, except that a.service sleeps 3 seconds instead of one, and d.service sleeps 5 seconds.
If we start those six services, we expect a.service and b.service to be started in parallel. After one second, b.service will finish. This allows d.service to start, but c.service still needs to wait for a.service, which takes two more seconds. Another second, and c.service finishes and e.service starts, which also takes a second. Now f.service still needs to wait for d.service to complete, after which it can run. And indeed, that’s what happens according to the logs.
Dec 06 16:56:27 8d36e8af8fd3 systemd[1]: Starting a.service...
Dec 06 16:56:27 8d36e8af8fd3 systemd[1]: Starting b.service...
Dec 06 16:56:28 8d36e8af8fd3 systemd[1]: Finished b.service.
Dec 06 16:56:28 8d36e8af8fd3 systemd[1]: Starting d.service...
Dec 06 16:56:30 8d36e8af8fd3 systemd[1]: Finished a.service.
Dec 06 16:56:30 8d36e8af8fd3 systemd[1]: Starting c.service...
Dec 06 16:56:31 8d36e8af8fd3 systemd[1]: Finished c.service.
Dec 06 16:56:31 8d36e8af8fd3 systemd[1]: Starting e.service...
Dec 06 16:56:32 8d36e8af8fd3 systemd[1]: Finished e.service.
Dec 06 16:56:33 8d36e8af8fd3 systemd[1]: Finished d.service.
Dec 06 16:56:33 8d36e8af8fd3 systemd[1]: Starting f.service...
Dec 06 16:56:34 8d36e8af8fd3 systemd[1]: Finished f.service.
This already shows the power of dependencies. If we had started the services sequentially one after another, it would take 12 seconds until the last service was finished. Here, we only needed 7 seconds, while still ensuring that some services are only started when others have finished.
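The two numbers can be reproduced with a small calculation: every unit starts as soon as all units ordered before it have finished. The sketch below does this in bash (version 4 or later, for associative arrays). The sleep times come from the text, but the edge list is an assumption inferred from the walkthrough above (a and b before c, b before d, c before e, d and e before f), and the variable names are made up for illustration.

```shell
#!/usr/bin/env bash

# Sleep time of each service in seconds (from the text above).
declare -A dur=([a]=3 [b]=1 [c]=1 [d]=5 [e]=1 [f]=1)
# For each service, the services it is ordered after (assumed edges).
declare -A after=([a]="" [b]="" [c]="a b" [d]="b" [e]="c" [f]="d e")
declare -A finish=()

sequential=0
parallel=0
# a b c d e f is already a topological order of the graph.
for u in a b c d e f; do
    start=0
    for p in ${after[$u]}; do
        if (( ${finish[$p]} > start )); then
            start=${finish[$p]}
        fi
    done
    finish[$u]=$(( start + ${dur[$u]} ))
    sequential=$(( sequential + ${dur[$u]} ))
    if (( ${finish[$u]} > parallel )); then
        parallel=${finish[$u]}
    fi
    echo "$u.service: starts at ${start}s, finishes at ${finish[$u]}s"
done
echo "sequential: ${sequential}s, with ordering dependencies: ${parallel}s"
```

Under these assumptions the computed start and finish times line up with the log timestamps, counted from 16:56:27.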
Real life examples
That’s the theory behind ordering dependencies; now let’s look at some real world examples.
By default, journald keeps the journal in memory. That means that when a system reboots, all logs are lost. To avoid this, journald also has an option to write all previous and upcoming log messages to persistent storage. But since journald is used during boot-up, we cannot activate this option by default: journald is started even before persistent storage is available. The way systemd solves this is by having two service units: first, systemd-journald.service, which starts journald early in the boot process; and second, systemd-journald-flush.service, which flushes the existing journal to persistent storage and also redirects all future logs there. But flushing only makes sense when journald is already running, so systemd-journald-flush.service has an After= dependency on systemd-journald.service. It also requires that persistent storage is actually available, so it also has an After= dependency on systemd-remount-fs.service, which takes care of mounting the root and kernel file systems.
For another example, we’ll take a look at system shutdown and why it matters that ordering dependencies are reversed when units are deactivated. systemd has a special network.target, which indicates (that is, becomes active) when network functionality is available during bootup. One part of network functionality is working network name resolution, that is, translating domain names into IP addresses; systemd provides this name resolution through the binary systemd-resolved. This binary is started through a systemd-resolved.service unit, which has a Before= dependency on network.target. This ensures that when network.target is marked as active, name resolution is also active. Conversely, when the system is shut down and network.target is deactivated, systemd-resolved is stopped after network.target is marked inactive. So if we have some service that needs network access when the system is shut down (for example, the service might want to notify a remote service that it will not be available anymore), we can add an After= dependency on network.target. When all units are deactivated at system shutdown, this After= dependency is converted into a before relationship, meaning that our service is deactivated before network.target is deactivated. Similarly, the Before= dependency of systemd-resolved.service on network.target is converted into an after relationship, meaning that systemd-resolved.service deactivates after network.target deactivates. This ensures that we have working name resolution throughout our service shutdown. (Ensuring network functionality during shutdown is actually the primary reason for the existence of network.target. To ensure that a unit is only activated after the network is fully operational, there is another unit, network-online.target. See Running Services After the Network is up for a more detailed discussion.)
Requirement dependencies
Requirement dependencies are not as clear-cut as ordering dependencies. There are six different directives to declare requirement dependencies, all with different behavior. We will take a look at three of them, Wants=, Requires=, and Conflicts=, since they are the most commonly used ones. For more details on the other three, Requisite=, BindsTo=, and PartOf=, see the systemd.unit man page.
Wants=
The simplest of the requirement dependencies is Wants=. If it is defined in a unit file and this unit is started, any unit on the right-hand side that is currently inactive is started as well.
Let’s look at an example. We start with two services again. b.service (not shown here) is a simple copy of sleep.service, and a.service has an additional Wants= directive.
[Unit]
Wants=b.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=sleep 1
ExecStop=sleep 1
If we start a.service through
systemctl start a.service --no-block
then b.service is started as well.
Dec 06 16:57:17 9559906107e8 systemd[1]: Starting a.service...
Dec 06 16:57:17 9559906107e8 systemd[1]: Starting b.service...
Dec 06 16:57:18 9559906107e8 systemd[1]: Finished a.service.
Dec 06 16:57:18 9559906107e8 systemd[1]: Finished b.service.
Note that, since we didn’t specify any ordering dependency, the services are started in parallel. If we add, for example, After=b.service to a.service, then a.service is only started once b.service finishes activating.
Wants= is the least needy of all the requirement dependencies. It operates on a best effort basis in that it tries to start all units that are wanted. But it doesn’t require that these units successfully activate, or even that they exist at all. This is why the systemd docs recommend using this requirement dependency if possible; it makes the system more resilient against failures if the start-up of one unit is not strictly dependent on the start-up of another.
If any of the wanted units have requirement dependencies of their own, those units are activated as well, and so are their requirement dependencies, and so on. Let’s look at the DAG from above again, but now we also add a couple of Wants= directives, symbolized by the blue arrows (the full example is on GitHub).
If we only start f.service with
systemctl start f.service --no-block
then systemd sees that f.service wants e.service and d.service, so those are started as well. But e.service wants c.service, so that’s added to the list, and since c.service wants a.service and b.service, those are also added. In the end, all six services are started, and since we have additional ordering dependencies defined between them, they are started in the same order as above.
Note that if we only start e.service with
systemctl start e.service --no-block
then only a.service, b.service, c.service, and e.service are started, since they are connected through requirement dependencies. There are also ordering dependencies to d.service and f.service, but since they are not required by any of the units, they are not included in the list of services to activate. (Remember that requirement dependencies define what to start, and ordering dependencies define when to start it.)
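The list of units that get pulled in is simply everything reachable through Wants= from the unit we start. The following bash sketch (version 4 or later) computes it for the Wants= edges named in the text (f wants e and d, e wants c, c wants a and b). It is only an illustration of the idea, not how systemd implements it, and the names wants and pulled_in are hypothetical.

```shell
#!/usr/bin/env bash

# Wants= edges as described in the text above.
declare -A wants=([f]="e d" [e]="c" [c]="a b" [a]="" [b]="" [d]="")

# Print all units that are activated when unit $1 is started:
# the unit itself plus everything reachable through Wants=.
pulled_in() {
    local -A seen=()
    local queue=("$1") u w
    while (( ${#queue[@]} > 0 )); do
        u=${queue[0]}
        queue=("${queue[@]:1}")
        if [[ -n ${seen[$u]:-} ]]; then
            continue
        fi
        seen[$u]=1
        for w in ${wants[$u]}; do
            queue+=("$w")
        done
    done
    printf '%s\n' "${!seen[@]}" | sort | xargs
}

pulled_in e   # a b c e
pulled_in f   # a b c d e f
```

Ordering dependencies play no role in this step; they only decide when the collected units are started.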
Requires=
The next requirement dependency on our list is Requires=. Similarly to Wants=, any units on the right-hand side are activated whenever the defining unit is activated. But additionally, when there is also an After= dependency on the right-hand side unit, it must finish activating successfully, otherwise the defining unit will not be started. For example, define a.service as follows, and let b.service be a copy of fail-start.service defined above (the variant of sleep.service with the additional ExecStart=false directive).
[Unit]
Requires=b.service
After=b.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=sleep 1
ExecStop=sleep 1
Here are the logs after starting a.service.
Dec 06 16:57:53 5a5ba27212eb systemd[1]: Starting b.service...
Dec 06 16:57:54 5a5ba27212eb systemd[1]: b.service: Main process exited, code=exited, status=1/FAILURE
Dec 06 16:57:54 5a5ba27212eb systemd[1]: b.service: Failed with result 'exit-code'.
Dec 06 16:57:54 5a5ba27212eb systemd[1]: Failed to start b.service.
Dec 06 16:57:54 5a5ba27212eb systemd[1]: Dependency failed for a.service.
Dec 06 16:57:54 5a5ba27212eb systemd[1]: a.service: Job a.service/start failed with result 'dependency'.
As we expect, starting a.service pulled in b.service; since we defined the ordering dependency After=b.service, a.service waits for b.service to start. But b.service fails, and a.service has a Requires=b.service, so a.service is not started.
Note that the ordering dependency is important. Without it, the two units would be started in parallel, so by the time b.service fails, a.service would already be started.
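To try both variants, it can be convenient to generate the two unit files with a script. The following is a minimal sketch: the function name write_requires_example and the target directory are assumptions, and b.service is written as the failing variant of sleep.service with the additional ExecStart=false.

```shell
#!/usr/bin/env bash

# Write a.service and b.service for the Requires= experiment into DIR.
write_requires_example() {
    local dir=$1
    mkdir -p "$dir"
    # a.service requires b.service and is ordered after it.
    cat > "$dir/a.service" <<'EOF'
[Unit]
Requires=b.service
After=b.service

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=sleep 1
ExecStop=sleep 1
EOF
    # b.service sleeps for one second and then fails.
    cat > "$dir/b.service" <<'EOF'
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=sleep 1
ExecStart=false
ExecStop=sleep 1
EOF
}
```

After running write_requires_example ./units, the files can be added to the container image in the same way as sleep.service. Deleting the After=b.service line from a.service gives the variant without the ordering dependency.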
Another feature of Requires= is that when a unit on the right-hand side is explicitly stopped (for example through systemctl stop), then the defining unit is also stopped. But the right-hand side has to be stopped explicitly; if it deactivates on its own, for example, if it transitions directly from activating to deactivating, then the defining unit will not be deactivated. (As an exercise, see if you can create two examples which show this behavior.)
Conflicts=
The final requirement dependency we are taking a look at is Conflicts=. This is a negative dependency: whatever is on the right-hand side cannot be active when the defining unit is active. As usual, we let b.service be a copy of sleep.service, and we define a.service as follows.
[Unit]
Conflicts=b.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=sleep 1
ExecStop=sleep 1
We first start a.service, and once it’s active, we start b.service. Here are the journald logs starting from the point where we start b.service.
Dec 06 16:58:45 2abbf032aaff systemd[1]: Stopping a.service...
Dec 06 16:58:45 2abbf032aaff systemd[1]: Starting b.service...
Dec 06 16:58:46 2abbf032aaff systemd[1]: a.service: Succeeded.
Dec 06 16:58:46 2abbf032aaff systemd[1]: Stopped a.service.
Dec 06 16:58:46 2abbf032aaff systemd[1]: Finished b.service.
When we start b.service, systemd automatically stops a.service. We can now try to start a.service again, and we’ll see that b.service is stopped. The two units can never be active at the same time.
Note that systemd doesn’t wait for a.service to be stopped. Immediately after sending a stop signal to a.service, it starts b.service, so that the shutdown and start-up effectively happen in parallel. If we want a.service to fully stop before b.service starts, we need to define an ordering dependency as well. It doesn’t matter whether we define a Before= or an After= dependency: a shutdown is always ordered before a start-up.
Let’s look at one final example which ties together all requirement dependencies that we have seen so far.
The blue nodes are copies of sleep.service, while the red node of b.service is a copy of fail-after-sleep.service. The arrows denote the ordering and requirement dependencies defined between the different units. (Again, the full example is on GitHub.)
What will happen if we start d.service? Since d.service requires e.service, those two services will be started, and since there is no ordering dependency defined between them, they are started in parallel.
Next, what will happen if we start a.service (with d.service and e.service still active)? First of all, a.service requires b.service, so that one will be started; b.service wants c.service, so that’s added to the queue as well. Now c.service conflicts with e.service, so e.service will be stopped, and since d.service requires e.service, it will also be stopped. This clarifies which units are started or stopped; next we need to determine the order in which this happens. There is no ordering dependency defined between d.service and e.service, so they can be stopped in parallel. Similarly, there is no ordering dependency defined between b.service and c.service, so it seems like they could be started in parallel. However, there is an ordering dependency defined between c.service and e.service, so c.service will not start before e.service has stopped. So in the end, d.service and e.service will stop in parallel, while b.service starts (and then fails). c.service waits for e.service to stop and only starts afterwards. And finally, a.service, even though it was responsible for kicking off this whole cascade, won’t activate at all, since it requires b.service with an After= dependency and b.service failed.
Real life examples
Finally, let’s try to see how these requirement dependencies are used in real life.
We already saw that systemd-journald-flush.service has an After= dependency on systemd-journald.service. It also has a Requires= dependency on it, so that it won’t run unless journald is already running.
Next, let’s look at system bootup. sysinit.target is a special systemd unit. It has Wants= dependencies on a lot of units needed for system initialization; for example, a dependency on systemd-journald.service, which starts journald, or a dependency on systemd-modules-load.service, which is responsible for loading kernel modules. Note that it makes sense to use a Wants= dependency here. sysinit.target is a fundamental target during bootup, and in fact every service unit automatically has Requires= and After= dependencies on it (unless turned off with DefaultDependencies=no). So even if, for example, journald did not start successfully, we still want the bootup to continue. Otherwise, we could not even fix the problem. (If you check these examples yourself, you’ll notice that there is actually no Wants=systemd-journald.service line in sysinit.target. Instead, in /lib/systemd/system, there is a subdirectory sysinit.target.wants, and in it, a symlink to systemd-journald.service. This is an alternative way to specify the requirement dependencies, but it only works for Wants= and Requires= dependencies. We’ll see this in more detail in the next part of this series.)
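The directory layout described above is easy to reproduce by hand. The following is a rough sketch that creates such a .wants directory with a symlink in it; the function name add_wants is hypothetical, and using a relative link target is my own choice rather than something the text prescribes.

```shell
#!/usr/bin/env bash

# Create DIR/TARGET.wants/UNIT as a symlink to DIR/UNIT, which has the
# same effect as a Wants=UNIT line in the unit file of TARGET.
add_wants() {
    local dir=$1 target=$2 unit=$3
    mkdir -p "$dir/$target.wants"
    ln -sf "../$unit" "$dir/$target.wants/$unit"
}
```

For example, add_wants ./units default.target sleep.service creates ./units/default.target.wants/sleep.service pointing at ./units/sleep.service (both the directory and the choice of units are just placeholders here).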
Finally, let’s take a look at system shutdown. halt.target has Requires= and After= dependencies on systemd-halt.service, which does the actual shutdown (we did something similar for our minimal systemd setup in the previous part). systemd-halt.service in turn has Requires= and After= dependencies on shutdown.target. By default, every service unit gets automatic Conflicts= and Before= dependencies on shutdown.target.
This means that when halt.target is activated, this pulls in systemd-halt.service and in turn shutdown.target. The Conflicts= dependencies cause all services to shut down. The ordering dependencies ensure that first all services are shut down, then systemd-halt.service is executed, and then halt.target is marked as active.
Conclusion
systemd’s dependency system is quite elaborate and can be overwhelming, especially since some dependencies are automatically defined by systemd, and since there are different ways to define dependencies. In my opinion, the best way to deal with this is to look at many examples and to experiment with the features, trying out their default behavior and also their edge cases. In this article, we have seen some ways to do this for the most common dependencies. As always, you learn a lot more by trying things yourself than by just reading an article, so I’d encourage you to conduct your own experiments. (If you need inspiration for what to do, you could create examples for the three requirement dependencies that we didn’t cover here; or you could introduce cycles in the dependency graph so that it’s no longer a DAG and see how systemd deals with this.)
So far, we have used service units mostly as a tool to control the states of a unit. In the next part of this series, I’m planning to take a closer look at service units and their real world usage.
—Written by Sebastian Jambor. Follow me on Mastodon @crepels@mastodon.social for updates on new blog posts.