December 7, 2021

systemd by example

Part 2: Dependencies

Series overview

This article is part of the series systemd by example.

Introduction

This is the second article in a series where I try to understand systemd by creating small containerized examples. In Part 1, we created a minimal systemd setup in a container. We are now using this setup to continue our investigation of systemd, starting with taking a closer look at dependencies.

Dependencies are fundamental to systemd. We already saw some of this in action in the first part: to shut down the system when we reach the halt target, we had to create a service which does the actual shutdown and add it as a dependency to the target. Similarly, we added journald as a dependency to the default target to ensure that it is started when the container starts. But there is a lot more to dependency management.

As I mentioned in the first part, the best way for me to get comfortable with new and potentially complex topics is through experimentation and examples. In this article, we’ll approach the topic of dependencies in this way, again by creating small examples in containers.

New: All the examples in this post are now available on systemd-by-example.com! (Learn more about this in systemd by example - The Playground)

The states of a unit

In order to understand how dependencies work, we first have to understand the different states of a unit. There are five states that a unit can be in: inactive, activating, active, deactivating, or failed. By default, units are inactive. When systemd starts a unit, the state changes to activating, and once startup finishes it is marked as active. It then stays in this state until it stops, either by itself (for example, if the executable of a service terminates) or because it is instructed by systemd to stop. The state then changes to deactivating, followed by inactive once shutdown finishes. And if anything goes wrong, the state is set to failed.

A diagram showing the five unit states and how they are connected
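To make the state cycle a bit more tangible, here is a minimal Python sketch of it. The transition table is my own simplification based on the description in this article, not systemd's internal state machine; in particular, the assumptions that a failed unit can be started again and that a unit can go straight from activating to deactivating are modelled here only for illustration.

```python
from enum import Enum


class UnitState(Enum):
    INACTIVE = "inactive"
    ACTIVATING = "activating"
    ACTIVE = "active"
    DEACTIVATING = "deactivating"
    FAILED = "failed"


# Simplified transition table (an assumption for illustration only).
TRANSITIONS = {
    UnitState.INACTIVE: {UnitState.ACTIVATING},
    UnitState.ACTIVATING: {
        UnitState.ACTIVE,
        UnitState.DEACTIVATING,  # e.g. a oneshot service without RemainAfterExit=
        UnitState.FAILED,
    },
    UnitState.ACTIVE: {UnitState.DEACTIVATING, UnitState.FAILED},
    UnitState.DEACTIVATING: {UnitState.INACTIVE, UnitState.FAILED},
    UnitState.FAILED: {UnitState.ACTIVATING},
}


def is_valid_path(states):
    """Return True if every consecutive pair of states is an allowed transition."""
    return all(b in TRANSITIONS[a] for a, b in zip(states, states[1:]))


# The full circle: start a unit, let it become active, stop it again.
full_circle = [
    UnitState.INACTIVE,
    UnitState.ACTIVATING,
    UnitState.ACTIVE,
    UnitState.DEACTIVATING,
    UnitState.INACTIVE,
]
print(is_valid_path(full_circle))
```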

These states exist for every unit, and they play an important role when systemd applies dependencies. For our experiments, we need a way to control these states. One way to do this is through service units, like the following sleep.service.

Follow along on the systemd playground Example 1: The example services
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=sleep 1
ExecStop=sleep 1
sleep.service

Let’s go through it line by line. With Type=oneshot the service only transitions from activating to active once the command in ExecStart= terminates; without this, the service would be marked as active as soon as the ExecStart= command is forked off. RemainAfterExit=true specifies that the service is indeed marked as active when the command terminates; if the value were false or missing, the unit would immediately transition to deactivating. Finally, the sleep 1 commands in ExecStart= and ExecStop= ensure that the unit stays in activating or deactivating for one second; by changing the sleep time we can control how long the unit will be in the activating or deactivating state.

Let’s see the states in action. We use the container image from Part 1 as a basis, add sleep.service to it, and start the container (see Part 1 for details on how to do this).

Note: Throughout the article, I will use the shell prompt $ to indicate a command executed on the host system

$ echo 'shell on the host'

and # to indicate a command executed on the container

# echo 'shell on the container'

To execute a command on the container, we can either start a shell on the container using

podman exec -it systemd /bin/bash

(where systemd is the name of the container) and execute the command in this shell, or we can prefix the command with podman exec systemd and execute it on the host system, as in

podman exec systemd echo 'shell on the container'

We can query the status of a unit with the systemctl command. If we execute

systemctl status sleep.service

immediately after the container started, we get

● sleep.service
     Loaded: loaded (/lib/systemd/system/sleep.service; static; vendor preset: enabled)
     Active: inactive (dead)

The information we are looking for is in the Active: row. As expected, the unit is inactive. systemd doesn’t activate a unit unless it is told to do so. This can happen either by explicitly activating the unit, or because the unit is a dependency (direct or indirect) of a unit that is activated. For now, let’s explicitly activate the unit and query the status immediately afterwards.

systemctl start sleep.service --no-block &&
    systemctl status sleep.service

(--no-block instructs systemctl to return immediately and not wait for the unit to be activated); this produces

Active: activating (start) since Mon 2021-12-06 16:51:42 UTC; 207ms ago

(I’m omitting all lines except the Active: one). So the unit transitioned from the inactive into the activating state. If we query the status again, we get

Active: active (exited) since Mon 2021-12-06 16:51:43 UTC; 21s ago

Going the other direction, we can stop the unit with the systemctl stop command, and if we query the status immediately after stopping and then at least one second later we first get

Active: deactivating (stop) since Mon 2021-12-06 16:52:25 UTC; 243ms ago

and then

Active: inactive (dead)

So we went once through the circle of states and are back where we started. We can also see these transitions in journald.

journalctl --unit=sleep.service

shows

-- Logs begin at Mon 2021-12-06 16:51:11 UTC, end at Mon 2021-12-06 16:52:26 UTC. --
Dec 06 16:51:42 5d9adf02eef3 systemd[1]: Starting sleep.service...
Dec 06 16:51:43 5d9adf02eef3 systemd[1]: Finished sleep.service.
Dec 06 16:52:25 5d9adf02eef3 systemd[1]: Stopping sleep.service...
Dec 06 16:52:26 5d9adf02eef3 systemd[1]: sleep.service: Succeeded.
Dec 06 16:52:26 5d9adf02eef3 systemd[1]: Stopped sleep.service.

We started the service, and one second later it was finished (active). Then we stopped it, and one second later it was actually marked as stopped (inactive).

This service allowed us to see four of the unit states. This leaves the failed state. We can get a unit into this state by extending our service slightly. A service of Type=oneshot allows us to specify multiple ExecStart= directives, which are executed one after the other. We keep the sleep 1 command to have a visible activating state, but then also add a false command to follow it (false is a command from GNU coreutils; all it does is exit with status code 1).

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=sleep 1
ExecStart=false
ExecStop=sleep 1
fail-start.service

If we start this service and query its status, we also first see the activating state, but after one second we’ll get the output

Active: failed (Result: exit-code) since Mon 2021-12-06 16:53:33 UTC; 4s ago

Similarly, if we add ExecStop=false instead, we get a unit that fails when it is stopped.
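The start-up behavior of these oneshot services can be summarized in a small Python model. This is only a sketch of what we observed above, not how systemd implements it: the function name run_oneshot and its parameters are made up for this illustration, and the exit codes are passed in as plain numbers instead of running real commands.

```python
def run_oneshot(exec_start_codes, remain_after_exit):
    """Return the states a oneshot service passes through when it is started.

    exec_start_codes: exit codes of the ExecStart= commands, in order
    (hypothetical input; a real service runs the commands itself).
    remain_after_exit: the value of RemainAfterExit=.
    """
    states = ["inactive", "activating"]
    for code in exec_start_codes:
        if code != 0:
            # The first failing command ends the start-up;
            # the remaining commands are not executed.
            states.append("failed")
            return states
    if remain_after_exit:
        states.append("active")
    else:
        # Without RemainAfterExit=true the unit does not stay active.
        states.extend(["deactivating", "inactive"])
    return states


print(run_oneshot([0], remain_after_exit=True))      # like sleep.service
print(run_oneshot([0, 1], remain_after_exit=True))   # like fail-start.service
```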

Now we have a way to control the different states of a unit, and we will use services like these throughout the rest of the article for our dependency experiments. Keep in mind though that dependencies can be defined between any unit types. We are just using services as a proxy; later we can translate the knowledge that we gain from these experiments to the general case.

Dependency types

systemd has two types of dependencies: requirement dependencies and ordering dependencies. Roughly speaking, the former specify which other units should be started (or stopped) when activating a unit, and the latter specify in which order to start them. When a unit is requested to be activated, systemd will use the requirement dependencies to compile a list of all units that need to be started or stopped, and use the ordering dependencies to determine the order in which they are started.

Requirement dependencies and ordering dependencies can be specified independently: a unit can require that another unit is activated alongside it without specifying an order of the startup, and vice versa. (The systemd man pages also say that the two types of dependencies are orthogonal. I think that’s slightly misleading, since ordering dependencies can change the behavior of requirement dependencies. But we’ll see that in more detail later.)

Ordering dependencies

Whenever systemd is instructed to activate two or more units, it uses ordering dependencies to decide which units to run first. By default, there is no ordering defined between two units, so the two units are started in parallel. This is actually one of the key features of systemd, as described in Rethinking PID 1: by parallelizing unit activation whenever possible, we can achieve a faster bootup.

Let’s start with two services, following the template we created above.

Follow along on the systemd playground Example 2: Two services without ordering dependencies
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=sleep 1
ExecStop=sleep 1
a.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=sleep 1
ExecStop=sleep 1
b.service

If we start them with

systemctl start a.service b.service --no-block

and then look at the logs, we see

Dec 06 16:54:32 c650d2a4ddc2 systemd[1]: Starting a.service...
Dec 06 16:54:32 c650d2a4ddc2 systemd[1]: Starting b.service...
Dec 06 16:54:33 c650d2a4ddc2 systemd[1]: Finished a.service.
Dec 06 16:54:33 c650d2a4ddc2 systemd[1]: Finished b.service.

Both services were started in parallel, and then finished simultaneously one second later.

Next, let’s define an ordering dependency between the two units. There are two directives, Before= and After=, and they are complementary: defining Before=b.service in a.service is the same as defining After=a.service in b.service; in fact, systemd adds the second directive automatically (you can confirm this by executing systemctl show b.service, which lists all directives of a unit, including the ones automatically generated by systemd).
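One way to think about the two directives is that both describe the same kind of edge, "this unit comes before that unit", just written down from different ends. The following Python sketch illustrates this; the dictionary layout and the function name ordering_edges are my own invention for this example and have nothing to do with how systemd stores unit files.

```python
def ordering_edges(units):
    """Collect (first, second) pairs meaning 'first is ordered before second'.

    units maps a unit name to a dict with optional "Before" and "After" lists.
    """
    edges = set()
    for name, directives in units.items():
        for other in directives.get("Before", []):
            edges.add((name, other))
        for other in directives.get("After", []):
            edges.add((other, name))
    return edges


# Before=b.service in a.service ...
variant_1 = {"a.service": {"Before": ["b.service"]}, "b.service": {}}
# ... describes the same edge as After=a.service in b.service.
variant_2 = {"a.service": {}, "b.service": {"After": ["a.service"]}}

print(ordering_edges(variant_1) == ordering_edges(variant_2))  # prints True
```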

Follow along on the systemd playground Example 3: Two services with an ordering dependency

If we change a.service to look like

[Unit]
Before=b.service

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=sleep 1
ExecStop=sleep 1
a.service

and start both of them again, the logs show

Dec 06 16:55:24 0b5a08c4aea3 systemd[1]: Starting a.service...
Dec 06 16:55:25 0b5a08c4aea3 systemd[1]: Finished a.service.
Dec 06 16:55:25 0b5a08c4aea3 systemd[1]: Starting b.service...
Dec 06 16:55:26 0b5a08c4aea3 systemd[1]: Finished b.service.

So this time, b.service only started after a.service left the state activating. Note that it is not necessary that a.service transitions to active. Even if we remove the RemainAfterExit= directive (so that the service transitions directly from activating to deactivating) or if we add an ExecStart=false directive (so that the service transitions from activating to failed), b.service will be started.

When units are stopped, the order is reversed. Executing

systemctl stop b.service a.service --no-block

results in the logs

Dec 06 16:55:46 0b5a08c4aea3 systemd[1]: Stopping b.service...
Dec 06 16:55:47 0b5a08c4aea3 systemd[1]: b.service: Succeeded.
Dec 06 16:55:47 0b5a08c4aea3 systemd[1]: Stopped b.service.
Dec 06 16:55:47 0b5a08c4aea3 systemd[1]: Stopping a.service...
Dec 06 16:55:48 0b5a08c4aea3 systemd[1]: a.service: Succeeded.
Dec 06 16:55:48 0b5a08c4aea3 systemd[1]: Stopped a.service.

systemd first stops b.service, and once it has left the state deactivating, it stops a.service. (Note that we also reversed the order in the systemctl stop call: we specify b.service before a.service. This is necessary, but the reason is a bit subtle and has to do with transactions and the run queue. The systemctl stop command first creates a transaction for b.service, which inserts a job to stop b.service into the run queue. It then creates a second transaction for a.service, which creates a job to stop a.service. This job is queued after the stop job of b.service. If we change the order of the parameters, then the transactions are reversed. By the time the stop job for b.service is inserted into the run queue, the stop job for a.service is already running; there is no way to put anything before it, so the two stop jobs are run in parallel.)

Ordering dependencies define a directed acyclic graph (DAG) on the units. When systemd activates the units, it sorts the DAG topologically to define the startup order. In our example, we have a very simple DAG with only two nodes.

A DAG with two nodes and one edge
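Topological sorting is easy to try out in Python, since the standard library ships a graphlib module (Python 3.9 or newer). The sketch below is only an illustration of the idea, not of systemd's actual job scheduling: a topological order is a flat list, while systemd also starts independent units in parallel. The helper names start_order and stop_order are made up for this example.

```python
from graphlib import TopologicalSorter


def start_order(before_edges):
    """Return one valid start-up order for a list of (first, second) edges."""
    sorter = TopologicalSorter()
    for first, second in before_edges:
        # 'second' may only start once 'first' has finished activating.
        sorter.add(second, first)
    return list(sorter.static_order())


def stop_order(before_edges):
    """When units are stopped, the order is reversed."""
    return list(reversed(start_order(before_edges)))


edges = [("a.service", "b.service")]  # Before=b.service in a.service
print(start_order(edges))
print(stop_order(edges))
```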

But we can also construct more involved examples, like in the following DAG (each arrow indicates a Before= relationship between the two nodes).

A DAG with six nodes and various edges between them
Follow along on the systemd playground Example 4: Multiple units with ordering dependencies

(You can also find the full example on GitHub.) The six services all follow the template of sleep.service above, except that a.service sleeps 3 seconds instead of one, and d.service sleeps 5 seconds.

If we start those six services, we expect a.service and b.service to be started in parallel. After one second, b.service will finish. This allows d.service to start, but c.service still needs to wait for a.service, which takes two more seconds. One second later, c.service finishes and e.service starts, which also takes a second. Now f.service still needs to wait for d.service to complete, after which it can run. And indeed, that’s what happens according to the logs.

Dec 06 16:56:27 8d36e8af8fd3 systemd[1]: Starting a.service...
Dec 06 16:56:27 8d36e8af8fd3 systemd[1]: Starting b.service...
Dec 06 16:56:28 8d36e8af8fd3 systemd[1]: Finished b.service.
Dec 06 16:56:28 8d36e8af8fd3 systemd[1]: Starting d.service...
Dec 06 16:56:30 8d36e8af8fd3 systemd[1]: Finished a.service.
Dec 06 16:56:30 8d36e8af8fd3 systemd[1]: Starting c.service...
Dec 06 16:56:31 8d36e8af8fd3 systemd[1]: Finished c.service.
Dec 06 16:56:31 8d36e8af8fd3 systemd[1]: Starting e.service...
Dec 06 16:56:32 8d36e8af8fd3 systemd[1]: Finished e.service.
Dec 06 16:56:33 8d36e8af8fd3 systemd[1]: Finished d.service.
Dec 06 16:56:33 8d36e8af8fd3 systemd[1]: Starting f.service...
Dec 06 16:56:34 8d36e8af8fd3 systemd[1]: Finished f.service.

This already shows the power of dependencies. If we had started the services sequentially one after another, it would take 12 seconds until the last service was finished. Here, we only needed 7 seconds, while still ensuring that some services are only started when others have finished.
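We can double-check these numbers with a small simulation. Note that the edges in PREDECESSORS are an assumption: I inferred them from the log output and the walk-through above rather than copying them from the unit files. The model itself is also simplified, in that every unit starts as soon as all units ordered before it have finished.

```python
from graphlib import TopologicalSorter

# Start-up durations in seconds, as described above.
DURATION = {"a": 3, "b": 1, "c": 1, "d": 5, "e": 1, "f": 1}

# Assumed ordering edges (inferred from the logs): each unit maps to the
# units that have to finish activating before it may start.
PREDECESSORS = {
    "a": [],
    "b": [],
    "c": ["a", "b"],
    "d": ["b"],
    "e": ["c"],
    "f": ["d", "e"],
}


def schedule(duration, predecessors):
    """Compute start and finish times if every unit starts as early as possible."""
    start, finish = {}, {}
    for unit in TopologicalSorter(predecessors).static_order():
        start[unit] = max((finish[p] for p in predecessors[unit]), default=0)
        finish[unit] = start[unit] + duration[unit]
    return start, finish


start, finish = schedule(DURATION, PREDECESSORS)
print(max(finish.values()))    # parallel start-up: 7 seconds
print(sum(DURATION.values()))  # strictly sequential start-up: 12 seconds
```

The computed start times (d after 1 second, c after 3, e after 4, f after 6) line up with the timestamps in the logs.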

Real life examples

That’s the theory behind ordering dependencies; now let’s look at some real world examples.

By default, journald keeps the journal in memory. That means that when a system reboots, all logs are lost. To avoid this, journald also has an option to write all previous and upcoming log messages to persistent storage. But since journald is used during boot-up, we cannot activate this option by default: journald is started even before persistent storage is available. The way systemd solves this is by having two service units. First systemd-journald.service which starts journald early in the boot process; and second systemd-journald-flush.service, which flushes the existing journal to persistent storage and also redirects all future logs there. But flushing only makes sense when journald is already running, so systemd-journald-flush.service has an After= dependency on systemd-journald.service. It also requires that there is actually persistent storage available, so it also has an After= dependency on systemd-remount-fs.service, which takes care of mounting the root and kernel file systems.

For another example, we’ll take a look at system shutdown and the importance of ordering dependencies being reversed when units are deactivated. systemd has a special network.target, which indicates (that is, becomes active) when network functionality is available during bootup. One part of network functionality is working network name resolution, that is, translating domain names into IP addresses; systemd provides this name resolution through the binary systemd-resolved. This binary is started through a systemd-resolved.service unit, which has a Before= dependency on network.target. This ensures that when network.target is marked as active, name resolution is also active. Conversely, when the system is shut down and network.target is deactivated, systemd-resolved is stopped after network.target is marked inactive. So if we have some service that needs network access when the system is shut down (for example, the service might want to notify a remote service to let it know that it will not be available anymore), we can add an After= dependency on network.target. When at system shutdown all units are deactivated, this After= dependency is converted into a before relationship, meaning that our service is deactivated before network.target is deactivated. Similarly, the dependency of systemd-resolved.service on network.target is converted into an after relationship, meaning that systemd-resolved.service deactivates after network.target deactivates. This ensures that we have working name resolution throughout our service shutdown. (Ensuring network functionality during shutdown is actually the primary reason for the existence of network.target. To ensure that a unit is only activated after the network is fully operational, there is another unit network-online.target. See Running Services After the Network is up for a more detailed discussion.)

Requirement dependencies

Requirement dependencies are not as clear-cut as ordering dependencies. There are six different directives to declare requirement dependencies, all with different behavior. We will take a look at three of them, Wants=, Requires=, and Conflicts=, since they are the most commonly used ones. For more details on the other three, Requisite=, BindsTo=, and PartOf= see the systemd.unit man page.

Wants=

The simplest of the requirement dependencies is Wants=. If it is defined in a unit file and this unit is started, any unit on the right-hand side that is currently inactive is started as well.

Let’s look at an example. We start with two services again. b.service (not shown here) is a simple copy of sleep.service, and a.service has an additional Wants= directive.

Follow along on the systemd playground Example 5: Two services with a requirement dependency
[Unit]
Wants=b.service

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=sleep 1
ExecStop=sleep 1
a.service

If we start a.service through

systemctl start a.service --no-block

then b.service is started as well

Dec 06 16:57:17 9559906107e8 systemd[1]: Starting a.service...
Dec 06 16:57:17 9559906107e8 systemd[1]: Starting b.service...
Dec 06 16:57:18 9559906107e8 systemd[1]: Finished a.service.
Dec 06 16:57:18 9559906107e8 systemd[1]: Finished b.service.

Note that, since we didn’t specify any ordering dependency, the services are started in parallel. If we add for example After=b.service, then a.service is only started once b.service finishes activating.

Wants= is the least needy of all the requirement dependencies. It operates on a best effort basis in that it tries to start all units that are wanted. But it doesn’t require that these units successfully activate, or even that they exist at all. This is why the systemd docs recommend using this requirement dependency if possible; it makes the system more resilient against failures if the start-up of one unit is not strictly dependent on the start-up of another.

If any of the wanted units have requirement dependencies of their own, those units are activated as well, and so are their requirement dependencies, and so on. Let’s look at the DAG from above again, but now we also add a couple of Wants= directives, symbolized by the blue arrows (the full example is on GitHub).

A DAG with six nodes and various Before= and Wants= edges between them

If we only start f.service with

systemctl start f.service --no-block

then systemd sees that f.service wants e.service and d.service, so those are started as well. But e.service wants c.service, so that’s added to the list, and since c.service wants a.service and b.service, those are also added. In the end, all six services are started, and since we have additional ordering dependencies defined between them, they are started in the same order as above.

Note that if we only start e.service with

systemctl start e.service --no-block

then only a.service, b.service, c.service, and e.service are started, since they are connected through requirement dependencies. There are also ordering dependencies to d.service and f.service, but since they are not required by any of the units, they are not included in the list of services to activate. (Remember that requirement dependencies define what to start, and ordering dependencies define when to start it.)
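Collecting the units to start is a plain graph traversal along the Wants= edges. Here is a Python sketch of it; the WANTS table is read off the walk-through above (I'm assuming that a.service, b.service, and d.service don't want anything themselves), and the function name units_to_start is made up for this example.

```python
# Wants= edges as described in the walk-through above (partly assumed).
WANTS = {
    "f": ["e", "d"],
    "e": ["c"],
    "c": ["a", "b"],
    "a": [],
    "b": [],
    "d": [],
}


def units_to_start(unit, wants):
    """Follow Wants= edges transitively and return every unit that is pulled in."""
    pulled_in = set()
    stack = [unit]
    while stack:
        current = stack.pop()
        if current in pulled_in:
            continue
        pulled_in.add(current)
        stack.extend(wants.get(current, []))
    return pulled_in


print(sorted(units_to_start("f", WANTS)))  # all six services
print(sorted(units_to_start("e", WANTS)))  # a, b, c, and e only
```

Ordering dependencies play no role in this step; they only come in afterwards, when the collected units are put in order.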

Requires=

The next requirement dependency on our list is Requires=. Similarly to Wants=, any units on the right-hand side are activated whenever the defining unit is activated. But additionally, when there is also an After= dependency on the right-hand side unit, that unit must finish activating successfully, otherwise the defining unit will not be started. For example, define a.service as follows, and let b.service be a copy of fail-start.service defined above.

Follow along on the systemd playground Example 7: Requiring a failing service
[Unit]
Requires=b.service
After=b.service

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=sleep 1
ExecStop=sleep 1
a.service

Here are the logs after starting a.service.

Dec 06 16:57:53 5a5ba27212eb systemd[1]: Starting b.service...
Dec 06 16:57:54 5a5ba27212eb systemd[1]: b.service: Main process exited, code=exited, status=1/FAILURE
Dec 06 16:57:54 5a5ba27212eb systemd[1]: b.service: Failed with result 'exit-code'.
Dec 06 16:57:54 5a5ba27212eb systemd[1]: Failed to start b.service.
Dec 06 16:57:54 5a5ba27212eb systemd[1]: Dependency failed for a.service.
Dec 06 16:57:54 5a5ba27212eb systemd[1]: a.service: Job a.service/start failed with result 'dependency'.

As we expect, starting a.service pulled in b.service; since we defined the ordering dependency After=b.service, a.service waits for b.service to start. But b.service fails, and a.service has a Requires=b.service, so a.service is not started.

Note that the ordering dependency is important. Without it, the two units would be started in parallel, so if b.service fails, a.service would already be started.
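The interplay between Requires= and After= can be boiled down to a very small rule. The following Python sketch is a deliberately simplified model of what we observed in this example, with made-up names; it ignores everything else that systemd takes into account when running a start job.

```python
def start_result(requires, after, failed_units):
    """Return the outcome of a start job in this simplified model.

    A required unit only blocks the start if we also wait for it (After=)
    and it failed to activate.
    """
    blocking = set(requires) & set(after) & set(failed_units)
    return "dependency" if blocking else "started"


failed = {"b.service"}

# Requires=b.service and After=b.service: a.service is not started.
print(start_result(["b.service"], ["b.service"], failed))

# Requires=b.service only: both units start in parallel,
# so the failure of b.service comes too late to prevent the start.
print(start_result(["b.service"], [], failed))
```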

Another feature of Requires= is that when a unit on the right-hand side is explicitly stopped (for example through systemctl stop), then the defining unit is also stopped. But the right-hand side has to be stopped explicitly; if it deactivates on its own, for example, if it transitions directly from activating to deactivating, then the defining unit will not be deactivated. (As an exercise, see if you can create two examples which show this behavior.)

Conflicts=

The final requirement dependency we are taking a look at is Conflicts=. This is a negative dependency: whatever is on the right-hand side cannot be active when the defining unit is active. As usual, we let b.service be a copy of sleep.service, and we define a.service as follows.

Follow along on the systemd playground Example 8: Two conflicting services
[Unit]
Conflicts=b.service

[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=sleep 1
ExecStop=sleep 1
a.service

We first start a.service, and once it’s active, we start b.service. Here are the journald logs starting from the point where we start b.service.

Dec 06 16:58:45 2abbf032aaff systemd[1]: Stopping a.service...
Dec 06 16:58:45 2abbf032aaff systemd[1]: Starting b.service...
Dec 06 16:58:46 2abbf032aaff systemd[1]: a.service: Succeeded.
Dec 06 16:58:46 2abbf032aaff systemd[1]: Stopped a.service.
Dec 06 16:58:46 2abbf032aaff systemd[1]: Finished b.service.

When we start b.service, systemd automatically stops a.service. We can now try to start a.service again, and we’ll see that b.service is stopped. The two units can never be active at the same time.

Note that systemd doesn’t wait for a.service to be stopped. Immediately after it begins stopping a.service it starts b.service, so that the shutdown and start-up effectively happen in parallel. If we want a.service to fully stop before b.service starts, we need to define an ordering dependency as well. It doesn’t matter whether we define a Before= or an After= dependency: a shutdown is always ordered before a start-up.
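Here is a small Python sketch of which jobs result from starting a unit when Conflicts= is involved. It only models the behavior seen in this example: the function name plan_jobs and the data layout are hypothetical, and the real job handling in systemd is more involved.

```python
def plan_jobs(unit_to_start, conflicts, active):
    """Return a dict {unit: "start" or "stop"} for starting unit_to_start.

    conflicts: list of (defining unit, conflicting unit) pairs.
    active: set of units that are currently active.
    """
    jobs = {unit_to_start: "start"}
    for first, second in conflicts:
        # As seen above, a Conflicts= declaration works in both directions.
        if first == unit_to_start and second in active:
            jobs[second] = "stop"
        elif second == unit_to_start and first in active:
            jobs[first] = "stop"
    return jobs


CONFLICTS = [("a.service", "b.service")]  # Conflicts=b.service in a.service

print(plan_jobs("b.service", CONFLICTS, active={"a.service"}))
print(plan_jobs("a.service", CONFLICTS, active={"b.service"}))
```

Without an ordering dependency between the two units, nothing forces the stop job to finish before the start job begins, which is why the logs show both happening at the same time.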

Let’s look at one final example which ties together all requirement dependencies that we have seen so far.

Follow along on the systemd playground Example 9: All requirement dependencies
A DAG with five nodes, connected through Wants=, Requires=, and Conflicts= dependencies

The blue nodes are copies of sleep.service, while the red node of b.service is a copy of fail-start.service from above. The arrows denote the ordering and requirement dependencies defined between the different units. (Again, the full example is on GitHub.)

What will happen if we start d.service? Since d.service requires e.service, those two services will be started, and since there is no ordering dependency defined between them, they are started in parallel.

Next, what will happen if we start a.service (with d.service and e.service still active)? First of all, a.service requires b.service, so that one will be started; b.service wants c.service, so that’s added to the queue as well. Now c.service conflicts with e.service, so e.service will be stopped, and since d.service requires e.service, it will also be stopped. This clarifies which units are started or stopped; next we need to determine the order in which this happens. There is no ordering dependency defined between d.service and e.service, so they can be stopped in parallel. Similarly, there is no ordering dependency defined between b.service and c.service, so it seems like they could be started in parallel. However, there is an ordering dependency defined between c.service and e.service, so c.service will not start before e.service has stopped. So in the end, d.service and e.service will stop in parallel, while b.service starts (and then fails). c.service waits for e.service to stop and only starts afterwards. And finally, a.service, even though it was responsible for kicking off this whole cascade, won’t activate at all since it requires b.service with an After= dependency and b.service failed.

Real life examples

Finally, let’s try to see how these requirement dependencies are used in real life.

We already saw that systemd-journald-flush.service has an After= dependency on systemd-journald.service. It also has a Requires= dependency on it, so that it won’t run unless journald is already running.

Next, let’s look at system bootup. sysinit.target is a special systemd unit. It has Wants= dependencies on a lot of units needed for system initialization; for example, a dependency on systemd-journald.service which starts journald; or a dependency on systemd-modules-load.service which is responsible for loading kernel modules. Note that it makes sense to use a Wants= dependency here. sysinit.target is a fundamental target during bootup, and in fact every service unit automatically has Requires= and After= dependencies on it (unless turned off with DefaultDependencies=no). So even if for example journald did not start successfully, we still want the bootup to continue. Otherwise, we could not even fix the problem. (If you check these examples yourself, you’ll notice that there is actually no Wants=systemd-journald.service line in sysinit.target. Instead in /lib/systemd/system, there is a subdirectory sysinit.target.wants, and in it, a symlink to systemd-journald.service. This is an alternative way to specify the requirement dependencies, but it only works for Wants= and Requires= dependencies. We’ll see this in more detail in the next part of this series.)

Finally, let’s take a look at system shutdown. halt.target has Requires= and After= dependencies on systemd-halt.service which does the actual shutdown (we did something similar for our minimal systemd setup in the previous part). systemd-halt.service in turn has Requires= and After= dependencies on shutdown.target. By default, every service unit gets automatic Conflicts= and Before= dependencies on shutdown.target.

A DAG showing requirement and ordering dependencies between halt.target, systemd-halt.service, shutdown.target, and three example services

This means that when halt.target is activated, this pulls in systemd-halt.service and in turn shutdown.target. The Conflicts= dependencies cause all services to shut down. The ordering dependencies ensure that first all services are shut down, then systemd-halt.service is executed, and then halt.target is marked as active.

Conclusion

systemd’s dependency system is quite elaborate and can be overwhelming, especially since some dependencies are automatically defined by systemd, and since there are different ways to define dependencies. In my opinion, the best way to deal with this is to look at many examples and to experiment with the features, trying out their default behavior and also their edge cases. In this article, we have seen some ways to do this for the most common dependencies. As always, you learn a lot more by trying things yourself than by just reading an article, so I’d encourage you to conduct your own experiments. (If you need inspiration for what to do, you could create examples for the three requirement dependencies that we didn’t cover here; or you could introduce cycles in the dependency graph so that it’s no longer a DAG and see how systemd deals with this.)

So far, we have used service units mostly as a tool to control the states of a unit. In the next part of this series, I’m planning to take a closer look at service units and their real world usage.

—Written by Sebastian Jambor. Follow me on Mastodon @crepels@mastodon.social for updates on new blog posts.