Systemd with multiple instances and restart on fail

Question

I have used systemd to set up a template unit with multiple instances, so that the same program runs several times with different parameters. Each instance should be independent of the others: if one fails, it should be restarted while the others are left untouched.

This is my target unit:

[Unit]
Description=Cutter
After=FD-go-00_tree.service
After=FD-go-01_pre.service
Requires=[email protected] [email protected] [email protected] [email protected]

[Install]
WantedBy=FD-go-00_tree.service

This is my service unit:

[Unit]
Description="FD-cutter # %i - instance"
After=FD-go-00_tree.service
After=FD-go-01_pre.service
PartOf=FD-go-05_cutter.target
ConditionPathExists=/home/himarc/projects/multi-service/EnvironmentFile/FD-go-05_cutter_%i

# StartLimitIntervalSec in recent systemd versions
StartLimitInterval=0

[Service]
Type=simple
EnvironmentFile=/home/himarc/projects/multi-service/EnvironmentFile/FD-go-05_cutter_%i
ExecStart=/usr/bin/nice -n -1 /home/himarc/projects/bin/FD-cutter ${MyInput_1} ${MyInput_2} ${MyPath} %i
StandardOutput=file:/srv/FD/%i/trace/FD-log-cutter.log
StandardError=file:/srv/FD/%i/trace/FD-log-cutter.log
Restart=always

# time to sleep before restarting a service
RestartSec=1

[Install]
WantedBy=FD-go-00_tree.service

When one instance fails, systemd restarts not just that service but the entire target unit, and with it all the other instances.

Aug 25 11:15:06 localhost kernel: [2493693.364584] FD-cutter[21251]: segfault at 4c8 ip 000055d7ee0d9e28 sp 00007f312186caf0 error 6 in FD-cutter[55d7ee0d2000+1a000]
Aug 25 11:15:06 localhost kernel: [2493693.364591] Code: f8 ff ff 48 8d 15 08 22 21 00 48 8d 35 d1 26 21 00 48 8b 05 32 25 21 00 48 8d 3d 2b 25 21 00 48 c7 05 e8 21 21 00 00 00 00 00 <48> 89 88 c8 04 00 00 48 89 90 d0 04 00 00 48 89 e9 31 d2 e8 e0 dd
Aug 25 11:15:06 localhost systemd[1]: [email protected]: Main process exited, code=killed, status=11/SEGV
Aug 25 11:15:06 localhost systemd[1]: [email protected]: Failed with result 'signal'.
Aug 25 11:15:08 localhost systemd[1]: [email protected]: Service hold-off time over, scheduling restart.
Aug 25 11:15:08 localhost systemd[1]: [email protected]: Scheduled restart job, restart counter is at 1.
Aug 25 11:15:08 localhost systemd[1]: Stopped target Cutter.
Aug 25 11:15:08 localhost systemd[1]: Stopping Cutter.
Aug 25 11:15:08 localhost systemd[1]: Stopping "FD-cutter # RC1P112 - instance"...
Aug 25 11:15:08 localhost systemd[1]: Stopping "FD-cutter # RC1P111 - instance"...
Aug 25 11:15:08 localhost systemd[1]: Stopping "FD-cutter # RC1P212 - instance"...
Aug 25 11:15:08 localhost systemd[1]: Stopped "FD-cutter # RC1P211 - instance".
Aug 25 11:15:08 localhost systemd[1]: Started "FD-cutter # RC1P211 - instance".
Aug 25 11:15:08 localhost systemd[1]: Stopped "FD-cutter # RC1P112 - instance".
Aug 25 11:15:08 localhost systemd[1]: Stopped "FD-cutter # RC1P111 - instance".
Aug 25 11:15:08 localhost systemd[1]: Stopped "FD-cutter # RC1P212 - instance".
Aug 25 11:15:08 localhost systemd[1]: Started "FD-cutter # RC1P212 - instance".
Aug 25 11:15:08 localhost systemd[1]: Started "FD-cutter # RC1P111 - instance".

Is there a way to restart only the failed service unit, without restarting the others?

Stewart · Accepted Answer · 2020-08-26 20:15:09Z

You added a Requires= relationship between the target and the instances. This is a pretty strong relationship. According to systemd.unit(5), this means:

Requires=

Similar to Wants=, but declares a stronger dependency. Dependencies of this type may also be configured by adding a symlink to a .requires/ directory accompanying the unit file.

If this unit gets activated, the units listed will be activated as well. If one of the other units fails to activate, and an ordering dependency After= on the failing unit is set, this unit will not be started. Besides, with or without specifying After=, this unit will be stopped if one of the other units is explicitly stopped.

Often, it is a better choice to use Wants= instead of Requires= in order to achieve a system that is more robust when dealing with failing services.

So, if you change the Requires= to Wants=, then starting the target will start the templated services, but a failure of the templated service will not affect the target.
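
As a rough sketch, the target unit with that change might look like the following. The instance names are taken from the log in the question. The template name FD-go-05_cutter@ is an assumption, since the real unit names are not legible in the question, so substitute your own.

```ini
# FD-go-05_cutter.target (sketch)
# NOTE: "FD-go-05_cutter@" is an assumed template name; use the real one.
[Unit]
Description=Cutter
After=FD-go-00_tree.service
After=FD-go-01_pre.service
# Wants= still pulls in the instances when the target starts, but the
# stop or restart of one instance no longer propagates to the target.
Wants=FD-go-05_cutter@RC1P111.service FD-go-05_cutter@RC1P112.service FD-go-05_cutter@RC1P211.service FD-go-05_cutter@RC1P212.service

[Install]
WantedBy=FD-go-00_tree.service
```

The PartOf= line in the service unit can stay as it is: it only propagates a stop or restart of the target down to the instances, so stopping the target still stops all of them.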
