I have a service that runs a slow, buggy one-shot program once a week, and the program sometimes hangs for no apparent reason. I want the service to be restarted if a run takes longer than 4 hours. I tried to achieve that with the following unit definition:
[Unit]
Description=Some buggy software
[Service]
WorkingDirectory=/home/buggy
ExecStart=/home/buggy/run
Environment=NODE_VERSION=14
Restart=on-failure
RestartSec=4 h
StartLimitBurst=4
StartLimitInterval=1 s
And a timer:
[Unit]
Description=Some buggy software schedule timer
[Timer]
Unit=buggy.service
OnCalendar=Sat *-*-* 22:00:00
[Install]
WantedBy=timers.target
But it can still hang indefinitely: I checked on the service two days after it ran and it was still hung.
What did I do wrong? Is there even a way to do what I need with systemd?
fork() into the background, but blocks until completed? Type=oneshot, and/or probably use TimeoutSec=
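A minimal sketch of what the comment above seems to suggest, assuming the program stays in the foreground until it finishes. In the original unit, RestartSec= only sets the delay before a restart once the process has already exited; it does not limit how long a run may take, which would explain the indefinite hang. The start-limit values and the 30-second restart delay below are illustrative assumptions, not taken from the question, and Restart= on a Type=oneshot unit is only accepted by newer systemd releases.

```ini
[Unit]
Description=Some buggy software
# Assumed values: stop retrying after 4 starts within one day
StartLimitIntervalSec=1d
StartLimitBurst=4

[Service]
# The program runs to completion in the foreground
Type=oneshot
WorkingDirectory=/home/buggy
ExecStart=/home/buggy/run
Environment=NODE_VERSION=14
# Terminate the run if it has not finished after 4 hours
TimeoutStartSec=4h
# A timeout counts as a failure, so this schedules a restart
Restart=on-failure
# Assumed short delay before the retry
RestartSec=30s
```

If the unit is kept as the default Type=simple instead, RuntimeMaxSec=4h is the analogous setting, since TimeoutStartSec= only covers the start-up phase and a simple service is considered started as soon as the process is launched.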