mysql.service was killed by the OOM killer. While investigating the root cause, I wanted to change the unit configuration so that the service restarts when it is killed. I was surprised to find

Restart=on-abort

already in the default unit configuration file. Reading https://www.freedesktop.org/software/systemd/man/latest/systemd.service.html#Restart= I understand that systemd should restart the service when it is killed by the OOM killer. Testing on a non-production server with kill -9 <pid> shows the expected behavior: the service is restarted automatically.

A plain systemctl restart mysql.service worked, so nothing should be preventing systemd from bringing the service back up. So why was my service not restarted?
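
One way to narrow this down is to ask systemd how it classified the last exit. The sketch below is an illustration, not a confirmed diagnosis. It assumes the unit is mariadb.service, and the helper names `last_exit_summary` and `restarts_on_abort` are made up for this example. The mapping in `restarts_on_abort` is an assumption about systemd's behavior that should be verified on the affected system: as far as I can tell, a kill that systemd attributes to the kernel OOM killer is recorded with the result `oom-kill` rather than `signal`, while a manual kill -9 is recorded as `signal`.

```shell
#!/bin/sh
# Print how systemd recorded the last exit of a unit.
# Restart, Result, ExecMainCode, ExecMainStatus and NRestarts are
# properties that "systemctl show" reports for service units.
last_exit_summary() {
    unit="${1:-mariadb.service}"
    systemctl show "$unit" -p Restart -p Result \
        -p ExecMainCode -p ExecMainStatus -p NRestarts
}

# ASSUMPTION (verify locally): Restart=on-abort reacts only to the
# results "signal" and "core-dump", so a result of "oom-kill" would
# not trigger a restart under that policy.
restarts_on_abort() {
    case "$1" in
        signal|core-dump) return 0 ;;
        *) return 1 ;;
    esac
}

# Typical use after an incident:
#   last_exit_summary mariadb.service
#   result=$(systemctl show mariadb.service -p Result --value)
#   restarts_on_abort "$result" && echo "on-abort would restart" \
#                               || echo "on-abort would NOT restart"
```

If `Result` turns out to be `oom-kill` after the incident but `signal` after the manual kill -9 test, that difference would account for the two behaviors described above.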

Edit to answer the requests in the comments

  • The system is Debian 12
  • mysql is just a symlink to mariadb; the server itself is MariaDB.
  • The unit file is the default delivered by the maintainers:
# It's not recommended to modify this file in-place, because it will be
# overwritten during package upgrades.  If you want to customize, the
# best way is to create a file "/etc/systemd/system/mariadb.service",
# containing
#   .include /usr/lib/systemd/system/mariadb.service
#   ...make your changes here...
# or create a file "/etc/systemd/system/mariadb.service.d/foo.conf",
# which doesn't need to include ".include" call and which will be parsed
# after the file mariadb.service itself is parsed.
#
# For more info about custom unit files, see systemd.unit(5) or
# https://mariadb.com/kb/en/mariadb/systemd/
#
# Copyright notice:
#
# This file is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or
# (at your option) any later version.

[Unit]
Description=MariaDB 10.11.6 database server
Documentation=man:mariadbd(8)
Documentation=https://mariadb.com/kb/en/library/systemd/
After=network.target

[Install]
WantedBy=multi-user.target


[Service]

##############################################################################
## Core requirements
##

Type=notify

# Setting this to true can break replication and the Type=notify settings
# See also bind-address mariadbd option.
PrivateNetwork=false

##############################################################################
## Package maintainers
##

User=mysql
Group=mysql

# CAP_IPC_LOCK To allow memlock to be used as non-root user
# CAP_DAC_OVERRIDE To allow auth_pam_tool (which is SUID root) to read /etc/shadow when it's chmod 0
#   does nothing for non-root, not needed if /etc/shadow is u+r
# CAP_AUDIT_WRITE auth_pam_tool needs it on Debian for whatever reason
CapabilityBoundingSet=CAP_IPC_LOCK CAP_DAC_OVERRIDE CAP_AUDIT_WRITE

# PrivateDevices=true implies NoNewPrivileges=true and
# SUID auth_pam_tool suddenly doesn't do setuid anymore
PrivateDevices=false

# Prevent writes to /usr, /boot, and /etc
ProtectSystem=full



# Doesn't yet work properly with SELinux enabled
# NoNewPrivileges=true

# Prevent accessing /home, /root and /run/user
ProtectHome=true

# Execute pre and post scripts as root, otherwise it does it as User=
PermissionsStartOnly=true

ExecStartPre=/usr/bin/install -m 755 -o mysql -g root -d /var/run/mysqld

# Perform automatic wsrep recovery. When server is started without wsrep,
# galera_recovery simply returns an empty string. In any case, however,
# the script is not expected to return with a non-zero status.
# It is always safe to unset _WSREP_START_POSITION environment variable.
# Do not panic if galera_recovery script is not available. (MDEV-10538)
ExecStartPre=/bin/sh -c "systemctl unset-environment _WSREP_START_POSITION"
ExecStartPre=/bin/sh -c "[ ! -e /usr/bin/galera_recovery ] && VAR= || \
 VAR=`cd /usr/bin/..; /usr/bin/galera_recovery`; [ $? -eq 0 ] \
 && systemctl set-environment _WSREP_START_POSITION=$VAR || exit 1"

# Needed to create system tables etc.
# ExecStartPre=/usr/bin/mysql_install_db -u mysql

# Start main service
# MYSQLD_OPTS here is for users to set in /etc/systemd/system/mariadb.service.d/MY_SPECIAL.conf
# Use the [Service] section and Environment="MYSQLD_OPTS=...".
# This isn't a replacement for my.cnf.
# _WSREP_NEW_CLUSTER is for the exclusive use of the script galera_new_cluster
ExecStart=/usr/sbin/mariadbd $MYSQLD_OPTS $_WSREP_NEW_CLUSTER $_WSREP_START_POSITION

# Unset _WSREP_START_POSITION environment variable.
ExecStartPost=/bin/sh -c "systemctl unset-environment _WSREP_START_POSITION"

ExecStartPost=/etc/mysql/debian-start

KillSignal=SIGTERM

# Don't want to see an automated SIGKILL ever
SendSIGKILL=no

# Restart crashed server only, on-failure would also restart, for example, when
# my.cnf contains unknown option
Restart=on-abort
RestartSec=5s

UMask=007

##############################################################################
## USERs can override
##
##
## by creating a file in /etc/systemd/system/mariadb.service.d/MY_SPECIAL.conf
## and adding/setting the following under [Service] will override this file's
## settings.

# Useful options not previously available in [mysqld_safe]

# Kernels like killing mariadbd when out of memory because its big.
# Lets temper that preference a little.
# OOMScoreAdjust=-600

# Explicitly start with high IO priority
# BlockIOWeight=1000

# If you don't use the /tmp directory for SELECT ... OUTFILE and
# LOAD DATA INFILE you can enable PrivateTmp=true for a little more security.
PrivateTmp=false

# Set an explicit Start and Stop timeout of 900 seconds (15 minutes!)
# this is the same value as used in SysV init scripts in the past
# Galera might need a longer timeout, check the KB if you want to change this:
# https://mariadb.com/kb/en/library/systemd/#configuring-the-systemd-service-timeout
TimeoutStartSec=900
TimeoutStopSec=900

##
## Options previously available to be set via [mysqld_safe]
## that now needs to be set by systemd config files as mysqld_safe
## isn't executed.
##

# Number of files limit. previously [mysqld_safe] open-files-limit
LimitNOFILE=32768
# For liburing and io_uring_setup()
LimitMEMLOCK=524288
# Maximium core size. previously [mysqld_safe] core-file-size
# LimitCore=

# Nice priority. previously [mysqld_safe] nice
# Nice=-5

# Timezone. previously [mysqld_safe] timezone
# Environment="TZ=UTC"

# Library substitutions. previously [mysqld_safe] malloc-lib with explicit paths
# (in LD_LIBRARY_PATH) and library name (in LD_PRELOAD).
# Environment="LD_LIBRARY_PATH=/path1 /path2" "LD_PRELOAD=

# Flush caches. previously [mysqld_safe] flush-caches=1
# ExecStartPre=sync
# ExecStartPre=sysctl -q -w vm.drop_caches=3

# numa-interleave=1 equalivant
# Change ExecStart=numactl --interleave=all /usr/sbin/mariadbd......

# crash-script equalivent
# FailureAction=

  • Is it mysql.service or mariadb.service? What operating system and release? Add the rest of the service file to the question. Commented Jun 3, 2024 at 23:48
  • Not an expert here, but the OOM killer sends either SIGTERM (15) or SIGKILL (9) depending on "something", so it probably got SIGTERM. Commented Jun 4, 2024 at 9:11
  • @Lutz You are probably right. kernel.org/doc/gorman/html/understand/understand016.html says "If the process has CAP_SYS_RAWIO capabilities, a SIGTERM is sent to give the process a chance of exiting cleanly, otherwise a SIGKILL is sent." But getcaps <PID_of_mysql> is empty. So the question is why the OOM killer is sending SIGTERM rather than SIGKILL. Commented Jun 4, 2024 at 12:41
  • You could change Restart to on-failure; then it should restart on either signal. Commented Jun 6, 2024 at 9:07
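
The on-failure suggestion in the last comment could be applied through a drop-in, as the header of the unit file recommends, rather than by editing the packaged file. This is a minimal sketch: the function name `write_restart_dropin` and the file name restart.conf are hypothetical, and the directory defaults to the drop-in path named in the unit file's own comments. Note the caveat from the packaged unit: on-failure also restarts the server when, for example, my.cnf contains an unknown option.

```shell
#!/bin/sh
# Write a drop-in that switches the restart policy to on-failure.
# "restart.conf" is a hypothetical name; any *.conf file in the
# drop-in directory is read after the main unit file.
write_restart_dropin() {
    dir="${1:-/etc/systemd/system/mariadb.service.d}"
    mkdir -p "$dir" || return 1
    cat > "$dir/restart.conf" <<'EOF'
[Service]
Restart=on-failure
RestartSec=5s
EOF
}

# Typical use (as root), followed by a reload so systemd picks it up:
#   write_restart_dropin
#   systemctl daemon-reload
#   systemctl show mariadb.service -p Restart
```

Passing a different directory as the first argument makes it possible to inspect the generated file before installing it.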

2 Answers

Googling this exact issue led me here, and also to where the change in MariaDB apparently happened: https://jira.mariadb.org/browse/MDEV-11869

While implementing systemd packaging part for MariaDB 10.1 in Debian and while comparing to MySQL 5.7 in Debian/Ubuntu, I noticed that the mysql.service file recommends against using Restart=on-failure and instead suggests on-abort:

Gotta say, it ain't ideal when the OOM killer takes down a production server and systemd won't automatically restart it.

  • While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - From Review Commented Jul 31, 2024 at 11:25
  • Being in a situation where OOM-killer is forced to take out the production server db, only for it to immediately restart and possibly reproduce the sequence of events that led to the OOM kill in the first place is also far from ideal. I'm not necessarily saying that I agree with the reasoning that if mariadb gets OOMkilled it shouldn't come back up until an actual sysadmin has taken a look and identified the cause, but I do see their point. Commented Apr 24 at 7:18
  • In my experience, MariaDB is usually the culprit, or more specifically the default OS malloc implementation. Databases like MariaDB fragment memory a lot, which with the default allocator on EL and Debian VMs can make MariaDB appear to use far more memory than is configured, triggering an OOM kill. (Switching to a more aggressive malloc such as tcmalloc or jemalloc helps a lot with this and is well documented, but it is not the default.) When systemd restarts MariaDB after such a fragmentation-induced OOM kill, all the assigned memory is released and MariaDB is back within a second or two. This is better than having no database. Commented Apr 25 at 12:16

It was also reported as MDEV-36009, meaning that releases 10.11.12, 11.4.6, 11.8.2 and later (coming sometime soon) will contain a fix for this.
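
To check whether a given server version is at or past the fixed release in its series, a small comparison like the following may help. It is only a sketch: the function names `version_ge` and `has_restart_fix` are hypothetical, the release numbers are the ones quoted in this answer, the version string has to be supplied by hand (for example from the Description= line of the unit file), and it relies on `sort -V` from GNU coreutils.

```shell
#!/bin/sh
# version_ge A B: succeed if version A >= version B (GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# has_restart_fix VERSION
#   returns 0 if VERSION is at or above the fixed release of its series,
#   returns 1 if it is below it,
#   returns 2 if the series is not one of those listed in this answer.
has_restart_fix() {
    case "$1" in
        10.11.*) version_ge "$1" 10.11.12 ;;
        11.4.*)  version_ge "$1" 11.4.6 ;;
        11.8.*)  version_ge "$1" 11.8.2 ;;
        *)       return 2 ;;
    esac
}

# Example:
#   has_restart_fix 10.11.6 && echo "fix included" || echo "not included or unknown"
```

For the MariaDB 10.11.6 shown in the question's unit file, `has_restart_fix 10.11.6` returns 1, so that package predates the fix and a local override would still be needed.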
