About once a month I find that one of my Red Hat 9 servers has rebooted (it is actually AlmaLinux 9, but since that is a clone of RHEL 9 this question is probably better solved in the context of RHEL 9). I'm trying to find out what is causing the crash, but no core dump files are being created!

I have followed the instructions in this post (except that nothing related to apport seems to be installed on my system), but when I trigger a core dump with:

sleep 3 & kill -SEGV $!

there is no core dump file!

I confirmed the basics are set with:

[root@myhost ~]# cat /proc/sys/kernel/core_pattern
|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h
[root@myhost ~]# ulimit -c
unlimited

Is there something else I must set to let the dump file be created? I suspect my own app (non-packaged) is causing the problem...but there is no core file even in the directory holding the app.

====UPDATE====

I modified /etc/systemd/coredump.conf and set Storage=external (everything else commented out), then rebooted and ran the following:

[root@myhost ~]# sleep 3 & kill -SEGV $!
[1] 3583
[root@myhost ~]#
[1]+  Segmentation fault      (core dumped) sleep 3
[root@myhost ~]#  coredumpctl --all
TIME                         PID UID GID SIG     COREFILE EXE           SIZE
Sat 2024-10-26 12:56:46 EDT 3583   0   0 SIGSEGV none     /usr/bin/bash    -
[root@myhost ~]# ll /var/lib/systemd/coredump/
total 0

So still no core dump files visible (and notice the "none" above). The system log shows:

Oct 26 13:06:41 ngcvls1 systemd[1]: Started Process Core Dump (PID 4459/UID 0).
Oct 26 13:06:41 ngcvls1 systemd-coredump[4460]: Resource limits disable core dumping for process 4458 (bash).
Oct 26 13:06:41 ngcvls1 systemd-coredump[4460]: Process 4458 (bash) of user 0 dumped core.

So from the command line I ran:

ulimit -c unlimited

and repeated the segfault test, and this time a core file was created! But after a reboot it was gone, despite my having Storage=external set in coredump.conf. I need core dumps to survive reboots, otherwise I can't tell why my system crashed. Getting closer! I would like to make ulimit -c unlimited permanent, but I'm not sure where to put it (I don't like the advice in other posts to put it in .bashrc).

  • Are you using coredumpctl to list the core dumps? Commented Oct 25, 2024 at 16:44
  • I was not aware of this tool, but I tried to list core dumps using coredumpctl and it did not find any. Should I be doing more with this tool? Commented Oct 25, 2024 at 20:18
  • There is a DefaultLimitCORE= in /etc/systemd/system.conf, see man systemd-system.conf. Commented Oct 26, 2024 at 17:32
  • I already have that set to infinity... Commented Oct 26, 2024 at 19:56
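The "Resource limits disable core dumping" message in the question's log refers to the core-size limit of the crashing process itself, which can differ from DefaultLimitCORE= or from what ulimit -c reports in another shell. A minimal sketch for inspecting that limit, assuming a Linux /proc filesystem; the function name core_limits is made up for illustration:

```shell
#!/bin/sh
# Print the soft and hard "Max core file size" limits found in a limits file.
# With no argument it reads /proc/self/limits, i.e. the limits that awk
# inherited from this shell.
core_limits() {
    awk '/^Max core file size/ { print "soft=" $5 " hard=" $6 }' "${1:-/proc/self/limits}"
}

# Show the limits inherited by this shell's children, when /proc is available.
if [ -r /proc/self/limits ]; then
    core_limits
fi
```

To look at a running process, pass its limits file, for example core_limits /proc/4458/limits. A soft value of 0 would be consistent with the situation in the log above, where the crash is recorded but no core file is kept.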

1 Answer


I provided some info in another answer, but the coredumpctl command with no args should list any known core dumps. This info is kept in the systemd journal. If you are deleting or not keeping the journal you won't have this information.

Systemd keeps its core files under /var/lib/systemd/coredump/, even if the journal has been cleared, I think.

To stop systemd taking over core dumping, you can do

ln -s /dev/null /etc/sysctl.d/50-coredump.conf
sysctl -w kernel.core_pattern=core

The first line overrides the configuration in /usr/lib/sysctl.d/50-coredump.conf for future boots. The second line changes the setting immediately.
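To check which of the two behaviours is currently active, the value of kernel.core_pattern can be classified by its first character: a leading | means the kernel pipes the dump to a helper such as systemd-coredump, anything else is a file name template. A small sketch of that check; the function name core_pattern_kind is hypothetical:

```shell
#!/bin/sh
# Classify a kernel.core_pattern value: a leading "|" means the kernel pipes
# the dump to a helper program; anything else is a file name template.
core_pattern_kind() {
    case "$1" in
        '|'*)
            helper=${1#\|}          # strip the leading pipe character
            echo "piped to helper: $helper"
            ;;
        *)
            echo "written to file: $1"
            ;;
    esac
}

# Report the live setting when /proc is available (Linux only).
if [ -r /proc/sys/kernel/core_pattern ]; then
    core_pattern_kind "$(cat /proc/sys/kernel/core_pattern)"
fi
```

After the sysctl -w command above, this should report a plain file template rather than a helper.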


There are other settings that can affect whether core dumps are made by systemd, in files

/etc/systemd/coredump.conf
/etc/systemd/coredump.conf.d/*.conf
/run/systemd/coredump.conf.d/*.conf
/usr/lib/systemd/coredump.conf.d/*.conf

See man coredump.conf. The file may show the default values as comments. The entry Storage=external means the core file will be stored in the directory /var/lib/systemd/coredump/; with Storage=journal it is kept within the journal log files, and with Storage=none the crash is logged but no core is stored.

If the process is bigger than ProcessSizeMax= the dump will be logged but no core dump taken. Similarly, no core dump is saved if ExternalSizeMax= or JournalSizeMax= is exceeded, depending on your storage choice. Core files can also disappear afterwards: older dumps are removed once externally stored dumps take up more than MaxUse= (by default 10% of the disk), or when less than KeepFree= (by default 15%) would remain free. My Fedora 38 has

[Coredump]
#Storage=external
#Compress=yes
# On 32-bit, the default is 1G instead of 32G.
#ProcessSizeMax=32G
#ExternalSizeMax=32G
#JournalSizeMax=767M
#MaxUse=
#KeepFree=

For RHEL 9 see debugging applications.
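Rather than editing the main file, the settings can go in a drop-in under one of the coredump.conf.d directories listed above. A sketch under stated assumptions: the file name 10-keep-cores.conf and the CONF_DIR variable are invented for this example, and CONF_DIR defaults to a scratch directory so the snippet can be tried without root.

```shell
#!/bin/sh
# Write a coredump.conf drop-in. CONF_DIR defaults to a scratch directory;
# on a real system it would be /etc/systemd/coredump.conf.d (needs root).
CONF_DIR="${CONF_DIR:-$(mktemp -d)}"
mkdir -p "$CONF_DIR"

# Only the keys being changed need to appear; the rest keep their defaults.
cat > "$CONF_DIR/10-keep-cores.conf" <<'EOF'
[Coredump]
Storage=external
Compress=yes
EOF

echo "wrote $CONF_DIR/10-keep-cores.conf"
```

Run it as CONF_DIR=/etc/systemd/coredump.conf.d sh script.sh to write the real drop-in. Note that keys are case-sensitive (Storage=, not storage=).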

  • The /var/lib/systemd/coredump/ has no files, my OS does not have a /etc/systemctl.d/ directory. I don't think this works for RH9 family Commented Oct 25, 2024 at 23:53
  • The /etc/sysctl.d/ directory does not have to exist if no one wants to override the corresponding files in /usr/lib/sysctl.d/. Is RH9 the same as RHEL9? doc Commented Oct 26, 2024 at 9:41
  • Getting very close based on your post - I updated my original question with more information. Only problem left is the core file disappears on reboot. Commented Oct 26, 2024 at 17:19
  • /usr/lib/tmpfiles.d/systemd.conf tells systemd-tmpfiles to delete any core dump files in /var/lib/systemd/coredump/ older than 3 days, and systemd-tmp.conf will tell it to further remove some files from that directory on every boot. To override, copy the systemd.conf and systemd-tmp.conf from /usr/lib/tmpfiles.d/ to /etc/tmpfiles.d/, then remove or comment out the lines referring to /var/lib/systemd/coredump in the copies. The originals will be replaced each time the systemd RPM is updated, so only overriding the config files via /etc/tmpfiles.d makes the change persist. Commented Oct 26, 2024 at 18:59
  • Wow, this is more complicated than I expected (my general feeling about systemd). I think I'll close this as solved and investigate the tmpfiles.d separately. Commented Oct 26, 2024 at 20:10
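The tmpfiles.d override described in the comment above could be sketched roughly as follows. This is an illustration, not a tested procedure for RHEL 9: the function name override_tmpfiles is hypothetical, and the sample entry only imitates what such a file might contain, so check the real files before relying on it.

```shell
#!/bin/sh
# Copy a tmpfiles.d file, commenting out every active line that mentions
# /var/lib/systemd/coredump. Lines that are already comments are left alone.
override_tmpfiles() {
    sed '/^[^#].*\/var\/lib\/systemd\/coredump/ s/^/#/' "$1" > "$2"
}

# Demonstration on a scratch copy. The sample lines are assumptions made for
# this example, not quotes from any particular systemd version.
work=$(mktemp -d)
printf '%s\n' \
    'd /var/lib/systemd/coredump 0755 root root 3d' \
    'd /var/log/journal 2755 root systemd-journal - -' > "$work/systemd.conf"

override_tmpfiles "$work/systemd.conf" "$work/override.conf"
cat "$work/override.conf"
```

On a real system the call would be override_tmpfiles /usr/lib/tmpfiles.d/systemd.conf /etc/tmpfiles.d/systemd.conf as root, and likewise for systemd-tmp.conf, so that the copies in /etc/tmpfiles.d/ take precedence over the packaged originals.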
