I am trying to create a fully functional clone of my Linux system's hard drive, meaning I could replace the original drive with the clone, boot from it and continue using it. I have used Rescuezilla and Clonezilla, both of which offer a disk-to-disk cloning option. In both cases the cloning procedure completed successfully, but the clone fails to boot and drops me into emergency mode, which makes both clones unusable :( I'm running Rocky Linux 9.2 with KVM hosting a guest OS. I gathered some information from both disks (screenshots: the source disk and the cloned disk). I do not see any difference there, but given I am new to the Linux world I could have missed some important detail. I will be extremely grateful for any help resolving this problem. Thank you everyone! Mike

To answer telcoM:

Hello telcoM, and thank you so much for your explanations. I apologize in advance for my ignorance, because as I mentioned I am totally new to the Linux world. Even though I do not have direct access to the server right now, I can respond to some of your assumptions and clarify some points.

  1. You are right: there are no other OSes on this disk (except for the guest residing in the virtual machine, which is certainly not bootable) and none is planned, so I am not interested in multibooting;
  2. I believe NVRAM is not the reason here, because so far I have only tried replacing the original disk with the clone in the same computer, so the NVRAM is the same;
  3. The source disk at least (as I said, I have no access to the clone right now) already contains /boot/efi/EFI/BOOT/BOOTX64.efi, so I do not need to place it there;
  4. I did not have both disks inserted at the same time (I tried that the first time I ran the cloning, but not since :)) The screenshots were taken in succession: one in rescue mode with the clone disk inserted, and the second after I reinserted the original drive and rebooted the system. Would you recommend anything else to check/do? I greatly appreciate your help as well as anyone else's! Mike
  • Maybe run fdisk -l against both? My theory is that the cloning software chose to partition the new disk MBR instead of GPT. Commented May 13, 2024 at 22:48
  • @davolfman Note that the PARTUUIDs on the cloned disk are real, full-length UUIDs - on a MBR partition table that can't happen: there is simply not enough space in the MBR for per-partition UUIDs. The new disk is definitely GPT-partitioned. Commented May 13, 2024 at 23:05
  • Please don't post pictures of text unless you absolutely have no way to copy and paste. They're almost impossible to read. Commented May 13, 2024 at 23:19
  • Ah, I missed the fact that the boot actually begins but you end up in emergency mode. Then the question is, "what is the error message"? Or if you run journalctl -b in emergency mode, does it indicate what is failing? Commented May 14, 2024 at 6:50
  • Both images indicate that some sort of live/rescue media is present as /dev/sdb1. The rescue mode has successfully mounted all the filesystems listed in your /etc/fstab, so there should be no missing filesystems nor LVM activation issues. I'm afraid your original system disk may have had some unrelated problem waiting to become obvious on reboot, and now the clone has it too. Commented May 14, 2024 at 7:05

2 Answers

(Rewritten based on new information from OP, including the journal log)

The system successfully passes through the initramfs boot phase, and switches to the real root filesystem at 13:00:30:

May 16 13:00:30 DevelDL380 systemd[1]: Starting Switch Root...
░░ Subject: A start job for unit initrd-switch-root.service has begun execution
░░ Defined-By: systemd
░░ Support: https://wiki.rockylinux.org/rocky/support
░░ 
░░ A start job for unit initrd-switch-root.service has begun execution.
░░ 
░░ The job identifier is 96.
May 16 13:00:30 DevelDL380 systemd[1]: Switching root.

The first sign of trouble I can see is at 13:02:05:

May 16 13:02:05 DevelDL380 systemd[1]: dev-mapper-rl00\x2dhome.device: Job dev-mapper-rl00\x2dhome.device/start timed out.
May 16 13:02:05 DevelDL380 systemd[1]: Timed out waiting for device /dev/mapper/rl00-home.
░░ Subject: A start job for unit dev-mapper-rl00\x2dhome.device has failed
░░ Defined-By: systemd
░░ Support: https://wiki.rockylinux.org/rocky/support
░░ 
░░ A start job for unit dev-mapper-rl00\x2dhome.device has finished with a failure.
░░ 
░░ The job identifier is 167 and the job result is timeout.
May 16 13:02:05 DevelDL380 systemd[1]: Dependency failed for /home.
░░ Subject: A start job for unit home.mount has failed
░░ Defined-By: systemd
░░ Support: https://wiki.rockylinux.org/rocky/support
░░ 
░░ A start job for unit home.mount has finished with a failure.
░░ 
░░ The job identifier is 166 and the job result is dependency.

The logical volume device /dev/mapper/rl00-home could not be activated, and as a result, mounting the /home filesystem fails.
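In a long journal, lines like these are easy to miss. As a minimal sketch, assuming the journal is available as plain text, a small filter can list every device that systemd gave up waiting for. The function name timed_out_devices is made up for illustration; the message text it matches is the one shown in the log above.

```shell
# Print every device path that systemd reported as timed out.
# Reads journal text on stdin, for example:
#   journalctl -b --no-pager | timed_out_devices
timed_out_devices() {
    # Keep only the path between "for device " and the trailing period.
    sed -n 's/.*Timed out waiting for device \(.*\)\.$/\1/p'
}
```

Any path it prints is a candidate for a missing disk, partition or logical volume.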

Unless the /home filesystem is listed in /etc/fstab with the nofail mount option (which is not the default), this will cause the system to drop into emergency mode, because a filesystem that is defined as non-optional is missing.
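To check whether an entry already carries nofail, something like the following could be used. This is only a sketch: has_nofail is a hypothetical helper, and it assumes the usual whitespace-separated fstab layout with the mount options in the fourth field.

```shell
# Succeeds (exit status 0) if the entry for mount point $2 in the
# fstab-style file $1 lists "nofail" among its mount options.
has_nofail() {
    awk -v mp="$2" '
        /^[ \t]*#/ { next }                  # skip comment lines
        $2 == mp {
            n = split($4, opts, ",")         # options are comma-separated
            for (i = 1; i <= n; i++)
                if (opts[i] == "nofail") found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}
```

For example, `has_nofail /etc/fstab /home && echo optional` prints "optional" only when /home is marked nofail.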

The messages you listed:

  • x86/CPU: SGX disabled by BIOS - Not a problem. This refers to Intel Software Guard Extensions (SGX), processor extensions that turned out to have some security weaknesses, so most vendors either outright disabled them in firmware or made them optional.
  • i8042: can't read CTR while initializing i8042 - Refers to the classic way of reading the PS/2 keyboard controller. In modern systems, this won't always work and the kernel has other methods available. Not a problem unless the other methods also don't work and you are using a keyboard with a PS/2 connector.
  • integrity: Problem loading X.509 certificate - One of the Secure Boot certificates of the system was either a duplicate or otherwise not useful for the Integrity Subsystem. Not a problem, unless you have configured the Integrity Subsystem to use that particular certificate.
  • You are in emergency mode... - that is expected, as the /home filesystem could not be mounted, as I described above.

Has the /home filesystem perhaps been extended beyond the /dev/sda3 physical volume? (If the lsblk output on the original system indicates the rl00-home logical volume is present on more than one physical disk, then this is the case.)
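Besides lsblk, the lvs command can report which physical volumes back a logical volume. The sketch below assumes the two-column output of `lvs --noheadings -o lv_name,devices`, where each device appears as a path followed by a starting extent in parentheses. The helper name lv_devices is hypothetical.

```shell
# List the distinct physical volumes backing logical volume $1.
# Reads "lv_name devices" lines on stdin, for example:
#   lvs --noheadings -o lv_name,devices rl00 | lv_devices home
lv_devices() {
    awk -v lv="$1" '
        $1 == lv {
            n = split($2, parts, ",")        # striped LVs list several devices
            for (i = 1; i <= n; i++) {
                sub(/\(.*/, "", parts[i])    # drop the "(extent)" suffix
                print parts[i]
            }
        }
    ' | sort -u
}
```

If this prints partitions from more than one disk, the logical volume spans disks and a single-disk clone cannot contain all of it.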

If so, you could perhaps still make use of the clone if you use the emergency mode to comment out the /home filesystem line in /etc/fstab. Then the clone should boot more or less normally, although without the /home filesystem. You could then recreate the /home filesystem and its LVM logical volume on the clone in whatever configuration you need it.
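Commenting out the line can be done in any editor. As a sketch, the hypothetical helper below does the same thing non-interactively and keeps a backup. It assumes the standard fstab layout with the mount point in the second field. In emergency mode the root filesystem may be mounted read-only, in which case `mount -o remount,rw /` is needed first.

```shell
# Comment out the entry for mount point $2 in the fstab-style file $1,
# keeping the previous version next to it as "$1.bak".
comment_out_mount() {
    cp -p "$1" "$1.bak" &&
    awk -v mp="$2" '
        $1 !~ /^#/ && $2 == mp { print "#" $0; next }   # disable this entry
        { print }                                       # keep everything else
    ' "$1.bak" > "$1"
}
```

Usage would be `comment_out_mount /etc/fstab /home`, followed by a reboot.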

In emergency mode on the clone, you could try activating the logical volume manually to see whether it works when retried, or what kind of error messages are produced when you try:

lvchange -ay /dev/mapper/rl00-home

You might also run vgscan to see if it produces any error messages, and then proceed accordingly.
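These manual checks can be collected into one sequence. The following is a sketch rather than a fixed procedure: run and diagnose_lv are hypothetical helper names, and the DRY_RUN switch is an assumption added so the commands can be previewed before running them as root.

```shell
# Run a command, or only print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = volume group, $2 = logical volume (rl00 and home in the journal above)
diagnose_lv() {
    run vgscan                        # rescan for volume groups
    run pvs                           # are all physical volumes present?
    run lvs -a -o +devices "$1"       # which devices back each LV
    run lvchange -ay "$1/$2"          # try to activate the LV
    run ls -l "/dev/mapper/$1-$2"     # did the device node appear?
}
```

`DRY_RUN=1 diagnose_lv rl00 home` only prints the commands; without DRY_RUN they are executed, and any error from pvs or lvchange is the message to look up next.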

If it turns out the LV can be activated successfully with manual commands, that would suggest the clone disk may be unreliable. If it is new, it might be time for a warranty replacement.

  • Hello telcoM, and thank you once again! Since this site does not allow the lengthy comments, I have added them to the original problem description. Best regards, Mike Commented May 14, 2024 at 1:57
  • "find any Linux live boot media that boots in UEFI mode (this bit is vital) and includes the efibootmgr command". I asked an AI entity to answer this question, and got "The System Rescue CD is a lightweight Linux distribution designed for system recovery and maintenance tasks. It supports booting in UEFI mode and includes the efibootmgr utility,". I don't have time to test this assertion myself. Added as an FYI. Cheers Commented May 14, 2024 at 14:04
  • Hi telcoM, unfortunately your advice did not help. I tried booting from almost every file located in /EFI/rocky (there are shim.efi, shimx64.efi, shimx64-rocky.efi, grubx64.efi) - all failed miserably in the same manner. Then I saved the journal and noticed strange messages at 13:02:05. Could that be the reason for the failure? Why could it happen? Thank you once again! Commented May 16, 2024 at 23:42
I will be extremely grateful for any help resolving this problem

Having done actual "cloning" years ago with SLES 11.4, where the boot loader was ELILO rather than GRUB (which made things much easier), I will share the following:

  • It is rather simple to tar -cf image.tar /mountedOSdisk; you would have to attach your operating system disk as a secondary disk to some other running system, because you won't be able to tar a running disk. Then just format a new disk with the same filesystem type (XFS or EXT4, for example) and untar your Linux image onto it.
    • The problem is the boot loader part. With ELILO, which is now obsolete, all you had to do was edit /etc/elilo.conf, make sure it pointed to /dev/sda, and have just that one disk in the system to ensure it would be sda.
    • When GRUB2 became mainstream, that simplicity was gone.
  • Given that it is rather easy to install RHEL/Rocky Linux from ISO, doing so basically guarantees a clean working installation; just document and understand the install options.
    • Manage your applications, configuration, and data separately from the Linux operating system; simply reinstall and reconfigure them after reinstalling the operating system.
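The tar approach can be sketched as a pair of helpers. The names archive_tree and restore_tree and the TAR_EXTRA_FLAGS variable are made up for illustration. On a system with SELinux, such as Rocky Linux, the GNU tar options --xattrs, --acls and --selinux would probably be wanted in TAR_EXTRA_FLAGS so that labels survive; that part is an assumption and not tested here. The boot loader still has to be set up separately.

```shell
# Archive the tree under $1 (a mounted, non-running OS disk) into file $2.
# Paths are stored relative to $1 so they can be restored anywhere.
archive_tree() {
    # TAR_EXTRA_FLAGS is intentionally unquoted so several flags can be passed.
    tar ${TAR_EXTRA_FLAGS:-} -cpf "$2" -C "$1" .
}

# Unpack archive $1 into the freshly formatted, mounted target $2.
restore_tree() {
    tar ${TAR_EXTRA_FLAGS:-} -xpf "$1" -C "$2"
}
```

For instance, `archive_tree /mountedOSdisk image.tar` on the helper system, then `restore_tree image.tar /mnt/newdisk` once the new disk is formatted and mounted (both mount points are placeholders).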

When defining the disk in the boot loader, you can refer to it by kernel device name, by UUID or by label, to name a few options. See what's under /dev/disk/. Using kernel names is dangerous because which disk is sda, sdb, sdc and so on can change with the number of disks in the system or with how they are connected. At best you can reasonably guarantee a disk will be sda if it is the only disk [block device] that will be recognized; otherwise you have to do the work of getting the UUID syntax right and having GRUB use that for reliability. This is often the problem with what you are asking. I didn't even fully read your post to try to understand what might actually be happening; I've been down that road, and I've since backed up and turned down a different road, which is to reinstall from ISO and let the installer manage the GRUB setup, and I have been very happy.

Keeping your applications, configuration, and data separate is a reliable way of recovery, versus the risk of corruption on a cloned disk. Whenever I tried to use a cloned installation, however old, I ended up updating the applications anyway because a new version was available. And many other things change so often that reusing a "cloned" Linux install after a few months is pointless (in my opinion). I don't know how many times, with RHEL 7 from 7.2 to 7.9, I made a "clone" that never got used because I was jumping to a newer 7.x version.
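To see where a configuration still depends on kernel device names, a simple search can help. This is a rough sketch: find_kernel_names is a hypothetical helper and its pattern only covers sdX, vdX and NVMe namespace names, so other naming schemes would need to be added.

```shell
# Print, with line numbers, every line of file $1 that refers to a disk
# by kernel device name (for example /dev/sdb1) instead of UUID= or LABEL=.
find_kernel_names() {
    grep -nE '/dev/(sd[a-z]+|vd[a-z]+|nvme[0-9]+n[0-9]+)' "$1"
}
```

Running it against /etc/fstab or a boot loader configuration file shows the entries worth converting; the matching UUIDs can be read from the symlinks in /dev/disk/by-uuid.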

  • Hi ron, thank you for your detailed message and sharing your experience. However, consider a situation when you have to create a system for a customer - and this system contains a dozen (or so) almost identical nodes. By cloning I can install one, clone it for each and every other computer, and then slightly (in fact mostly automatically) tweak them - versus installing each and every of them from scratch. I am not even talking about having a customer's cloned disk in house for any type of more or less complicated debugging. That is why the cloning is of extreme importance to me. Commented May 15, 2024 at 20:49
  • Hi everyone, the things got much worse: my original (the only working) disk died on me. Thus, I am left with only the clone, unusable so far. When I tried to boot from it again (failing again), I saved the journal. Could anyone help me finding out what's wrong based on it? I suspect it has something to do with the messages at 13:02:05 about failing to map /home... what could cause it? Commented May 16, 2024 at 23:24
