Looking at the systemctl output you've provided, I can tell the following:

1- As for slurmd, the hardware configuration you defined in slurm.conf does not match what the node actually has. What are the hardware specifications of the node this configuration will run on?

 (Mar 05 05:57:17 thoma-Lenovo-Legion-5-15IMH05H slurmd[6514]: slurmd: error: Node configuration differs from hardware: CPUs=12:12(hw) Boards=1:1(hw) SocketsPerBoard=12:1(hw) CoresPerSocket=1:6(hw) ThreadsPerCore>) 

Each field in that message reads as configured:detected, with the detected value marked (hw). According to this output, your values for SocketsPerBoard and CoresPerSocket should be 1 and 6 respectively.
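If it helps to see that reading spelled out, here is a minimal Python sketch that pulls the mismatched fields out of the error line. It assumes the left number is the value from slurm.conf and the right number, marked (hw), is what slurmd detected; the function name `mismatches` is just something I made up for illustration, and the truncated ThreadsPerCore field is skipped because its values are cut off in the log.

```python
import re

# Error line copied from the slurmd log above; the last field is truncated in the source.
LOG = (
    "slurmd: error: Node configuration differs from hardware: "
    "CPUs=12:12(hw) Boards=1:1(hw) SocketsPerBoard=12:1(hw) "
    "CoresPerSocket=1:6(hw) ThreadsPerCore>"
)

# Assumption: each complete field has the form Name=<configured>:<detected>(hw).
FIELD = re.compile(r"(\w+)=(\d+):(\d+)\(hw\)")


def mismatches(line):
    """Return {name: (configured, detected)} for fields whose two values differ."""
    return {
        name: (int(conf), int(hw))
        for name, conf, hw in FIELD.findall(line)
        if conf != hw
    }


if __name__ == "__main__":
    for name, (conf, hw) in mismatches(LOG).items():
        print(f"{name}: slurm.conf has {conf}, hardware reports {hw}")
```

Only the fields it prints need changing in slurm.conf; the ones where both numbers agree (CPUs and Boards here) are already fine.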

2- Regarding slurmctld, the initial node state should be UNKNOWN, like this:

 NodeName=localhost CPUs=12 RealMemory=30517 State=UNKNOWN
 PartitionName=localhost Nodes=ALL Default=YES MaxTime=INFINITE State=UP

NOTE: I see you have set RealMemory to "8000". Try "8192" instead, as Slurm expects this value in MiB :)

Try changing these, then restart both slurmd and slurmctld, and let me know if that helps.

Cheers!
