1,609 questions
1 vote · 1 answer · 96 views
Best practices for SLURM job pipeline with wrapper scripts - avoiding complex job ID extraction
I'm building a SLURM pipeline where each stage is a bash wrapper script that generates and submits SLURM jobs. Currently I'm doing complex job ID extraction, which feels clunky:
# Current approach
...
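A common, cleaner pattern for this (a minimal sketch; the stage script names are placeholders) is `sbatch --parsable`, which prints only the job ID, combined with `--dependency` to chain stages:

```bash
#!/bin/bash
set -euo pipefail

# --parsable makes sbatch print just the job ID (plus ";cluster" on
# multi-cluster setups), so no regex extraction is needed.
jid1=$(sbatch --parsable stage1.sh)

# Chain the next stage on successful completion of the first.
jid2=$(sbatch --parsable --dependency=afterok:"$jid1" stage2.sh)

echo "submitted stage1=$jid1 stage2=$jid2"
```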
1 vote · 0 answers · 39 views
How to run the Neo4j Docker container using Singularity on HPC without it shutting down during data import?
I'm trying to run the Neo4j Docker container using Singularity on an HPC system. The container starts successfully, but it shuts down automatically when I try to add data to the database (e.g., via ...
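One thing worth checking (a hedged sketch; the bind paths, image tag, and instance name below are assumptions): Singularity containers are read-only by default, so Neo4j can die the moment it tries to write. Bind writable host directories over the image's data paths and run it as an instance:

```bash
# Writable host dirs for Neo4j's state (paths are assumptions).
mkdir -p "$HOME/neo4j/data" "$HOME/neo4j/logs"

# Start Neo4j as a background instance; the official image writes
# to /data and /logs, which must be writable on the host side.
singularity instance start \
  --bind "$HOME/neo4j/data":/data \
  --bind "$HOME/neo4j/logs":/logs \
  docker://neo4j:5 neo4j_inst

# Follow the instance's log to see why it shuts down, if it still does.
singularity exec instance://neo4j_inst tail -f /logs/neo4j.log
```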
1 vote · 1 answer · 56 views
Debugging a parallel Python program in interruptible sleep
I have an mpi4py program, which runs well with mpiexec -np 30 python3 -O myscript.py at 100% CPU usage on each of the 30 CPUs.
Now I am launching 8 instances with mpiexec -np 16 python3 -O myscript.py. ...
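To see where ranks sitting in interruptible sleep (S state) are blocked, py-spy can snapshot a running Python process without stopping it (a sketch; the install method and process-matching pattern are assumptions):

```bash
# Install py-spy into the user environment (assumption: pip is available).
pip install --user py-spy

# Dump the current Python stack of every matching rank; sleeping ranks
# typically show a blocking MPI call or a lock wait. Attaching may
# require ptrace permission (same-user processes are usually fine).
for pid in $(pgrep -f "python3 -O myscript.py"); do
    echo "=== PID $pid ==="
    py-spy dump --pid "$pid"
done
```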
1 vote · 0 answers · 73 views
Slurm: salloc gets allocated then fails immediately with ExitCode=1:0 (Start=End same second), while equivalent sbatch works
I’ve been using salloc to allocate compute nodes without issues before. Recently, after switching to another user account (same .bashrc config, only the conda path changed), salloc stopped working. I ...
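A quick way to narrow this down (a diagnostic sketch; `<jobid>` and the time limit are placeholders): an allocation that starts and ends in the same second with ExitCode=1:0 usually means the launched shell itself failed, so compare the accounting record and exercise the shell startup files by hand:

```bash
# Inspect the failed allocation's accounting record.
sacct -j <jobid> --format=JobID,State,ExitCode,Elapsed,NodeList

# Reproduce what salloc launches: a fresh login shell. If .bashrc or
# the conda hook exits non-zero here, salloc will fail the same way.
bash -lc 'echo startup ok'

# Bypass the startup files entirely to isolate them as the cause.
salloc -N1 -t 10 /bin/bash --norc --noprofile
```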
0 votes · 0 answers · 41 views
PostgreSQL, PostGIS, QGIS in a container launched from Charliecloud
I need to migrate my work for geospatial processing (using mainly QGIS processing and PostGIS functions from Python scripts) to an HPC cluster. As neither QGIS nor PostGIS is installed on the HPC I ...
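For what it's worth, a rootless Charliecloud workflow along these lines might look like the sketch below (the image name, Dockerfile contents, and bind paths are assumptions):

```bash
# Build an image containing PostgreSQL/PostGIS and QGIS from a
# Dockerfile (unprivileged; ch-image is Charliecloud's builder).
ch-image build -t geostack -f Dockerfile .

# Unpack it to a directory tree that ch-run can execute.
ch-convert geostack /var/tmp/geostack

# Run PostgreSQL inside the container with a writable, bind-mounted
# data directory on the cluster filesystem (path is an assumption).
ch-run --bind "$HOME/pgdata":/var/lib/postgresql \
    /var/tmp/geostack -- pg_ctl -D /var/lib/postgresql start
```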
0 votes · 1 answer · 142 views
Spack `spack load` not setting LD_LIBRARY_PATH or CPATH environment variables as expected
I'm using Spack on Linux Mint to manage scientific libraries, including Armadillo. I have installed Armadillo and its dependencies via Spack in an environment.
Problem:
When I run spack load armadillo, ...
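As a stopgap, the install prefix can be queried and exported by hand (a sketch; whether `spack load` should set these variables at all depends on the Spack version and its modules configuration):

```bash
# Ask Spack where Armadillo landed, then export the paths that
# spack load did not set (variable names are the conventional ones).
ARMA_PREFIX=$(spack location -i armadillo)
export LD_LIBRARY_PATH="$ARMA_PREFIX/lib:${LD_LIBRARY_PATH:-}"
export CPATH="$ARMA_PREFIX/include:${CPATH:-}"
```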
0 votes · 0 answers · 50 views
slurmstepd: error: execve(): mkdir: No such file or directory
I tried to use the sbatch file from this link (Running WindNinja on an HPC Cluster) to run the WindNinja software (WindNinja introduction) installed on HPC. However, it always produces the "...
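Two common causes of this exact message (a hedged checklist, not a definitive diagnosis) are DOS line endings in the batch script, which make slurmstepd try to exec `mkdir\r`, and a PATH that is empty or different on the compute node:

```bash
# "file" reports "CRLF line terminators" for scripts edited on Windows.
file job.sbatch
sed -i 's/\r$//' job.sbatch   # strip carriage returns in place

# Use absolute paths inside the script so PATH differences on the
# compute node cannot break the exec ($SCRATCH is an assumption).
srun /usr/bin/mkdir -p "$SCRATCH/windninja_run"
```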
0 votes · 0 answers · 63 views
How to force Slurm to pack GPU jobs onto partially occupied nodes to free full nodes?
When users request 1-2 GPUs via sbatch --gres=gpu:1, Slurm locks the entire 8-GPU node. This fragments our cluster:
Multiple small requests spread across nodes (e.g., four 1-GPU jobs occupy four ...
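The knobs that usually matter here (a hedged sketch of slurm.conf fragments; exact behavior depends on the Slurm version, so treat these as starting points) are per-GPU consumable scheduling and not forcing whole-node allocations:

```
# slurm.conf fragments (a sketch; assumes Slurm >= 19.05).
# cons_tres tracks GPUs as consumable resources, so a 1-GPU request
# no longer reserves the whole node the way select/linear or an
# OverSubscribe=EXCLUSIVE partition does.
SelectType=select/cons_tres
SelectTypeParameters=CR_Core_Memory
GresTypes=gpu

# An EXCLUSIVE partition hands out whole nodes; NO still lets
# distinct jobs share a node under cons_tres.
PartitionName=gpu Nodes=gpunode[01-08] OverSubscribe=NO
```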
0 votes · 1 answer · 48 views
How to use mkl_dcsrgemv or other oneAPI functions to calculate the scalar product between a large-dimension sparse matrix and a vector?
I program in Fortran with the Intel oneAPI compiler ifx and the MKL packages.
I want to calculate the scalar product between a large-dimension sparse matrix and a vector.
When the dim of the sparse matrix could be ...
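On the build side (a sketch; the source file name is a placeholder), ifx can pull in MKL's sparse BLAS, including the deprecated mkl_dcsrgemv and its inspector-executor replacements, with the -qmkl flag:

```bash
# Compile and link against MKL with the oneAPI Fortran compiler.
# -qmkl=sequential links the single-threaded MKL; use -qmkl=parallel
# for the threaded one.
ifx sparse_matvec.f90 -qmkl=sequential -o sparse_matvec
```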
0 votes · 1 answer · 76 views
How can I run snakemake jobs 'remotely'?
I love snakemake and have used it locally as well as on HPC with SLURM!
However, now we have a particular setup where it is not as easy to use snakemake as before:
We need to run some ...
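One pattern that may fit this kind of setup (a sketch; the resource numbers are assumptions, and it presumes snakemake >= 8 with the slurm executor plugin installed) is to submit snakemake itself as a long-running job and let it dispatch rule jobs from there:

```bash
#!/bin/bash
#SBATCH --job-name=smk-driver
#SBATCH --time=48:00:00
#SBATCH --cpus-per-task=1

# The driver job runs snakemake, which in turn submits one SLURM job
# per rule via the slurm executor plugin.
snakemake --executor slurm --jobs 100
```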
0 votes · 0 answers · 45 views
Sample UCP AM client failing with error "Destination is unreachable" for localhost
I'm learning UCX by creating a basic wrapper for both the client and server. I am using AM communication. When I run my client, I get the error below:
[1749297901.816001] [prateek:19822:0] ...
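"Destination is unreachable" often just means no UCX transport matched between the two endpoints. Before touching the AM code, it may help to list which transports UCX detected and to pin it to TCP over loopback (a sketch; the server and client binary names are placeholders):

```bash
# List the transports and devices UCX detected on this machine.
ucx_info -d

# Force plain TCP and verbose logging for a localhost test; if this
# works, the failure is transport selection, not the AM wrapper code.
UCX_TLS=tcp UCX_LOG_LEVEL=debug ./server &
UCX_TLS=tcp UCX_LOG_LEVEL=debug ./client 127.0.0.1
```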
0 votes · 0 answers · 85 views
Can I use MPI_File_read_all to read non contiguous datatypes directly (as opposed to setview)?
I'm trying to read different subsets of non-contiguous data from a file to different processes.
I.e.:
I have a file with the data:
a b c d e f g h i j
and two processes who want to read the data from ...
1 vote · 2 answers · 88 views
What is the difference between an MPI nonblocking collective write (iwrite_all) and a "nonblocking" noncollective iwrite combined with a file sync?
I'm setting up IO for a large-scale CFD code using the MPI library, and the file IO is starting to eat into computation time as my problems scale.
As far as I can find, the "done" thing in the ...
0 votes · 0 answers · 44 views
Slurm partitions on same node overallocating CPUs
I have a single computation node with 32 CPUs. I have defined two different partitions that both use this node. If, for example, I send two jobs on partition A requesting 20 CPUs and 25 CPUs, the second ...
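For this symptom it is worth checking whether the node's CPUs are tracked as consumable resources at all; with select/linear, two partitions on one node can each count the node's CPUs separately and double-book it. A hedged slurm.conf sketch (node and partition names are placeholders):

```
# slurm.conf fragments (a sketch, not a definitive fix).
# Consumable-core tracking makes the 20- and 25-CPU requests compete
# for the same 32 cores instead of being counted per partition.
SelectType=select/cons_tres
SelectTypeParameters=CR_Core

NodeName=compute01 CPUs=32 State=UNKNOWN
PartitionName=A Nodes=compute01 OverSubscribe=NO Default=YES
PartitionName=B Nodes=compute01 OverSubscribe=NO
```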
0 votes · 1 answer · 59 views
Snakemake access snakemake.config in profile config.yaml file
I want to run a pipeline on a cluster where the names of the jobs are of the form: smk-{config["simulation"]}-{rule}-{wildcards}. Can I just do:
snakemake --profile slurm --configfile ...
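Profiles are typically parsed before the workflow config, so `{config[...]}` is generally not available inside the profile's config.yaml. A workaround sketch (assuming a pre-v8 snakemake whose --cluster command supports the {rule} and {wildcards} placeholders; the simulation name is passed explicitly):

```bash
SIM=mysim   # placeholder standing in for config["simulation"]

# Bake the config value into the job name on the command line instead
# of referencing it from the profile.
snakemake --configfile config.yaml --config simulation="$SIM" \
  --cluster "sbatch --job-name=smk-${SIM}-{rule}-{wildcards}"
```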