qimpy.rc

Run configuration / hardware resources. This covers the CPU cores or GPU, and the MPI communicators, used by the current QimPy instance. At import time, the configuration selects a single CPU core for each MPI process in mpi4py.MPI.COMM_WORLD.

Call init to select the number of cores or a GPU device, as available and based on environment variables including SLURM_CPUS_PER_TASK and CUDA_VISIBLE_DEVICES.
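As a rough illustration of how a core count could be derived from the environment, the sketch below reads SLURM_CPUS_PER_TASK and falls back to a single core, matching the import-time default described above. The helper name cores_from_env is hypothetical and not part of QimPy; the logic inside init may well differ.

```python
import os
from typing import Mapping, Optional


def cores_from_env(env: Optional[Mapping[str, str]] = None) -> int:
    """Hypothetical helper: choose a core count from SLURM_CPUS_PER_TASK.

    Returns 1 when the variable is unset, malformed, or not positive.
    """
    env = os.environ if env is None else env
    raw = env.get("SLURM_CPUS_PER_TASK", "")
    try:
        n_cores = int(raw)
    except ValueError:
        return 1  # unset or malformed: keep the single-core default
    return n_cores if n_cores > 0 else 1


if __name__ == "__main__":
    print("Cores selected:", cores_from_env())
```

A value obtained this way could then be passed to torch.set_num_threads; that pairing is an assumption made here for illustration.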

Note that init must be called before any torch CUDA calls, so that a single CUDA context is associated with this process. Otherwise, on multi-GPU systems, any subsequent CUDA-aware MPI communication will fail. To mitigate this where possible, this module uses SLURM_LOCALID or OMPI_COMM_WORLD_LOCAL_RANK to pick a specific GPU, and alters CUDA_VISIBLE_DEVICES accordingly, before any torch or MPI calls.
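The GPU-pinning step can be sketched as follows. This is a minimal illustration under stated assumptions, not QimPy's actual code: the function name pin_gpu_from_local_rank is hypothetical, and the wrap-around policy (local rank modulo the number of visible devices) is an assumed choice. To have any effect, such a step has to run before torch or MPI touch CUDA.

```python
import os
from typing import MutableMapping, Optional


def pin_gpu_from_local_rank(
    env: Optional[MutableMapping[str, str]] = None,
) -> Optional[str]:
    """Hypothetical helper: restrict CUDA_VISIBLE_DEVICES to one GPU.

    Returns the chosen device entry, or None if nothing was changed.
    """
    env = os.environ if env is None else env
    # Node-local rank as exported by SLURM, else by Open MPI.
    local_rank = env.get("SLURM_LOCALID", env.get("OMPI_COMM_WORLD_LOCAL_RANK"))
    visible = env.get("CUDA_VISIBLE_DEVICES")
    if local_rank is None or not visible:
        return None  # not enough information: leave the environment untouched
    devices = [d.strip() for d in visible.split(",") if d.strip()]
    if not devices:
        return None
    # Assumed policy: wrap around when there are more local ranks than GPUs.
    chosen = devices[int(local_rank) % len(devices)]
    env["CUDA_VISIBLE_DEVICES"] = chosen
    return chosen


if __name__ == "__main__":
    print("Pinned GPU:", pin_gpu_from_local_rank())
```

Because the process then sees only one device, any later CUDA initialization can create a context on that GPU alone, which is the property the note above relies on.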

Module Attributes

comm

Global communicator for QimPy

i_proc

Rank within comm

n_procs

Size of comm

is_head

Whether this process is the head of comm

cpu

CPU torch device

device

Preferred torch device for calculation (CPU / GPU)

use_cuda

Whether device is a CUDA GPU

compute_stream

Asynchronous CUDA compute stream
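To show how the communicator-related attributes above relate to one another, the sketch below derives rank, size, and a head flag from any object exposing the mpi4py-style Get_rank and Get_size methods. The names CommInfo, describe_comm, and SingleProcessComm are hypothetical, and treating rank 0 as the head is an assumption, not something this page states.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class CommInfo:
    """Hypothetical container mirroring i_proc, n_procs and is_head."""

    i_proc: int  # rank within the communicator
    n_procs: int  # size of the communicator
    is_head: bool  # assumed here to mean rank 0


def describe_comm(comm: Any) -> CommInfo:
    """Collect rank, size and head flag from an mpi4py-style communicator."""
    i_proc = comm.Get_rank()
    n_procs = comm.Get_size()
    return CommInfo(i_proc=i_proc, n_procs=n_procs, is_head=(i_proc == 0))


class SingleProcessComm:
    """Stand-in communicator so the sketch runs without MPI installed."""

    def Get_rank(self) -> int:
        return 0

    def Get_size(self) -> int:
        return 1


if __name__ == "__main__":
    print(describe_comm(SingleProcessComm()))
```

With mpi4py installed, passing mpi4py.MPI.COMM_WORLD in place of the stand-in should work the same way, since it provides both methods.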

Functions

compute_stream_wait_current

Make compute_stream (if used) wait on current stream.

current_stream_wait_compute

Make current stream wait on compute_stream (if used).

current_stream_synchronize

Wait for all tasks in current CUDA stream to complete.

clock

Time in seconds since start of this run.

report_end

Report end time and duration.
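The two timing helpers can be pictured with a small stand-alone sketch. The names clock_sketch and report_end_sketch are hypothetical, chosen to avoid suggesting this is the library's implementation. The use of time.monotonic for elapsed time and the message format are also assumptions.

```python
import time

# Reference point, recorded once when this module is first loaded.
_t_start = time.monotonic()


def clock_sketch() -> float:
    """Seconds elapsed since this module was loaded."""
    return time.monotonic() - _t_start


def report_end_sketch() -> str:
    """Print and return a line with the wall-clock end time and run duration."""
    duration = clock_sketch()
    message = f"End time: {time.ctime()} (Duration: {duration:.2f} s)"
    print(message)
    return message


if __name__ == "__main__":
    time.sleep(0.1)  # stand-in for real work
    report_end_sketch()
```

A monotonic clock is used for the duration so that system clock adjustments during a long run do not distort it, while time.ctime supplies the human-readable end time.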