qimpy.rc
Run configuration / hardware resources. This covers the CPU cores or GPU, and the MPI communicators, to be used by the current QimPy instance. The import-time configuration selects a single CPU core for each MPI process in mpi4py.MPI.COMM_WORLD.
Call init to select the number of cores or a GPU device, depending on what is available and on environment variables including SLURM_CPUS_PER_TASK and CUDA_VISIBLE_DEVICES.
Note that init must be called before any torch CUDA calls, so that only a single CUDA context is associated with this process. Otherwise, on multi-GPU systems, any subsequent CUDA MPI calls will fail. To mitigate this issue whenever possible, this module uses SLURM_LOCALID or OMPI_COMM_WORLD_LOCAL_RANK to pick a specific GPU and alters CUDA_VISIBLE_DEVICES accordingly before any torch or MPI calls.
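The GPU selection described above can be illustrated with a minimal sketch. This is not QimPy's actual implementation: the helper name pick_visible_device, the round-robin mapping of local rank to device, and the choice to do nothing when CUDA_VISIBLE_DEVICES is unset are all assumptions made for illustration. Only the environment variable names come from the text above.

```python
import os
from typing import Mapping, Optional

# Environment variables holding the node-local rank (names taken from the text above).
LOCAL_RANK_VARS = ("SLURM_LOCALID", "OMPI_COMM_WORLD_LOCAL_RANK")


def pick_visible_device(env: Mapping[str, str]) -> Optional[str]:
    """Hypothetical helper: return the one device ID this process should see.

    Returns None when no local rank is set or no device list is available,
    in which case the environment should be left unchanged.
    """
    local_rank = next((int(env[v]) for v in LOCAL_RANK_VARS if v in env), None)
    devices = [d for d in env.get("CUDA_VISIBLE_DEVICES", "").split(",") if d]
    if local_rank is None or not devices:
        return None
    # Assumption: processes on a node are assigned to devices round-robin.
    return devices[local_rank % len(devices)]


# Restrict visibility before torch or mpi4py are imported,
# so that this process only ever creates a context on one GPU.
choice = pick_visible_device(os.environ)
if choice is not None:
    os.environ["CUDA_VISIBLE_DEVICES"] = choice
```

The key point is ordering: the environment variable must be changed before the first import that could touch CUDA, because the list of visible devices is read when the CUDA runtime initializes.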
Module Attributes
Global communicator for QimPy
Rank within comm
Size of comm
Whether head of comm
CPU torch device
Preferred torch device for calculation (CPU / GPU)
Whether device is a CUDA GPU
Asynchronous CUDA compute stream
Functions
Make compute_stream (if used) wait on current stream.
Make current stream wait on compute_stream (if used).
Wait for all tasks in current CUDA stream to complete.
Time in seconds since start of this run.
Report end time and duration.
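As a rough illustration of the two timing functions, the sketch below tracks elapsed time from a recorded start and formats an end-of-run report. The class name RunClock, its method names, and the message format are hypothetical and are not taken from QimPy.

```python
import time


class RunClock:
    """Hypothetical stand-in for run timing; not QimPy's API."""

    def __init__(self) -> None:
        # Monotonic counter, so elapsed time is unaffected by wall-clock changes.
        self._t_start = time.perf_counter()

    def elapsed(self) -> float:
        """Time in seconds since this clock was created."""
        return time.perf_counter() - self._t_start

    def report_end(self) -> str:
        """Print and return a message with the end time and run duration."""
        end_time = time.strftime("%Y-%m-%d %H:%M:%S")
        message = f"End time: {end_time} (Duration: {self.elapsed():.2f} s)"
        print(message)
        return message


clock = RunClock()
clock.report_end()
```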