HPC deployment
AmorphGen is designed for deployment on GPU-enabled HPC clusters via SLURM.
SLURM job script
#!/bin/bash
#SBATCH --job-name=amorphgen
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --gpus-per-task=1
#SBATCH --time=24:00:00
#SBATCH --account=your-account
module load CUDA/11.8.0
conda activate /path/to/your/env
amorphgen POSCAR --model mace-mpa-0 --device cuda
Resuming timed-out jobs
The --resume flag enables checkpoint detection in both pipeline and batch-quench modes. AmorphGen scans the work directory for completed stage outputs and skips the corresponding stages, so a job that hit its time limit can be resubmitted with the same command.
Pipeline mode
amorphgen POSCAR \
    --stages 1 4 5 6 7 \
    --config my_config.yaml \
    --work-dir my_run/ \
    --resume
If stages 1 and 4 are already complete, AmorphGen picks up from stage 5 using the stage4_eq.xyz checkpoint. If all stages are done, it exits immediately.
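The skip logic described above can be pictured with a small sketch. It is not AmorphGen's actual implementation: the only checkpoint name the text confirms is stage4_eq.xyz, so the `stage{n}_eq.xyz` pattern for other stages and the helper names `completed_stages` and `stages_to_run` are assumptions made for illustration. The sketch treats a stage as complete when its assumed output file exists and is non-empty, and restarts from the first requested stage that is not complete.

```python
from pathlib import Path


def completed_stages(work_dir, stages, marker="stage{n}_eq.xyz"):
    """Return the requested stages whose output file exists and is non-empty.

    The marker pattern is an assumption generalised from 'stage4_eq.xyz';
    the real per-stage file names may differ.
    """
    work_dir = Path(work_dir)
    done = []
    for n in stages:
        checkpoint = work_dir / marker.format(n=n)
        if checkpoint.is_file() and checkpoint.stat().st_size > 0:
            done.append(n)
    return done


def stages_to_run(work_dir, stages):
    """Skip the leading run of completed stages; return the rest in order.

    Returns an empty list when every requested stage is already complete.
    """
    done = set(completed_stages(work_dir, stages))
    for i, n in enumerate(stages):
        if n not in done:
            return list(stages[i:])
    return []
```

With checkpoints for stages 1 and 4 present in my_run/, `stages_to_run("my_run", [1, 4, 5, 6, 7])` returns `[5, 6, 7]`, which matches the behaviour described above. Requiring a non-empty file is a deliberate choice in this sketch: a job killed at the wall-time limit can leave a zero-byte output behind, and that should not count as a finished stage.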
Batch quench mode
amorphgen --batch-quench \
    --snapshot-dir snapshots/ \
    --model mace-mpa-0 \
    --device cuda \
    --resume
This skips already-completed structures and continues from where the previous job left off.
Python API
from amorphgen import MeltQuenchPipeline

pipe = MeltQuenchPipeline(
    input_file="POSCAR",
    work_dir="my_run",
    cfg_override={"model": "mace-mpa-0", "device": "cuda"},
)
# resume=True skips stages whose outputs already exist in work_dir
atoms = pipe.run(stages=[1, 4, 5, 6, 7], resume=True)
Array jobs for batch processing
To run many structures in parallel (e.g. 100 AIRSS structures), use a SLURM array job:
#!/bin/bash
#SBATCH --job-name=MQ_batch
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=4
#SBATCH --mem=32G
#SBATCH --time=12:00:00
#SBATCH --array=1-100
SAMPLE=${SLURM_ARRAY_TASK_ID}
amorphgen "inputs/sample-${SAMPLE}.xyz" \
    --stages 1 4 5 6 7 \
    --config config.yaml \
    --work-dir "results/sample_${SAMPLE}" \
    --resume
Each array task requests its own GPU. The --resume flag makes resubmission safe: completed samples are skipped automatically, so the same script can be resubmitted unchanged after a timeout.
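Resubmitting the full array is safe, but each already-finished task still waits in the queue before exiting. One possible refinement is to resubmit only the unfinished indices. The sketch below is an illustration rather than part of AmorphGen: it assumes that a sample is finished when a file named `stage7_eq.xyz` exists in its `results/sample_<N>` directory (a hypothetical marker, extrapolated from the stage4_eq.xyz name), and the helper names `pending_samples` and `to_array_spec` are made up here. The output uses the comma-and-range syntax that SLURM accepts for `--array`.

```python
from pathlib import Path


def pending_samples(results_dir, sample_ids, final_marker="stage7_eq.xyz"):
    """Return the sample IDs whose final output file is missing.

    final_marker is an assumed file name; adjust it to whatever the last
    requested stage actually writes.
    """
    results_dir = Path(results_dir)
    return [
        i for i in sample_ids
        if not (results_dir / f"sample_{i}" / final_marker).is_file()
    ]


def to_array_spec(ids):
    """Compress integers into SLURM --array syntax (ranges and commas)."""
    parts = []
    start = prev = None
    for i in sorted(set(ids)):
        if start is None:
            start = prev = i
        elif i == prev + 1:
            prev = i  # extend the current consecutive run
        else:
            parts.append(f"{start}-{prev}" if prev > start else str(start))
            start = prev = i
    if start is not None:
        parts.append(f"{start}-{prev}" if prev > start else str(start))
    return ",".join(parts)


if __name__ == "__main__":
    # Print the spec for samples 1..100 so it can be passed to sbatch.
    print(to_array_spec(pending_samples("results", range(1, 101))))
```

Saved as, say, `pending.py` (a hypothetical file name), the printed spec can be passed on the command line, for example `sbatch --array="$(python pending.py)" job.sh`, since options given to sbatch on the command line take precedence over the `#SBATCH --array` directive in the script. If every sample is finished the spec is empty, so check for that before submitting.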