YAML configuration
AmorphGen accepts a YAML config file via --config <file> (CLI) or cfg_override=load_yaml_config(...) (Python). YAML is recommended for any non-trivial workflow because it:
Keeps simulation parameters in version control alongside the code that produced them
Makes runs reproducible — one file describes the entire protocol
Reads cleanly compared to a long CLI flag chain
Supports comments to document why each parameter is set
Configuration precedence
CLI flags > YAML config > DEFAULT_CONFIG
Anything passed on the CLI overrides the YAML; YAML overrides defaults. So you can keep a baseline YAML and tweak one parameter at the command line:
amorphgen POSCAR --config full_pipeline.yaml --device cuda --eq-high-T 4000
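The precedence rule can be pictured as a recursive dictionary merge applied from lowest to highest priority. The sketch below is illustrative only and is not AmorphGen's actual implementation: the contents of DEFAULT_CONFIG, the deep_merge helper, and the mapping of --eq-high-T onto the T key of eq_high are all assumptions.

```python
from copy import deepcopy

# Hypothetical stand-in for AmorphGen's DEFAULT_CONFIG; the real defaults may differ.
DEFAULT_CONFIG = {
    "device": "cpu",
    "eq_high": {"ensemble": "NVT", "T": 3000, "steps": 50000},
}


def deep_merge(base, override):
    """Return a copy of `base` with `override` applied recursively."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: merge key by key instead of replacing the block.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Values as they might arrive from a YAML file and from CLI flags (assumed mapping).
yaml_cfg = {"device": "cpu", "eq_high": {"steps": 20000}}
cli_cfg = {"device": "cuda", "eq_high": {"T": 4000}}

# Lowest priority first: defaults, then YAML, then CLI.
effective = deep_merge(deep_merge(DEFAULT_CONFIG, yaml_cfg), cli_cfg)
print(effective)
```

In this sketch the CLI wins for device and T, the YAML wins for steps, and ensemble is inherited from the defaults.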
Structure
A YAML config is a nested dictionary. Top-level keys map to calculator settings and stage names; stage values are themselves dictionaries:
model: mace-mpa-0
device: cuda
default_dtype: float64
opt:
  fmax: 0.05
  optimizer: LBFGS
  cell_filter: FrechetCellFilter
eq_premelt:
  ensemble: NVT
  T: 300
  steps: 5000
  timestep: 0.5
melt:
  ensemble: NPT
  T_start: 300
  T_end: 3000
  rate: 100 # K/ps
Only the keys you want to override need to be present — anything you omit falls back to the default.
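For readers working from Python, the structure example corresponds to the nested dict below. This is a sketch of the shape of the data, not of how AmorphGen parses it. Separating calculator settings from stage blocks by value type is an illustrative heuristic only, since a block such as classical_params is also a dictionary.

```python
# The structure example written out as the nested dict a YAML parser would return.
cfg = {
    "model": "mace-mpa-0",
    "device": "cuda",
    "default_dtype": "float64",
    "opt": {"fmax": 0.05, "optimizer": "LBFGS", "cell_filter": "FrechetCellFilter"},
    "eq_premelt": {"ensemble": "NVT", "T": 300, "steps": 5000, "timestep": 0.5},
    "melt": {"ensemble": "NPT", "T_start": 300, "T_end": 3000, "rate": 100},
}

# Heuristic for this example: scalar values are calculator settings,
# dict values are stage blocks.
calculator = {k: v for k, v in cfg.items() if not isinstance(v, dict)}
stages = {k: v for k, v in cfg.items() if isinstance(v, dict)}

print(sorted(calculator))
print(list(stages))
```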
Example: full melt-quench pipeline
# examples/full_pipeline.yaml
model: mace-mpa-0
device: cuda
default_dtype: float64
opt:
  fmax: 0.05
  max_steps: 500
  optimizer: LBFGS
  cell_filter: none
eq_premelt:
  ensemble: NVT
  T: 300
  steps: 10000 # 5 ps at 0.5 fs
  timestep: 0.5
melt:
  ensemble: NPT
  T_start: 300
  T_end: 3000
  T_step: 100
  rate: 100 # K/ps
  timestep: 0.5
eq_high:
  ensemble: NVT
  T: 3000
  steps: 50000 # 25 ps at 0.5 fs
  timestep: 0.5
quench:
  ensemble: NVT
  T_start: 3000
  T_end: 300
  rate: 100 # K/ps
  timestep: 0.5
eq_low:
  ensemble: NVT
  T: 300
  steps: 10000 # 5 ps at 0.5 fs
  timestep: 0.5
Run with:
amorphgen POSCAR --config full_pipeline.yaml -o my_run/
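When choosing steps, timestep and rate, it helps to check how much simulated time each stage covers. The helpers below are a hypothetical sketch and not part of AmorphGen. They assume timestep is given in fs and rate in K/ps, and that a ramp lasts |T_end - T_start| / rate; T_step is ignored.

```python
def fixed_stage_ps(steps, timestep_fs):
    """Simulated time in ps for a fixed-temperature stage."""
    return steps * timestep_fs / 1000.0


def ramp_stage(T_start, T_end, rate_K_per_ps, timestep_fs):
    """Duration in ps and MD step count for a linear temperature ramp."""
    duration_ps = abs(T_end - T_start) / rate_K_per_ps
    steps = round(duration_ps * 1000.0 / timestep_fs)
    return duration_ps, steps


# eq_premelt: 10000 steps at a 0.5 fs timestep
print(fixed_stage_ps(10000, 0.5))

# melt: 300 K to 3000 K at 100 K/ps with a 0.5 fs timestep
print(ramp_stage(300, 3000, 100, 0.5))
```

Under these assumptions, 10000 steps at 0.5 fs cover 5 ps, and the 300 K to 3000 K ramp at 100 K/ps takes 27 ps, or 54000 steps.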
Example: hybrid (random + quench)
Skip stages 1–3 (the starting structure is already disordered), anneal at high T, quench, then equilibrate at low T and optimise:
# examples/hybrid_airss_mq.yaml
model: chgnet
device: cuda
default_dtype: float64
eq_high:
  ensemble: NVT
  T: 3000
  steps: 20000 # 10 ps anneal at 0.5 fs
  timestep: 0.5
  friction: 0.01
quench:
  ensemble: NVT
  T_start: 3000
  T_end: 300
  rate: 100 # K/ps
  timestep: 0.5
  friction: 0.01
eq_low:
  ensemble: NVT
  T: 300
  steps: 5000
  timestep: 0.5
  friction: 0.01
opt:
  fmax: 0.05
  optimizer: LBFGS
  cell_filter: cubic
Run via batch-quench:
amorphgen --batch-quench --snapshot-dir random_inputs/ \
--config hybrid_airss_mq.yaml --batch-stages 4 5 6 7 \
-o hybrid_runs/
Example: validation reference YAML
For --analyse --reference, write a structured reference of expected literature ranges. This adds a match/concern/fail validation table to the analysis output.
# examples/reference_a_Ga2O3.yaml
system: a-Ga2O3
references:
  - "Kaewmeechai, Strand & Shluger, Phys. Rev. B 111, 035203 (2025)"
  - "Stehlik et al., J. Non-Cryst. Solids 458 (2017) 14"
density:
  expected: [4.70, 5.10] # g/cm^3
  units: "g/cm^3"
bond_distances:
  Ga-O:
    expected: [1.85, 1.95] # Angstrom
    units: "A"
coordination:
  Ga-O:
    mean_expected: [4.0, 4.8]
  O-Ga:
    mean_expected: [2.7, 3.0]
bond_angles:
  Ga-O-Ga:
    expected: [110.0, 130.0]
    units: "deg"
  O-Ga-O:
    expected: [100.0, 115.0]
    units: "deg"
Run with:
amorphgen --analyse --input-dir my_structures/ \
--cutoff auto-rdf \
--reference reference_a_Ga2O3.yaml
Each metric is reported as match (within range), concern (within ~5% of either bound), or fail (outside range). This gives a fast, defensible answer to “do my structures agree with the literature?”
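The three-way classification can be sketched as follows. This is one plausible reading, not AmorphGen's actual logic. It assumes that "concern" means the value lies outside the expected range but within a 5% relative margin of the nearest bound, and that both bounds are positive. The classify function and its tolerance parameter are hypothetical names.

```python
def classify(value, expected, tolerance=0.05):
    """Label a measured value against an expected [low, high] range."""
    low, high = expected
    if low <= value <= high:
        return "match"
    # Assumed reading of "within ~5% of either bound": just outside the range.
    if low * (1 - tolerance) <= value <= high * (1 + tolerance):
        return "concern"
    return "fail"


density_range = [4.70, 5.10]  # g/cm^3, from reference_a_Ga2O3.yaml
print(classify(4.85, density_range))  # inside the range
print(classify(4.60, density_range))  # slightly below the lower bound
print(classify(4.00, density_range))  # far below the lower bound
```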
Example: classical potential
Buckingham + Coulomb for SiO₂:
model: buckingham
device: cpu
classical_params:
  params:
    Si-O: {A: 18003.76, rho: 0.2052, C: 133.54}
    O-O: {A: 1388.77, rho: 0.3623, C: 175.0}
  charges: {Si: 2.4, O: -1.2}
  cutoff: 10.0
  alpha: 0.2 # Wolf-summation damping (1/A)
  coulomb: true
opt:
  fmax: 0.05
  optimizer: FIRE
  cell_filter: none
Loading YAML in Python
from amorphgen.configs import load_yaml_config
from amorphgen import MeltQuenchPipeline
cfg = load_yaml_config("full_pipeline.yaml")
pipe = MeltQuenchPipeline("POSCAR", cfg_override=cfg)
pipe.run()
load_yaml_config() validates the YAML and emits warnings for unknown keys.
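An unknown-key check of this kind catches typos such as eq_hgih before a long run starts. The sketch below shows the general idea using the standard warnings module. It is not the body of load_yaml_config(): the key list is assembled from the examples on this page and the function name is hypothetical.

```python
import warnings

# Hypothetical key list taken from the examples on this page; the real schema may be larger.
KNOWN_TOP_LEVEL_KEYS = {
    "model", "device", "default_dtype", "classical_params",
    "opt", "final_opt", "eq_premelt", "melt", "eq_high", "quench", "eq_low",
}


def warn_unknown_keys(cfg):
    """Warn about top-level keys that are not recognised and return them sorted."""
    unknown = sorted(set(cfg) - KNOWN_TOP_LEVEL_KEYS)
    for key in unknown:
        warnings.warn(f"Unknown config key: {key!r}", UserWarning, stacklevel=2)
    return unknown


cfg = {"model": "chgnet", "eq_hgih": {"T": 3000}}  # note the misspelled stage name
print(warn_unknown_keys(cfg))
```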
Stage-1 vs stage-7 optimisation: the final_opt fallback
Both Stage 1 (initial crystal opt) and Stage 7 (final amorphous opt) use the structure-optimiser code. By default, both read from the same opt: block. If you want them to differ (e.g. a tighter fmax for the final amorphous structure, or FrechetCellFilter for full cell relaxation while Stage 1 keeps the cell fixed), add a separate final_opt: block:
# Stage 1: initial crystal opt
opt:
  fmax: 0.05
  max_steps: 200
  optimizer: LBFGS
  cell_filter: none # fixed cell for the crystal
# Stage 7: final amorphous opt (overrides only the keys you specify)
final_opt:
  fmax: 0.01 # tighter convergence
  max_steps: 500
  optimizer: LBFGS
  cell_filter: FrechetCellFilter # full cell relax for accurate density
When final_opt: is absent, Stage 7 silently falls back to opt:. This is fine for many workflows but worth knowing if you’re producing publication-quality structures.
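The fallback can be sketched as a dictionary merge in which final_opt values take priority over opt values. This is a minimal illustration, assuming the override works key by key as the comment in the example suggests. The stage7_settings helper is hypothetical and not part of AmorphGen.

```python
def stage7_settings(cfg):
    """Return Stage 7 settings: `opt` values overridden by any `final_opt` keys."""
    return {**cfg.get("opt", {}), **cfg.get("final_opt", {})}


cfg = {
    "opt": {"fmax": 0.05, "max_steps": 200, "optimizer": "LBFGS", "cell_filter": "none"},
    "final_opt": {"fmax": 0.01, "cell_filter": "FrechetCellFilter"},
}

# With final_opt present, only fmax and cell_filter change.
print(stage7_settings(cfg))

# Without final_opt, Stage 7 uses the opt block unchanged.
print(stage7_settings({"opt": cfg["opt"]}))
```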
Tips
Keep YAMLs in version control. They’re tiny and document your protocol.
Mix YAML + CLI for parameter sweeps: a baseline YAML, with the swept variable on the CLI:
for rate in 50 100 200; do
  amorphgen POSCAR --config baseline.yaml --quench-rate "$rate" -o "run_${rate}Kps/"
done
Comment liberally: a # ... after any value explains why you chose it. Reviewers and future you will thank you.
Pre-built examples ship in examples/: full_pipeline.yaml, hybrid_airss_mq.yaml, fast_test.yaml, reference_a_Ga2O3.yaml.
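The sweep from the tips can also be driven from Python, which is convenient when the swept values come from a script. The sketch below only builds and prints the command lines, using the flags shown in the shell loop. The sweep_commands helper is hypothetical.

```python
import shlex


def sweep_commands(rates, structure="POSCAR", config="baseline.yaml"):
    """Build one amorphgen argument list per quench rate (K/ps)."""
    commands = []
    for rate in rates:
        commands.append([
            "amorphgen", structure,
            "--config", config,
            "--quench-rate", str(rate),
            "-o", f"run_{rate}Kps/",
        ])
    return commands


for argv in sweep_commands([50, 100, 200]):
    # To execute instead of printing, pass argv to subprocess.run(argv, check=True).
    print(shlex.join(argv))
```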