Utilities
User-facing helper functions used across the pipeline.
Common helpers
MD-dynamics builder, trajectory I/O, config merging, density helper
(amorphgen.utils.common).
amorphgen.utils.common
Shared helpers used across all pipeline stages: cell manipulation, MD dynamics builder, temperature ramps, logging, trajectory I/O, config merging, and snapshot extraction.
Calculator-related functions are in amorphgen.utils.calculators.
- amorphgen.utils.common.compute_density_gcm3(atoms)[source]
Compute density of an Atoms object in g/cm3.
- Return type:
- amorphgen.utils.common.make_cubic(atoms)[source]
Reshape the cell to a cube of equal volume, rescaling atom positions.
- amorphgen.utils.common.build_md_dynamics(atoms, ensemble='NVT', T=300.0, timestep=1.0, friction=0.01, ttime=25.0, pfactor=None, external_stress=0.0, **kwargs)[source]
Create an NVT or NPT ASE dynamics object.
- Parameters:
atoms (ase.Atoms) – Must already have a calculator attached.
ensemble (str) –
"NVT"or"NPT".T (float) – Temperature in Kelvin.
timestep (float) – Time step in fs.
friction (float) – Langevin friction coefficient (for NVT).
ttime (float) – Thermostat time constant in fs (for NPT Nose-Hoover).
pfactor (float, optional) – Barostat coupling (for NPT). If None, auto-calculated.
external_stress (float) – External pressure in GPa (for NPT).
**kwargs – Extra arguments forwarded to the ASE dynamics class.
- Return type:
ASE dynamics object
- amorphgen.utils.common.resolve_ramp(T_start, T_end, T_step)[source]
Generate a list of temperatures for a ramp.
Works for both heating (T_step > 0) and cooling (T_step < 0). Always includes T_end.
- class amorphgen.utils.common.MDLogger(logfile, mode='w')[source]
Bases:
objectPer-step MD logger that writes to both a file and stdout.
Logs step number, time (ps), temperature (K), potential energy (eV), kinetic energy (eV), total energy (eV), and volume (ų).
- class amorphgen.utils.common.TrajectoryWriter(filename, fmt='extxyz')[source]
Bases:
objectUnified trajectory output supporting multiple formats.
Wraps ASE’s write() for extxyz/xyz/lammps-dump and ASE’s Trajectory for .traj binary format.
- amorphgen.utils.common.attach_outputs(dyn, atoms, logfile, trajfile, fmt='extxyz', interval=100)[source]
Attach an MDLogger and TrajectoryWriter to dyn.
Returns (logger, traj_writer) so they can be closed later.
- amorphgen.utils.common.merge_config(defaults, overrides)[source]
Deep-merge overrides into a copy of defaults.
- amorphgen.utils.common.extract_snapshots(traj_file, n_snapshots=20, select='uniform', output_dir='snapshots', burn_in_frames=0, output_format='xyz')[source]
Extract snapshot frames from a trajectory file.
- Parameters:
traj_file (str) – Path to the trajectory file.
n_snapshots (int) – Number of snapshots to extract.
select (str) – Selection strategy:
"uniform"(evenly spaced) or"last"(final n frames).output_dir (str) – Directory for output files.
burn_in_frames (int, default 0) – Number of leading frames to discard before sampling. Useful for skipping the non-equilibrated portion of an MD trajectory (e.g. the first ~50 ps of a 100 ps high-T equilibration). The reported frame indices in the output filenames remain absolute (relative to the original trajectory) so the provenance of each snapshot is preserved.
output_format (str, default
"xyz") – File format for written snapshots. One ofxyz/extxyz(extxyz with full metadata),vasp(POSCAR, sorted by species), orcif. All formats are ASE-readable round-trip.
- Returns:
Paths to extracted snapshot files.
- Return type:
Radii, classification and auto-derivation
Shannon ionic / Cordero covalent / Goldschmidt metallic radii, bonding-type
classification, automatic minsep + density + target-CN derivation, and
charge-balance oxidation-state inference (amorphgen.utils.radii).
These helpers are exercised end-to-end in Tutorial 2
(“Zero-config random structure generation”).
amorphgen.utils.radii
Atomic radii data, bonding classification, minsep calculation, and density estimation for random structure generation.
This module contains: - Shannon ionic radii (CN=4 and CN=6) - Metallic radii (Goldschmidt CN=12) - Pauling electronegativities - Element classification (nonmetal, metalloid) - Elemental solid densities - Bond type classification - Minsep calculation from radii - Cell volume / density estimation
- amorphgen.utils.radii.infer_oxidation_state(sym, composition)[source]
Infer the oxidation state of sym in composition by charge balance.
Sums anion charge from
ANION_CHARGESand assigns the remaining positive charge equally across cations of the queried element when only one cation type is present, or for any element that appears as the only non-anion type. ReturnsNonewhen the inference is ambiguous (multiple cation species, non-integer oxidation state, or no anions at all), letting the caller fall back to a default Shannon entry.
- amorphgen.utils.radii.auto_target_cn(composition)[source]
Auto-detect target coordination numbers and tolerance from composition.
Returns (target_cn, cn_tolerance) tuple.
Rules: - Metalloids (Si, Ge, B): CN=4, tolerance=0 (strict tetrahedral) - Pure Si/Ge: CN=4, tolerance=0 - Metals in nitrides: CN=4, tolerance=0 (wurtzite tetrahedral) - Metals in halides/sulfides: CN=6, tolerance=0 (strict octahedral) - Metals in oxides: CN=5, tolerance=1 (flexible 4-6 range)
- amorphgen.utils.radii.classify_bond(sym_a, sym_b)[source]
Classify a pair of elements by bonding type.
Uses an element-type classification rule (nonmetal / metalloid / metal membership) with a Pauling-electronegativity refinement: pairs that the type rules would call “ionic” but whose Pauling electronegativity difference is small (Δχ < 1.0) are reclassified as “covalent”. This catches III–V semiconductors (GaAs Δχ=0.37, AlAs ~0.6) and related compounds where the chemistry is predominantly covalent and Shannon ionic radii give unrealistically small inter-atomic distances.
Returns one of: “ionic”, “covalent”, “metallic”.
- amorphgen.utils.radii.get_ionic_radius(sym, cn=None, oxidation_state=None)[source]
Get Shannon ionic radius for an element.
- Parameters:
sym (str) – Element symbol.
cn (int, optional) – Coordination number. If provided, use CN-specific radius. Defaults to CN=6.
oxidation_state (int, optional) – Specific oxidation state to look up (e.g.
5for Sb in Sb_2O_5). If supplied and present in the Shannon table the matching radius is returned. If supplied but absent, falls through to the default selection. WhenNone(default), anions return their most-negative state and cations return their highest positive state - usually the right choice for binary oxides / halides / nitrides where the higher state dominates (TiO_2, V_2O_5, WO_3 etc.).
- Return type:
float or None
- amorphgen.utils.radii.get_metallic_radius(sym)[source]
Get metallic radius for an element. Returns None if unavailable.
- amorphgen.utils.radii.get_effective_radius(sym)[source]
Get the most appropriate radius for volume estimation.
Metals -> metallic radius
Anion-formers (O, S, Cl, …) -> Shannon ionic radius
Metalloids (Si, Ge, …) -> metallic radius if available
Fallback -> covalent radius
- amorphgen.utils.radii.default_minsep(symbols, scale=0.85, target_cn=None)[source]
Build a minsep dict for all element pairs.
Uses bonding-type-aware radii. When target_cn is provided, uses CN-specific Shannon radii (e.g. CN=4 for Si in SiO2).
- amorphgen.utils.radii.estimate_density(composition, amorphous_factor=0.8)[source]
Estimate density from elemental solid densities.
Returns density in g/cm3, or None if any element lacks density data.
- amorphgen.utils.radii.estimate_cell_length(composition, target_density=None, packing_factor=None, density_scale=1.0)[source]
Estimate cubic cell length from composition via sphere packing.
A single class-aware sphere-packing model handles all amorphous compositions:
\[V_{\text{cell}} = \frac{\sum_i N_i \cdot \tfrac{4}{3}\pi r_i^3} {f_{\text{pack}} \cdot s},\]where \(r_i\) is the radius for element \(i\) chosen according to the compound’s bonding regime (Shannon ionic for ionic compounds, Cordero covalent for covalent compounds, Goldschmidt metallic for alloys; see
_radius_for_density()). \(f_{\text{pack}}\) is the class-aware packing fraction stored inPACKING_FACTORS, and \(s\) is the user-supplieddensity_scale(default 1.0).If
target_densityis supplied, the cell is sized directly from it anddensity_scale/packing_factorare ignored.
- amorphgen.utils.radii.format_auto_derive_summary(composition, target_cn, minsep, est_density, cell_length)[source]
Single-line summary of the auto-derivation chain for the random_gen log.
Captures the chemistry-informed decisions AmorphGen makes from a bare composition: material class, oxidation states (where inferable), per-cation target coordination, per-pair bond classification + Pauling Δχ + minsep, and the final estimated density / cell length. Designed to fit on one log line for typical 2–4 element systems (e.g. Ga₂O₃ ≈ 200 chars).
- Format:
- [auto-derive] <formula> → <class>, OS{…}, CN{…},
minsep{pair:val class [Δχ=val] | …}, ρ=<val> g/cm³, L=<val> Å
For ionic pairs (where the Pauling Δχ < 1.0 refinement fires) the Δχ value is shown so the user can see exactly why the pair was classified as ionic vs covalent. For metallic and anion-packing pairs the Δχ is omitted (the type rule alone decides those).
Format conversion
Format-conversion helper used by amorphgen --convert and the convert-mode
YAML config (amorphgen.utils.convert).
amorphgen.utils.convert
Public file-format conversion utility.
Convert one structure file or every ASE-readable file in a directory to a target format (xyz / vasp / cif). VASP outputs are sorted by species so the resulting POSCAR is clean.
Usable from CLI (amorphgen --convert PATH --format vasp), Python
API (from amorphgen import convert), or via a convert: block
in a YAML config file (amorphgen --config convert.yaml).
- amorphgen.utils.convert.convert(input_path, output_format='vasp', output_dir=None, sort=True, verbose=True)[source]
Convert a structure file or directory of files to
output_format.- Parameters:
input_path (str) – Path to a single ASE-readable structure file, or to a directory containing one or more such files. Directory globs match
*.xyz,*.extxyz,*.vasp,*.cifandPOSCAR*.output_format (str) – Target format key. Allowed values:
"xyz"/"extxyz"(both write ASE extended XYZ to.xyz),"vasp"(POSCAR to.vasp),"cif".output_dir (str, optional) – Directory to write converted files into. If
None, defaults to"<input>_<format>"for directory inputs, or the parent directory ofinput_pathfor single-file inputs.sort (bool, default True) – For VASP outputs, sort atoms by species so the POSCAR is clean. Ignored for other formats.
verbose (bool, default True) – Print one progress line per file plus a summary footer.
- Returns:
Paths to the converted output files, in input order.
- Return type:
Examples
>>> from amorphgen import convert >>> convert("snapshots/", output_format="vasp", ... output_dir="snapshots_vasp/") ['snapshots_vasp/snapshot_0000_frame00000.vasp', ...]
MD equilibration convergence analysis
Functions for assessing whether an MD equilibration stage has converged
(amorphgen.utils.equilibration). Useful as a post-pipeline quality
check: did the high-T equilibration in your melt-quench actually reach
steady-state, or did you quench from a still-thermalising liquid?
The high-level entry point is convergence_report, which generates
energy-vs-time, block-averaging, MSD, temperature, and time-window-RDF
plots in one call. Lower-level functions are exposed for users who
want to assemble custom diagnostics.
amorphgen.utils.equilibration
Equilibration convergence analysis for AmorphGen MD trajectories.
- Provides functions to assess whether an MD equilibration run has converged:
Energy vs time (running average + drift detection)
Block averaging (quantitative equilibration test)
RDF in time windows (structural convergence)
Mean Square Displacement (diffusion / liquid vs glass)
Coordination number vs time
Temperature vs time
Usage
from amorphgen.utils.equilibration import convergence_report
# Quick all-in-one convergence report report = convergence_report(
“stage4_eq.xyz”, timestep_fs=1.0, T_target=3000,
)
# Or individual analyses fig_energy, drift = plot_energy_convergence(traj, timestep_fs=1.0) is_eq, block_data = block_average_test(traj, n_blocks=4) fig_msd, D_dict = plot_msd(traj, timestep_fs=1.0)
Notes
Two input modes: trajectory file (.xyz/.traj) or log file (.log). Log files are faster (no atomic positions to load) but only support energy, temperature, and block-average analyses. Trajectory files are needed for MSD, RDF, and CN analyses.
Default timestep is 1.0 fs, matching AmorphGen’s default_config.py.
- amorphgen.utils.equilibration.parse_md_log(logfile)[source]
Parse an AmorphGen MD stage log file.
- Expects whitespace-separated columns:
Step Time(ps) T(K) Epot(eV) Ekin(eV) Etot(eV) Vol(A^3)
Lines starting with ‘Step’, ‘-’, or ‘->’ are skipped as headers/markers.
- amorphgen.utils.equilibration.running_average(data, window)[source]
Compute running average with given window size.
- Parameters:
data (numpy.ndarray)
window (int)
- Return type:
- amorphgen.utils.equilibration.extract_energies(source, n_atoms=None)[source]
Extract potential energies (eV) and per-atom energies.
- Parameters:
- Returns:
energies (np.ndarray — total potential energy (eV))
energies_per_atom (np.ndarray — energy per atom (eV/atom))
- Return type:
- amorphgen.utils.equilibration.plot_energy_convergence(source, timestep_fs=1.0, window_ps=0.5, per_atom=True, n_atoms=None, ax=None)[source]
Plot potential energy vs time with running average and linear drift.
- Parameters:
- Returns:
fig (matplotlib Figure)
drift_eV_per_ps (float) – Linear drift in energy. Should be ~0 if equilibrated.
- amorphgen.utils.equilibration.block_average_test(source, n_blocks=4, discard_fraction=0.1, n_atoms=None)[source]
Split trajectory into blocks and compare mean energies.
A system is considered equilibrated if the block means agree within 2x the standard error of the mean (SEM).
- Parameters:
- Returns:
is_equilibrated (bool)
block_data (dict) – Keys: block_means, overall_mean, overall_std, sem, max_deviation, threshold, is_equilibrated.
- Return type:
- amorphgen.utils.equilibration.plot_block_averages(source, n_blocks=4, discard_fraction=0.1, timestep_fs=1.0, n_atoms=None, ax=None)[source]
Visualise block averaging: block means vs overall mean +/- 2*SEM.
- amorphgen.utils.equilibration.compute_msd(traj, timestep_fs=1.0, by_element=True)[source]
Compute MSD from trajectory using unwrapped positions.
Handles both orthorhombic and non-orthorhombic cells via fractional coordinate unwrapping.
- Parameters:
- Returns:
time_ps (np.ndarray)
msd_dict (dict — keys are element symbols (+ “all”),) – values are MSD arrays (A^2).
- Return type:
- amorphgen.utils.equilibration.plot_msd(traj, timestep_fs=1.0, ax=None)[source]
Plot MSD vs time per element. Fits diffusion coefficient D.
D is fitted from the linear regime (last 50% of trajectory) using MSD = 6*D*t (3D diffusion).
- Returns:
fig (matplotlib Figure)
D_dict (dict — diffusion coefficients (cm^2/s) per element.) – D > 1e-6 cm^2/s indicates liquid/diffusive behaviour. D < 1e-6 cm^2/s indicates frozen/glass behaviour.
- Parameters:
timestep_fs (float)
- amorphgen.utils.equilibration.plot_temperature(source, timestep_fs=1.0, T_target=None, ax=None)[source]
Plot instantaneous temperature vs time.
- amorphgen.utils.equilibration.plot_rdf_time_windows(traj, pairs=None, n_windows=4, rmax=4.0, nbins=100, timestep_fs=1.0)[source]
Overlay partial RDFs from different time windows.
If the RDFs from all windows overlap, the structure is equilibrated.
- amorphgen.utils.equilibration.compute_cn_vs_time(traj, centre, neighbour, cutoff=None, window_size=50, timestep_fs=1.0)[source]
Compute average coordination number vs time using a sliding window.
- Parameters:
- Return type:
- amorphgen.utils.equilibration.plot_cn_vs_time(traj, pairs, cutoffs=None, window_size=50, timestep_fs=1.0, ax=None)[source]
Plot CN vs time for multiple centre-neighbour pairs.
- amorphgen.utils.equilibration.convergence_report(source, timestep_fs=1.0, T_target=None, n_atoms=None, pairs_rdf=None, pairs_cn=None, cn_cutoffs=None, rmax=4.0, n_blocks=4, output_dir=None, prefix='convergence')[source]
Generate a comprehensive convergence report.
- Parameters:
source (str, list of Atoms, or Trajectory) – Trajectory file (.xyz/.traj), log file (.log), or list of Atoms. Log files are faster but only provide energy/temperature (no MSD, RDF, or CN analysis).
timestep_fs (float) – MD timestep in femtoseconds (default 1.0).
T_target (float, optional) – Target temperature (K) for temperature check.
n_atoms (int, optional) – Required if source is a .log file.
pairs_rdf (list of (str, str), optional) – Element pairs for RDF windows. Auto-detected if None.
pairs_cn (list of (str, str, float), optional) – Element pairs + expected CN for CN tracking. e.g. [(“Si”, “O”, 4.0), (“O”, “Si”, 2.0)]
cn_cutoffs (list of float, optional) – Bond cutoffs for CN pairs. None = auto from covalent radii.
rmax (float) – Max distance for RDF (must not exceed half cell length).
n_blocks (int) – Number of blocks for block average test.
output_dir (str, optional) – If provided, save all plots as PNG and a text report here.
prefix (str) – Filename prefix for saved plots.
- Returns:
report – Keys include: n_frames, n_atoms, total_time_ps, elements, energy_drift_eV_per_atom_per_ps, block_test_passed, block_data, diffusion_coefficients_cm2_s, summary_text.
- Return type:
Other utility modules
Calculators — multi-backend calculator factory (
get_calculator).amorphgen.utils.classical— Lennard-Jones and Buckingham+Coulomb calculators; loaded viaget_calculator("lennard-jones" | "buckingham", ...).