Structure analysis

The amorphgen.analysis module provides ensemble structural analysis for amorphous structure files: pair distribution functions, coordination numbers, bond angles, ring statistics, Voronoi metrics, energy ranking, and validation against literature reference ranges.

The CLI entry point is amorphgen --analyse <FILE_OR_DIR>; the Python API is the StructureAnalyser class.

StructureAnalyser

class amorphgen.analysis.StructureAnalyser(source, cutoff='auto')

Bases: object

Analyse amorphous structures: density, coordination, distances, angles, RDF, S(q), rings, Voronoi, energy ranking.

Parameters:
  • source (str, list of str, or list of Atoms) – Path to a structure file, a directory, a list of file paths, or a list of ASE Atoms objects.

  • cutoff (float, dict, or str) –

    • float: single cutoff (Å) for all pairs

    • dict: pair-specific, e.g. {"Si-O": 2.2}

    • "auto": from bonding radii (fast)

    • "auto-rdf": from RDF first minimum (accurate)

__init__(source, cutoff='auto')
Parameters:
  • source (str, list of str, or list of Atoms) – Path to a structure file, a directory of structure files, a list of file paths, or a list of ASE Atoms objects.

  • cutoff (float, dict, or str) – Neighbour cutoff for CN, bond distances, and angles.

    • float: single cutoff (Å) for all pairs

    • dict: pair-specific, e.g. {"Si-O": 2.2, "O-O": 2.8}

    • "auto": from bonding radii (fast, default)

    • "auto-rdf": from RDF first minimum (more accurate)
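The float and dict forms of cutoff can be pictured with a small lookup helper. This is only an illustrative sketch: resolve_cutoff is a hypothetical name that is not part of amorphgen, and treating "Si-O" and "O-Si" as the same key is an assumption rather than documented behaviour.

```python
def resolve_cutoff(cutoff, a, b):
    """Return the cutoff (in Å) for the element pair a-b, or None if unspecified.

    Hypothetical helper for illustration only; not part of amorphgen.
    """
    if isinstance(cutoff, (int, float)):
        # A single number applies to every pair.
        return float(cutoff)
    if isinstance(cutoff, dict):
        # Assumption: pair keys are order-insensitive ("Si-O" == "O-Si").
        for key in (f"{a}-{b}", f"{b}-{a}"):
            if key in cutoff:
                return float(cutoff[key])
        return None
    # "auto" and "auto-rdf" need the structures themselves, so they are
    # outside the scope of this sketch.
    raise ValueError(f"cannot resolve {cutoff!r} without structure data")


pair_cutoffs = {"Si-O": 2.2, "O-O": 2.8}
print(resolve_cutoff(2.5, "Si", "O"))           # single value for all pairs
print(resolve_cutoff(pair_cutoffs, "O", "Si"))  # pair-specific lookup
```

A dict is the natural choice when different pairs have clearly different first-neighbour shells; a single float is simpler for one-component systems.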

density()

Compute density (g/cm³) for each structure.

Returns:

{"values": list[float], "mean": float, "std": float}

Return type:

dict
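As a possible use of this return value, the sketch below flags structures whose density lies far from the ensemble mean. The helper name density_outliers and the example numbers are made up for illustration, and it assumes that "values" is ordered like the input structures.

```python
def density_outliers(result, n_sigma=2.0):
    """Indices of structures more than n_sigma standard deviations from the mean.

    `result` is a dict shaped like the one returned by density().
    """
    mean, std = result["mean"], result["std"]
    if std == 0:
        # All densities identical: nothing can be an outlier.
        return []
    return [i for i, rho in enumerate(result["values"])
            if abs(rho - mean) > n_sigma * std]


# Illustrative numbers only, not real amorphgen output.
example = {"values": [2.20, 2.21, 2.19, 2.20, 2.60], "mean": 2.28, "std": 0.16}
print(density_outliers(example, n_sigma=1.5))
```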

coordination(pair=None)

Compute coordination numbers across all structures.

Parameters:

pair (str, optional) – Pair to analyse, e.g. "Si-O". If None, all pairs are computed.

Returns:

Keyed by pair string. Each value is a dict with "mean", "std", "min", "max", "distribution", "total_atoms".

Return type:

dict

bond_distances(pair=None)

Compute bond distance statistics.

Parameters:

pair (str, optional) – Pair to analyse, e.g. "Si-O". If None, all pairs are computed.

Returns:

Keyed by pair string. Each value is a dict with "mean", "std", "min", "max", "count".

Return type:

dict

bond_angles(triplet=None)

Compute bond angle statistics.

Parameters:

triplet (str, optional) – Triplet to analyse, e.g. "O-Si-O". If None, all triplets are computed.

Returns:

Keyed by triplet string. Each value is a dict with "mean", "std", "min", "max", "count" (angles in degrees).

Return type:

dict
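The three short-range methods above return dicts of the same shape, so they can be combined into one compact listing. The following is a minimal sketch, not a replacement for summary(): short_range_report is a hypothetical helper, it only relies on the documented "mean" and "std" keys, and it assumes distances are reported in Å.

```python
def short_range_report(analyser):
    """Collect 'mean ± std' lines for CN, bond distances and bond angles.

    `analyser` is any object exposing coordination(), bond_distances()
    and bond_angles() as documented above.
    """
    sections = [
        ("CN", analyser.coordination(), ""),
        ("distance", analyser.bond_distances(), " Å"),  # unit is an assumption
        ("angle", analyser.bond_angles(), "°"),
    ]
    lines = []
    for label, result, unit in sections:
        for key in sorted(result):
            stats = result[key]
            lines.append(
                f"{label:<9}{key:<8}{stats['mean']:.3f} ± {stats['std']:.3f}{unit}"
            )
    return lines
```

With the real class this would be called as short_range_report(StructureAnalyser("structures/")), where structures/ is a placeholder directory name.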

rdf(pair=None, rmax=None, nbins=200, sigma=0.0)

Compute the radial distribution function g(r).

Parameters:
  • pair (str, optional) – Pair to analyse, e.g. "Si-O". If None, the total RDF is computed.

  • rmax (float, optional) – Maximum radius in Å. Auto-detected from the cell if None.

  • nbins (int) – Number of histogram bins (default 200).

  • sigma (float) – Gaussian smearing width in Å (default 0.0 = no smearing). Use 0.02-0.05 Å for comparison with experiment.

Returns:

{"r": list[float], "g_r": list[float]}

Return type:

dict
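The "r" and "g_r" lists can be post-processed directly. The sketch below locates the first peak and the first minimum after it, which is the quantity the "auto-rdf" cutoff is described as using. It is a simple illustration on synthetic data, not necessarily how amorphgen picks the minimum; first_peak_and_minimum and the threshold parameter are hypothetical.

```python
import math


def first_peak_and_minimum(r, g_r, threshold=1.0):
    """Return (r_peak, r_min) for the first peak above `threshold`
    and the first local minimum that follows it (None if not found)."""
    peak = None
    for i in range(1, len(g_r) - 1):
        if peak is None:
            # First local maximum that rises above the threshold.
            if g_r[i] > threshold and g_r[i - 1] < g_r[i] >= g_r[i + 1]:
                peak = i
        elif g_r[i - 1] > g_r[i] <= g_r[i + 1]:
            # First local minimum after that peak.
            return r[peak], r[i]
    return (r[peak], None) if peak is not None else (None, None)


# Synthetic two-peak curve standing in for the output of rdf().
r = [0.1 * i for i in range(1, 41)]
g_r = [5.0 * math.exp(-0.5 * ((x - 1.6) / 0.1) ** 2)
       + 2.0 * math.exp(-0.5 * ((x - 2.6) / 0.15) ** 2) for x in r]
print(first_peak_and_minimum(r, g_r))
```

On noisy histograms a small sigma in rdf() helps, because unsmoothed bins can produce spurious local extrema.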

structure_factor(pair=None, qmax=15.0, nq=300, rmax=None)

Compute the structure factor S(q) from g(r) via Fourier transform.

Parameters:
  • pair (str, optional) – Pair to analyse. If None, the total S(q) is computed.

  • qmax (float) – Maximum q in Å⁻¹ (default 15.0).

  • nq (int) – Number of q points (default 300).

  • rmax (float, optional) – Maximum radius for the RDF used in the transform.

Returns:

{"q": list[float], "s_q": list[float]}

Return type:

dict
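For orientation, the standard isotropic relation between the two functions is S(q) = 1 + 4πρ ∫ r² [g(r) − 1] sin(qr)/(qr) dr, with ρ the number density. The sketch below evaluates it with the trapezoid rule. It is a textbook illustration under stated assumptions: structure_factor_from_rdf is a hypothetical name, and the normalisation or windowing actually used by amorphgen is not specified in this reference.

```python
import math


def structure_factor_from_rdf(r, g_r, rho, q_values):
    """S(q) = 1 + 4*pi*rho * integral of r^2 (g(r) - 1) sin(qr)/(qr) dr.

    r, g_r   : RDF grid and values
    rho      : number density in atoms per Å^3
    q_values : q points in Å^-1
    """
    s_q = []
    for q in q_values:
        # Integrand on the r grid; sin(x)/x tends to 1 as x tends to 0.
        f = []
        for ri, gi in zip(r, g_r):
            x = q * ri
            sinc = math.sin(x) / x if x > 1e-12 else 1.0
            f.append(ri * ri * (gi - 1.0) * sinc)
        # Trapezoid rule over the (possibly non-uniform) r grid.
        integral = sum(0.5 * (f[i] + f[i + 1]) * (r[i + 1] - r[i])
                       for i in range(len(r) - 1))
        s_q.append(1.0 + 4.0 * math.pi * rho * integral)
    return s_q


r = [0.05 * i for i in range(1, 201)]   # 0.05 .. 10.0 Å
g_flat = [1.0] * len(r)                 # structureless reference: g(r) = 1
q = [0.5 * k for k in range(1, 31)]     # 0.5 .. 15.0 Å^-1
# The value of rho is arbitrary here; S(q) stays at 1 whenever g(r) = 1.
print(structure_factor_from_rdf(r, g_flat, rho=0.066, q_values=q)[:3])
```

Because the integral is cut off at rmax, a real g(r) that has not yet decayed to 1 produces termination ripples at low q, which is one reason rmax is exposed as a parameter.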

averaged_rdf(pair=None, rmax=None, nbins=200)

Compute RDF per structure with mean and standard deviation.

Parameters:
  • pair (str, optional) – Pair to analyse, e.g. "Si-O". If None, the total RDF is computed.

  • rmax (float, optional) – Maximum radius in Å. Auto-detected from the cell if None.

  • nbins (int) – Number of histogram bins (default 200).

Returns:

{"r": list, "g_r_mean": list, "g_r_std": list, "n_structures": int}

Return type:

dict
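The averaging step can be pictured as a bin-wise mean and standard deviation over per-structure curves that share one r grid. This is a minimal sketch of that idea: average_curves is a hypothetical helper, and the use of the population standard deviation is an assumption, since the reference does not say which estimator is used.

```python
from statistics import fmean, pstdev


def average_curves(curves):
    """Bin-wise mean and standard deviation of equally binned g(r) curves."""
    if len({len(c) for c in curves}) != 1:
        raise ValueError("all curves must share the same r grid")
    columns = list(zip(*curves))  # one tuple of values per r bin
    return {
        "g_r_mean": [fmean(col) for col in columns],
        "g_r_std": [pstdev(col) for col in columns],  # population std (assumption)
        "n_structures": len(curves),
    }


# Two toy curves on a two-bin grid.
print(average_curves([[0.0, 1.0], [2.0, 1.0]]))
```

The standard deviation per bin is what makes an ensemble comparison meaningful: a feature smaller than the spread between structures is not resolved by the ensemble.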

ring_statistics(bond_pair=None, cutoff=None, max_ring=12)

Compute ring size statistics for network-forming structures.

Parameters:
  • bond_pair (tuple of str, optional) – Bond pair to trace, e.g. ("Si", "O"). Auto-detected if None.

  • cutoff (float or dict, optional) – Bond cutoff. Uses analyser cutoff if None.

  • max_ring (int) – Maximum ring size to search (default 12).

Returns:

{"ring_sizes": list, "counts": list, "fractions": list, "bond_pair": tuple, "total_rings": int}

Return type:

dict
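A quick way to inspect this result is a text histogram. The sketch below assumes that "ring_sizes", "counts" and "fractions" are parallel lists and that fractions lie between 0 and 1; neither is stated explicitly above. ring_histogram and the example numbers are illustrative only.

```python
def ring_histogram(result, width=20):
    """Format ring statistics as one text line per ring size."""
    lines = []
    rows = zip(result["ring_sizes"], result["counts"], result["fractions"])
    for size, count, frac in rows:
        bar = "#" * round(frac * width)  # bar length proportional to the fraction
        lines.append(f"{size:>2}-ring {count:>5} {frac:6.1%} {bar}")
    return lines


# Illustrative numbers, shaped like the dict returned by ring_statistics().
example = {"ring_sizes": [4, 5, 6], "counts": [2, 3, 5],
           "fractions": [0.2, 0.3, 0.5],
           "bond_pair": ("Si", "O"), "total_rings": 10}
print("\n".join(ring_histogram(example)))
```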

voronoi(element=None)

Compute Voronoi tessellation indices (n3, n4, n5, n6).

Parameters:

element (str, optional) – Element to analyse. If None, all atoms included.

Returns:

{"indices": list, "distribution": dict, "top_10": list, "mean_faces": float, "total_atoms": int}

Return type:

dict

energy_ranking()

Rank structures by potential energy per atom.

Reads energy from atoms.info or an attached calculator.

Returns:

{"energies_per_atom": dict, "ranking": list, "best": int, "worst": int, "best_energy": float, "worst_energy": float, "spread": float}. Returns a "warning" key if there is no energy data.

Return type:

dict
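Since the result may carry a "warning" key instead of usable energies, calling code should check for it before reading "best". A minimal sketch, with best_structure as a hypothetical wrapper name:

```python
def best_structure(analyser):
    """Return (index, energy_per_atom) of the lowest-energy structure,
    or None when energy_ranking() reports that no energy data exists."""
    result = analyser.energy_ranking()
    if "warning" in result:
        # No energies in atoms.info or on a calculator: nothing to rank.
        print(f"energy ranking skipped: {result['warning']}")
        return None
    return result["best"], result["best_energy"]
```

This pattern is useful for VASP or CIF inputs that carry no energies, where the log-based ranking described later in this section is the alternative.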

averaged_cn(pair=None)

Compute per-structure mean CN with overall mean and std.

Parameters:

pair (str, optional) – Pair to analyse, e.g. "Si-O". If None, all pairs are computed.

Returns:

Keyed by pair string. Each value is a dict with "mean_per_structure", "overall_mean", "overall_std", "n_structures".

Return type:

dict

summary(show_angles=True)

per_structure_summary()

Analyse each structure individually and produce a comparison table.

Return type:

str: formatted table (also printed).

save_report(filepath, text=None, show_angles=True)

Save the summary report to a text file.

Parameters:
  • filepath (str) – Output file path.

  • text (str, optional) – Pre-computed report text. If None, calls summary().

  • show_angles (bool) – Include bond angles in the report (default True).

plot(**kwargs)

Generate and save analysis plots (RDF, CN distribution, bond angles).

Parameters:
  • output_dir (str) – Directory for output files (default ".").

  • prefix (str) – Filename prefix (default "analysis").

  • rdf_pairs (list of str, optional) – Pairs to plot, e.g. ["Si-O", "O-O"]. If None, all pairs.

  • rmax (float, optional) – Maximum radius for RDF. Auto-detected if None.

  • save_csv (bool) – Also save raw data as CSV files (default True).
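The report and plot methods can be combined into one output step. The sketch below uses only the documented keyword arguments; write_outputs and the file name report.txt are hypothetical, and the directory is created up front because this reference does not say whether plot() creates it.

```python
import os


def write_outputs(analyser, output_dir, pairs=None):
    """Write the text report and the plots for one analyser into output_dir."""
    # Create the directory first; whether plot() does so itself is not documented.
    os.makedirs(output_dir, exist_ok=True)
    analyser.save_report(os.path.join(output_dir, "report.txt"))  # hypothetical name
    analyser.plot(output_dir=output_dir, prefix="analysis",
                  rdf_pairs=pairs, save_csv=True)
    # Return what ended up on disk so the caller can log or check it.
    return sorted(os.listdir(output_dir))
```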

Reference-validation helpers

For comparing computed metrics against literature ranges (used by amorphgen --analyse --reference REF.yaml):

Validate computed structural metrics against literature reference ranges.

A reference YAML lists expected ranges for density, bond distances, mean coordination numbers, and bond angle means. Each metric is compared to the analyser’s computed value and labelled “match” / “concern” / “fail” so the user can defend an ensemble against published data.

amorphgen.analysis.validate.validate_against_reference(analyser, reference)

Compare analyser output to a reference dict (loaded from YAML).

Parameters:
  • analyser (StructureAnalyser)

  • reference (dict) – Parsed YAML with optional keys: density, bond_distances, coordination, bond_angles (see examples/reference_*.yaml).

Returns:

{"system": str, "sources": list[str], "rows": list[tuple]} where each row is (descriptor, computed, expected_lo, expected_hi, units, verdict).

Return type:

dict
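The "rows" tuples lend themselves to simple post-processing, for example counting verdicts or listing failures. A minimal sketch with hypothetical helper names and made-up example rows; only the documented row layout is assumed:

```python
from collections import Counter


def verdict_counts(result):
    """Count rows per verdict in the dict from validate_against_reference()."""
    # Row layout: (descriptor, computed, expected_lo, expected_hi, units, verdict)
    return Counter(row[5] for row in result["rows"])


def failing_rows(result):
    """Rows whose verdict is "fail"."""
    return [row for row in result["rows"] if row[5] == "fail"]


# Illustrative rows only; descriptors and numbers are invented for the example.
example = {
    "system": "example",
    "sources": [],
    "rows": [
        ("density", 2.21, 2.18, 2.22, "g/cm3", "match"),
        ("Si-O distance", 1.70, 1.60, 1.63, "A", "fail"),
        ("O-Si-O angle", 108.0, 108.5, 110.0, "deg", "concern"),
    ],
}
print(dict(verdict_counts(example)))
print([row[0] for row in failing_rows(example)])
```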

amorphgen.analysis.validate.format_validation_report(result)

Render the dict from validate_against_reference() as a printable table.

Energy ranking helpers

For parsing random_gen.log and ranking generated structures by total energy (used by amorphgen --rank-from-log LOG):

Energy ranking for multiple structures.

amorphgen.analysis.energy.compute_energy_ranking(atoms_list)

Rank structures by potential energy.

Reads energy from atoms.info or calculator.

amorphgen.analysis.energy.rank_from_log(logfile)

Parse a random-gen log file and rank structures by total energy.

The relax loop in batch_random() prints final energy on the last optimizer step row. This function reads those rows directly, so energy ranking works for VASP/CIF outputs that don’t store energy.

Parameters:

logfile (str) – Path to random_gen.log.

Returns:

{"rows": [(idx, energy, e_per_atom, fmax, n_steps, status), ...] sorted by e_per_atom ascending, "n_atoms": int, "best": idx, "worst": idx, "spread_meV_per_atom": float}.

Return type:

dict
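One use of the sorted "rows" is to keep only structures within a small energy window of the best one. The sketch below assumes e_per_atom is given in eV, which is suggested by the separate "spread_meV_per_atom" key but not stated; low_energy_window, the example rows and the status string are illustrative.

```python
def low_energy_window(result, window_meV=10.0):
    """Indices of structures within window_meV per atom of the best one."""
    rows = result["rows"]  # documented as sorted by e_per_atom ascending
    if not rows:
        return []
    e_best = rows[0][2]
    # Assumption: e_per_atom is in eV, hence the factor 1000 to get meV.
    return [idx for idx, _energy, e_pa, *_rest in rows
            if (e_pa - e_best) * 1000.0 <= window_meV]


# Illustrative rows: (idx, energy, e_per_atom, fmax, n_steps, status)
example = {"rows": [
    (3, -800.0, -8.000, 0.01, 50, "converged"),
    (1, -799.5, -7.995, 0.02, 60, "converged"),
    (2, -798.0, -7.980, 0.03, 80, "converged"),
]}
print(low_energy_window(example, window_meV=10.0))
```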

amorphgen.analysis.energy.format_log_ranking(result, logfile=None)

Render the dict from rank_from_log() as a printable table.

Submodule reference

StructureAnalyser delegates to focused submodules; advanced users can import these directly:

Submodule            Provides
analysis.rdf         Pair distribution function g(r), partial RDFs
analysis.structure   Coordination numbers, bond distances, bond angles
analysis.rings       Ring statistics (King’s shortest-path)
analysis.voronoi     Voronoi cell volumes and connectivity
analysis.energy      Total-energy parsing and ranking
analysis.cutoff      Bond-cutoff selection from g(r) first minimum
analysis.plotting    Publication-quality matplotlib helpers
analysis.validate    Reference-YAML validation