From sciagent-skills
Parses/creates crystal structures (CIF, POSCAR), queries Materials Project API for DFT properties, analyzes phase/Pourbaix diagrams, computes XRD patterns, generates VASP/QE/CP2K inputs using pymatgen.
npx claudepluginhub jaechang-hits/sciagent-skills --plugin sciagent-skillsThis skill uses the workspace's default tool permissions.
pymatgen is the standard Python library for materials science computation. Its core data model — `Structure` (periodic crystalline materials) and `Molecule` (non-periodic) — provides a unified representation for input/output across 30+ file formats (CIF, POSCAR/CONTCAR, XYZ, PDB, Gaussian, VASP). The library integrates with the Materials Project REST API (`mp_api`) to retrieve 150,000+ DFT-comp...
Manipulates crystal structures (CIF, POSCAR), computes phase diagrams, band structures, DOS, and integrates with Materials Project using Pymatgen Python library for materials science.
Manipulates crystal structures (CIF, POSCAR), computes phase diagrams, band structures, DOS, and integrates Materials Project API for computational materials science workflows.
Computes crystal structure volumes/densities, analyzes coordination bonding/covalency, determines point groups/symmetry, and handles noble gas/lanthanide chemistry queries.
Share bugs, ideas, or general feedback.
pymatgen is the standard Python library for materials science computation. Its core data model — Structure (periodic crystalline materials) and Molecule (non-periodic) — provides a unified representation for input/output across 30+ file formats (CIF, POSCAR/CONTCAR, XYZ, PDB, Gaussian, VASP). The library integrates with the Materials Project REST API (mp_api) to retrieve 150,000+ DFT-computed structures with band gaps, formation energies, and elastic constants. pymatgen is the foundation of the atomate2 and Custodian workflow frameworks for high-throughput DFT.
pymatgen, mp-api (Materials Project client)PMG_MAPI_KEY env varpip install pymatgen mp-api
# Set API key
export PMG_MAPI_KEY="your_api_key_here"
# Or via pymatgen config
python -c "from pymatgen.core import SETTINGS; SETTINGS['PMG_MAPI_KEY'] = 'your_key'"
from pymatgen.core import Structure, Lattice, Species
# Build silicon diamond cubic structure from scratch
a = 5.431 # Angstroms
lattice = Lattice.cubic(a)
silicon = Structure(
lattice=lattice,
species=["Si", "Si"],
coords=[[0, 0, 0], [0.25, 0.25, 0.25]],
)
print(f"Silicon: {silicon.formula}, {silicon.volume:.2f} ų")
print(f"Space group: {silicon.get_space_group_info()}")
# Silicon: Si2, 40.89 ų
# Space group: ('Fd-3m', 227)
Core data structures for periodic crystals.
from pymatgen.core import Structure, Lattice, Element, Species
import numpy as np
# From lattice parameters
lattice = Lattice.from_parameters(a=4.05, b=4.05, c=4.05,
alpha=90, beta=90, gamma=90)
# Build FCC aluminum
al_fcc = Structure(lattice, ["Al", "Al", "Al", "Al"],
[[0, 0, 0], [0.5, 0.5, 0], [0.5, 0, 0.5], [0, 0.5, 0.5]])
print(f"Formula: {al_fcc.formula}")
print(f"Sites: {len(al_fcc)}")
print(f"Volume: {al_fcc.volume:.3f} ų")
print(f"Density: {al_fcc.density:.3f} g/cm³")
# Access sites
for site in al_fcc:
print(f" {site.species_string} at {site.frac_coords}")
# Load from file
from pymatgen.core import Structure
# From CIF (most common exchange format)
struct = Structure.from_file("material.cif")
# From POSCAR (VASP format)
struct_vasp = Structure.from_file("POSCAR")
# Get neighbors within cutoff
site = struct[0]
neighbors = struct.get_neighbors(site, r=3.0)
print(f"Neighbors within 3 Å: {len(neighbors)}")
for nn in neighbors[:3]:
print(f" {nn.species_string}: {nn.nn_distance:.3f} Å")
Retrieve DFT-computed properties for 150,000+ materials.
from mp_api.client import MPRester
import os
api_key = os.environ.get("PMG_MAPI_KEY", "your_key")
with MPRester(api_key) as mpr:
# Search by chemical system
docs = mpr.materials.summary.search(
chemsys=["Li-Fe-O"],
fields=["material_id", "formula_pretty", "energy_above_hull",
"band_gap", "is_stable"]
)
print(f"Li-Fe-O materials: {len(docs)}")
for d in docs[:5]:
print(f" {d.material_id}: {d.formula_pretty}, "
f"Eg={d.band_gap:.2f} eV, above_hull={d.energy_above_hull:.3f} eV/atom")
# Get specific material by MP ID
with MPRester(api_key) as mpr:
doc = mpr.materials.summary.get_data_by_id(
"mp-149", # Silicon
fields=["structure", "band_gap", "formation_energy_per_atom",
"density", "is_stable", "symmetry"]
)
struct = doc.structure
print(f"Si mp-149: band_gap={doc.band_gap:.3f} eV, "
f"density={doc.density:.3f} g/cm³")
print(f"Space group: {doc.symmetry.symbol}")
from pymatgen.symmetry.analyzer import SpacegroupAnalyzer
from pymatgen.core import Structure
struct = Structure.from_file("material.cif")
# Symmetry analysis
sga = SpacegroupAnalyzer(struct, symprec=0.1)
print(f"Space group: {sga.get_space_group_symbol()} ({sga.get_space_group_number()})")
print(f"Crystal system: {sga.get_crystal_system()}")
print(f"Point group: {sga.get_point_group_symbol()}")
# Get conventional / primitive cell
primitive = sga.get_primitive_standard_structure()
conventional = sga.get_conventional_standard_structure()
print(f"Primitive: {len(primitive)} sites | Conventional: {len(conventional)} sites")
# Wyckoff positions
sym_dataset = sga.get_symmetry_dataset()
print(f"Wyckoff letters: {set(sym_dataset['wyckoffs'])}")
Thermodynamic stability and phase boundary analysis.
from pymatgen.analysis.phase_diagram import PhaseDiagram, PDPlotter
from mp_api.client import MPRester
import os
api_key = os.environ.get("PMG_MAPI_KEY", "your_key")
with MPRester(api_key) as mpr:
# Get all entries in the Li-Fe-P-O chemical system
entries = mpr.get_pourbaix_entries(["Li", "Fe"])
# For phase diagram, use computed entries
with MPRester(api_key) as mpr:
entries = mpr.get_entries_in_chemsys(["Li", "Fe", "O"])
pd = PhaseDiagram(entries)
print(f"Stable phases: {len(pd.stable_entries)}")
# Check stability of a specific composition
from pymatgen.core import Composition
comp = Composition("LiFeO2")
e_hull = pd.get_e_above_hull(pd.qhull_entries[0])
print(f"E above hull: {e_hull:.3f} eV/atom")
# Plot (requires matplotlib)
plotter = PDPlotter(pd, backend="matplotlib")
plotter.show()
from pymatgen.analysis.diffraction.xrd import XRDCalculator
from pymatgen.core import Structure
import matplotlib.pyplot as plt
struct = Structure.from_file("material.cif") # or build programmatically
# Calculate XRD pattern (Cu Kα radiation, λ = 1.5406 Å)
calculator = XRDCalculator(wavelength="CuKa")
pattern = calculator.get_pattern(struct, two_theta_range=(10, 80))
print(f"Diffraction peaks: {len(pattern.x)}")
for two_theta, intensity, hkl in zip(pattern.x[:5], pattern.y[:5], pattern.hkls[:5]):
print(f" 2θ={two_theta:.2f}°, I={intensity:.1f}, hkl={hkl}")
# Plot
fig, ax = plt.subplots(figsize=(10, 4))
ax.bar(pattern.x, pattern.y, width=0.3, color="black")
ax.set_xlabel("2θ (degrees)")
ax.set_ylabel("Intensity (arb. units)")
ax.set_title(f"XRD Pattern — {struct.formula}")
plt.tight_layout()
plt.savefig("xrd_pattern.pdf", bbox_inches="tight")
Generate VASP input sets for DFT calculations.
from pymatgen.io.vasp.sets import MPRelaxSet, MPStaticSet
from pymatgen.core import Structure
struct = Structure.from_file("material.cif")
# Generate VASP relaxation input set (Materials Project standard)
relax_set = MPRelaxSet(struct)
# Write to directory
import os
os.makedirs("vasp_relax", exist_ok=True)
relax_set.write_input("vasp_relax")
print("Generated: POSCAR, INCAR, KPOINTS, POTCAR (requires VASP pseudopotentials)")
# Inspect key INCAR settings
incar = relax_set.incar
print(f"ENCUT: {incar.get('ENCUT')} eV")
print(f"KPOINTS: {relax_set.kpoints}")
# For static calculation after relaxation
static_set = MPStaticSet.from_prev_calc(prev_calc_dir="vasp_relax")
static_set.write_input("vasp_static")
pymatgen Structure stores atomic positions in fractional coordinates (relative to lattice vectors, range 0–1). Convert to/from Cartesian (Angstroms) using struct.lattice.get_cartesian_coords(frac) or struct.lattice.get_fractional_coords(cart). Most file formats use Cartesian; pymatgen converts automatically on read/write.
Composition("LiFePO4") parses chemical formulas. Structure.add_oxidation_state_by_guess() uses bond valence to assign formal charges (+Li, -O, etc.) needed for Pourbaix diagrams and some property calculations.
from mp_api.client import MPRester
from pymatgen.analysis.phase_diagram import PhaseDiagram
import pandas as pd, os
api_key = os.environ.get("PMG_MAPI_KEY", "your_key")
# Screen lithium-transition-metal oxides for stability
systems = [f"Li-{m}-O" for m in ["Mn", "Co", "Ni", "Fe", "V"]]
results = []
with MPRester(api_key) as mpr:
for system in systems:
docs = mpr.materials.summary.search(
chemsys=[system],
fields=["material_id", "formula_pretty", "energy_above_hull",
"band_gap", "is_stable", "formation_energy_per_atom"]
)
for d in docs:
results.append({
"system": system,
"mpid": d.material_id,
"formula": d.formula_pretty,
"e_above_hull": d.energy_above_hull,
"band_gap": d.band_gap,
"stable": d.is_stable,
})
df = pd.DataFrame(results)
stable = df[df["stable"] == True].sort_values("band_gap")
print(f"Stable phases: {len(stable)}/{len(df)}")
print(stable[["formula", "system", "band_gap", "e_above_hull"]].head(10))
stable.to_csv("stability_screen.csv", index=False)
from pymatgen.core import Structure
from pymatgen.transformations.standard_transformations import (
SupercellTransformation, SubstitutionTransformation
)
# Load and analyze structure
struct = Structure.from_file("material.cif")
print(f"Original: {struct.formula}, {len(struct)} sites")
# Create 2×2×2 supercell
sc_matrix = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]
supercell = SupercellTransformation(sc_matrix).apply_transformation(struct)
print(f"Supercell: {len(supercell)} sites")
# Substitute element (e.g., 10% Fe doping on Mn sites)
sub = SubstitutionTransformation({"Mn": {"Mn": 0.9, "Fe": 0.1}})
doped = sub.apply_transformation(struct)
print(f"Doped composition: {doped.composition.reduced_formula}")
# Export in multiple formats
struct.to(filename="output.cif") # CIF
struct.to(filename="POSCAR") # VASP POSCAR
struct.to(filename="output.xyz") # XYZ
print("Exported CIF, POSCAR, XYZ")
| Parameter | Module/Function | Default | Range / Options | Effect |
|---|---|---|---|---|
symprec | SpacegroupAnalyzer | 0.01 | 0.01–0.5 Å | Symmetry detection tolerance; larger = more permissive |
wavelength | XRDCalculator | "CuKa" | "CuKa", "MoKa", float (Å) | X-ray wavelength for diffraction simulation |
two_theta_range | XRDCalculator.get_pattern | (0, 90) | tuple of degrees | Angular range for XRD pattern |
ENCUT | MPRelaxSet INCAR | 520 eV | 300–800 eV | Plane-wave energy cutoff for VASP |
chemsys | MPRester.search | — | ["Li-Fe-O"] | Chemical system filter for Materials Project query |
fields | MPRester.search | all | list of strings | Limit returned fields to reduce API transfer |
r | Structure.get_neighbors | — | 1–8 Å | Neighbor search cutoff radius |
from pymatgen.core import Structure
from pathlib import Path
cif_dir = Path("cif_files")
poscar_dir = Path("poscar_files")
poscar_dir.mkdir(exist_ok=True)
for cif_path in cif_dir.glob("*.cif"):
try:
struct = Structure.from_file(str(cif_path))
out_path = poscar_dir / f"{cif_path.stem}_POSCAR"
struct.to(filename=str(out_path))
print(f"Converted: {cif_path.name} → {out_path.name}")
except Exception as e:
print(f"FAILED {cif_path.name}: {e}")
from mp_api.client import MPRester
import pandas as pd, os
mp_ids = ["mp-149", "mp-2815", "mp-1265"] # Si, GaAs, TiO2
api_key = os.environ.get("PMG_MAPI_KEY", "your_key")
with MPRester(api_key) as mpr:
docs = mpr.materials.summary.get_data_by_ids(
mp_ids, fields=["material_id", "formula_pretty", "band_gap", "is_gap_direct"]
)
df = pd.DataFrame([{
"mpid": d.material_id,
"formula": d.formula_pretty,
"band_gap_eV": d.band_gap,
"direct_gap": d.is_gap_direct,
} for d in docs])
print(df.to_string(index=False))
| Problem | Cause | Solution |
|---|---|---|
APIError: Invalid API key | PMG_MAPI_KEY not set or expired | Set env var: export PMG_MAPI_KEY="your_key"; regenerate key at materialsproject.org |
SpacegroupAnalyzer returns wrong space group | Atom positions have small disorder; symprec too tight | Increase symprec from 0.01 to 0.1; use get_refined_structure() first |
Structure.from_file fails for CIF | Disorder, partial occupancies, or non-standard CIF | Use CifParser(file).get_structures(primitive=False); set occupancy_tolerance=2.0 |
MPRester hangs or times out | Large query returning thousands of results | Add fields=["material_id", "formula_pretty"] to reduce payload; use chunk_size |
| POTCAR not found in VASP input set | VASP pseudopotential library not configured | Run pmg config --add PMG_VASP_PSP_DIR /path/to/potpaw |
| Memory error on large supercell | Supercell has thousands of atoms | Reduce supercell size; use Structure.get_sorted_structure() then write incrementally |
XRDCalculator gives zero peaks | Structure has only 1 site or all-same species | Ensure multi-site structure; check that structure loaded correctly with print(struct) |
pymoo — multi-objective optimization for materials property screening using pymatgen descriptorsautodock-vina-docking — structure preparation workflow analogous to pymatgen for molecular dockingzarr-python — efficient storage of large arrays from MD trajectories or property databases