Organizes PyTorch code into LightningModules, configures Trainers for multi-GPU/TPU, builds data pipelines and callbacks, and runs distributed training (DDP, FSDP, DeepSpeed). Use when structuring training loops or scaling neural-network training.
npx claudepluginhub alterlab-ieu/alterlab-academic-skills --plugin alterlab-writing-tools
Slash command: /alterlab-writing-tools:alterlab-pytorch-lightning
PyTorch Lightning is a deep learning framework that organizes PyTorch code to eliminate boilerplate while maintaining full flexibility. It automates training workflows and multi-device orchestration, and it applies best practices for training and scaling neural networks across multiple GPUs/TPUs.
This skill should be used when:
Organize PyTorch models into six logical sections:
- __init__() and setup()
- training_step(batch, batch_idx)
- validation_step(batch, batch_idx)
- test_step(batch, batch_idx)
- predict_step(batch, batch_idx)
- configure_optimizers()
Quick template reference: See scripts/template_lightning_module.py for a complete boilerplate.
Detailed documentation: Read references/lightning_module.md for comprehensive method documentation, hooks, properties, and best practices.
The Trainer automates the training loop, device management, gradient operations, and callbacks. Key features:
Quick setup reference: See scripts/quick_trainer_setup.py for common Trainer configurations.
Detailed documentation: Read references/trainer.md for all parameters, methods, and configuration options.
Encapsulate all data processing steps in a reusable class:
- prepare_data() - Download and process data (single-process)
- setup() - Create datasets and apply transforms (per-GPU)
- train_dataloader() - Return training DataLoader
- val_dataloader() - Return validation DataLoader
- test_dataloader() - Return test DataLoader
Quick template reference: See scripts/template_datamodule.py for a complete boilerplate.
Detailed documentation: Read references/data_module.md for method details and usage patterns.
Add custom functionality at specific training hooks without modifying your LightningModule. Built-in callbacks include:
Detailed documentation: Read references/callbacks.md for built-in callbacks and custom callback creation.
Integrate with multiple logging platforms:
Log metrics using self.log("metric_name", value) in any LightningModule method.
Detailed documentation: Read references/logging.md for logger setup and configuration.
Choose the right strategy based on model size:
Configure with: Trainer(strategy="ddp", accelerator="gpu", devices=4)
Detailed documentation: Read references/distributed_training.md for strategy comparison and configuration.
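The choice could be expressed as a small helper like the one below. The function trainer_kwargs_for is hypothetical, and the 500M-parameter cutoff between DDP and FSDP is an assumed rule of thumb rather than a hard limit; DeepSpeed is left out to keep the sketch short.

```python
def trainer_kwargs_for(num_params: int, num_gpus: int) -> dict:
    """Return hypothetical Trainer keyword arguments for a given model size."""
    if num_gpus <= 1:
        # A single device needs no distributed strategy.
        return {"accelerator": "auto", "devices": 1, "strategy": "auto"}
    # ASSUMPTION: 500M parameters is a rule-of-thumb cutoff, not a hard limit.
    strategy = "ddp" if num_params < 500_000_000 else "fsdp"
    return {"accelerator": "gpu", "devices": num_gpus, "strategy": strategy}
```

The result can be unpacked into the constructor, for example Trainer(**trainer_kwargs_for(25_000_000, 4)), which matches the DDP configuration shown above.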
- Use self.device instead of .cuda()
- Call self.save_hyperparameters() in __init__()
- Use self.log() for automatic aggregation across devices
- Use seed_everything() and Trainer(deterministic=True) for reproducibility
- Use Trainer(fast_dev_run=True) to test with 1 batch
Detailed documentation: Read references/best_practices.md for common patterns and pitfalls.
Define model:
import lightning as L
import torch
import torch.nn.functional as F

class MyModel(L.LightningModule):
    def __init__(self):
        super().__init__()
        self.save_hyperparameters()
        self.model = YourNetwork()  # placeholder: any torch.nn.Module

    def training_step(self, batch, batch_idx):
        x, y = batch
        loss = F.cross_entropy(self.model(x), y)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters())
Prepare data:
from torch.utils.data import DataLoader

# Option 1: Direct DataLoaders
train_loader = DataLoader(train_dataset, batch_size=32)

# Option 2: LightningDataModule (recommended for reusability)
dm = MyDataModule(batch_size=32)
Train:
model = MyModel()
trainer = L.Trainer(max_epochs=10, accelerator="gpu", devices=2)
trainer.fit(model, train_loader)  # or trainer.fit(model, datamodule=dm)
Executable Python templates for common PyTorch Lightning patterns:
- template_lightning_module.py - Complete LightningModule boilerplate
- template_datamodule.py - Complete LightningDataModule boilerplate
- quick_trainer_setup.py - Common Trainer configuration examples
Detailed documentation for each PyTorch Lightning component:
- lightning_module.md - Comprehensive LightningModule guide (methods, hooks, properties)
- trainer.md - Trainer configuration and parameters
- data_module.md - LightningDataModule patterns and methods
- callbacks.md - Built-in and custom callbacks
- logging.md - Logger integrations and usage
- distributed_training.md - DDP, FSDP, DeepSpeed comparison and setup
- best_practices.md - Common patterns, tips, and pitfalls