From curry-train
Compose multiple microbatch forward/backward passes into a single optimizer step, enabling effective batch sizes larger than memory permits. Activate when the user mentions "gradient accumulation", "accumulate gradients", "effective batch size", or "OOM at larger batch", or asks how to set the GA factor.
npx claudepluginhub curryfromuestc/curry-train --plugin curry-train
This skill uses the workspace's default tool permissions.
Microbatch schedule for one optimizer step. The only fully-implemented primitive in V1; serves as the model for the others.
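For the "how to set the GA factor" question, a minimal sketch follows. It assumes the usual relation effective batch = microbatch size × K × number of data-parallel replicas. The helper name `accumulation_steps` and its parameters are hypothetical and are not part of curry_train.

```python
import math


def accumulation_steps(target_batch: int, micro_batch: int, world_size: int = 1) -> int:
    """Smallest K such that micro_batch * world_size * K >= target_batch.

    Hypothetical helper for illustration only; not a curry_train API.
    """
    if target_batch <= 0 or micro_batch <= 0 or world_size <= 0:
        raise ValueError("all sizes must be positive")
    samples_per_pass = micro_batch * world_size  # samples seen per forward/backward pass
    return max(1, math.ceil(target_batch / samples_per_pass))


# Example: the largest microbatch that fits in memory is 8, on 4 replicas.
k = accumulation_steps(target_batch=256, micro_batch=8, world_size=4)
effective_batch = 8 * 4 * k
print(k, effective_batch)
```

When the target is not an exact multiple of the per-pass sample count, rounding up makes the effective batch slightly larger than requested; rounding down is an equally reasonable choice if the target is an upper bound.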
Splits a target effective batch size into K microbatches. Per microbatch:
- zero_grad only on microbatch 0.
- Loss scaled by 1/K to keep gradient magnitude consistent.
- step_optimizer only on microbatch K-1.
from curry_train.primitives import GradientAccumulation
ga = GradientAccumulation(steps=K)
ga.steps # K
ga.enabled # K > 1
ga.is_first(i), ga.is_last(i) # microstep i is first/last
ga.train_step_kwargs(i) # dict of {zero_grad, sync_gradients,
# step_optimizer, loss_divisor,
# collect_metrics}
ga.from_config(cfg, key="gradient_accumulation_steps") # build from Hydra
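To make the per-microbatch schedule concrete, here is a standalone sketch of what the dict from `train_step_kwargs(i)` might contain. It is not the library's code: the function name `sketch_step_kwargs` is hypothetical, and the values chosen for `sync_gradients`, `loss_divisor`, and `collect_metrics` are assumptions inferred from the key names above.

```python
from typing import Any, Dict, List


def sketch_step_kwargs(i: int, k: int) -> Dict[str, Any]:
    """Hypothetical stand-in for ga.train_step_kwargs(i) with steps=k."""
    if not 0 <= i < k:
        raise ValueError(f"microstep {i} is out of range for {k} steps")
    is_first = i == 0
    is_last = i == k - 1
    return {
        "zero_grad": is_first,       # clear gradients once, before the first microbatch
        "sync_gradients": is_last,   # assumption: sync across replicas only on the last microbatch
        "step_optimizer": is_last,   # apply the update once, after the last microbatch
        "loss_divisor": k,           # assumption: the loss is divided by K, i.e. scaled by 1/K
        "collect_metrics": is_last,  # assumption: report metrics once per optimizer step
    }


# Print the schedule for K = 4 microbatches.
schedule: List[Dict[str, Any]] = [sketch_step_kwargs(i, 4) for i in range(4)]
for i, flags in enumerate(schedule):
    print(i, flags)
```

With K = 1 the first and last microbatch coincide, so every flag is true and the loop degenerates to an ordinary training step.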
- The sync_gradients flag is what train_step_kwargs(i) produces.
- loss_divisor must be applied consistently. Forgetting it produces gradients K× too large.
V1: fully implemented at template/curry_train/primitives/grad_accum.py.
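The K× claim can be checked numerically. The sketch below is plain Python with no curry_train dependency. It assumes a one-parameter linear model with mean-squared-error loss and equal-size microbatches; the function names are illustrative only.

```python
from typing import Sequence


def mse_grad(w: float, xs: Sequence[float], ys: Sequence[float]) -> float:
    """Gradient with respect to w of mean((w*x - y)^2) over one batch."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def accumulated_grad(w: float, xs: Sequence[float], ys: Sequence[float],
                     k: int, loss_divisor: float) -> float:
    """Sum the gradients of k equal-size microbatches, each divided by loss_divisor."""
    size = len(xs) // k
    total = 0.0
    for i in range(k):
        lo, hi = i * size, (i + 1) * size
        total += mse_grad(w, xs[lo:hi], ys[lo:hi]) / loss_divisor
    return total


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0, 4.1, 5.9, 8.2, 9.8, 12.1, 14.0, 15.9]
w, k = 0.5, 4

full = mse_grad(w, xs, ys)                                     # one pass over the full batch
scaled = accumulated_grad(w, xs, ys, k, loss_divisor=k)        # loss divided by K
unscaled = accumulated_grad(w, xs, ys, k, loss_divisor=1)      # divisor forgotten
print(full, scaled, unscaled)
```

Up to floating-point rounding, `scaled` matches `full`, while `unscaled` is K times `full`. If microbatches differ in size, a plain 1/K factor no longer reproduces the full-batch mean exactly and the weights would need to reflect each microbatch's sample count.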
- template/curry_train/benchmark.py:run_accumulated_step (uses this primitive).
- template/curry_train/loop.py:run_training_steps (uses this primitive).
- skills/bench (the bench command exercises GA).