From curry-train
Bidirectional weight conversion between HuggingFace transformers format and curryTrain internal format, including offline fallback when the HF Hub is unreachable. Activate when the user asks "load HF weights", "HuggingFace bridge", "convert weights", "HF Hub unreachable", or "offline weight loading".
```
npx claudepluginhub curryfromuestc/curry-train --plugin curry-train
```

This skill uses the workspace's default tool permissions.
Bidirectional translation between HuggingFace `transformers` weight format and the model's internal layout, plus an offline path for when `huggingface.co` is unreachable.
The offline path accepts a local snapshot directory containing the standard HF files (config.json, model.safetensors, tokenizer files).

```python
from curry_train.primitives import HFBridge

bridge = HFBridge(curry_model_class=MyModel, hf_arch="LlamaForCausalLM")

# Online: pull from the HF Hub
state_dict = bridge.import_from_hf("Qwen/Qwen2.5-1.5B")

# Offline: pull from a local snapshot
state_dict = bridge.import_from_hf("/path/to/local/Qwen2.5-1.5B")

# Export back to HF format
bridge.export_to_hf(model, output_dir="./my-model", tokenizer=tokenizer)
```
When the HF Hub call fails (network error, firewall, or service outage), `HFBridge` should fall back to a clear guided procedure:
1. Detect the failure and categorize it: `NetworkError`, `AuthError`, or `ModelNotFound`.
2. Print a numbered set of recovery steps:

   ```
   HF Hub unreachable. To proceed offline:

   1. From a machine with internet access, download the model:
        huggingface-cli download Qwen/Qwen2.5-1.5B \
          --local-dir ./Qwen2.5-1.5B \
          --local-dir-use-symlinks False
   2. Copy ./Qwen2.5-1.5B to this machine.
   3. Re-run with the local path:
        bridge.import_from_hf("/abs/path/to/Qwen2.5-1.5B")

   For details on the minimum files needed (config.json, *.safetensors,
   tokenizer.json), see skill primitive-hf-bridge.
   ```

3. Do not silently fall back to a partial download or a stale cache. Halt and ask the user.
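The fallback flow above can be sketched as follows. This is an illustrative sketch, not the real `HFBridge` implementation: the exception classes, the `pull` callable, and the help text are all placeholder names chosen here for the example.

```python
# Sketch of the "detect, categorize, halt with guidance" fallback flow.
# All class and function names below are hypothetical.

OFFLINE_HELP = """\
HF Hub unreachable. To proceed offline:
  1. From a machine with internet access, run:
       huggingface-cli download <repo-id> --local-dir ./<repo-name>
  2. Copy the directory to this machine.
  3. Re-run with the local path: bridge.import_from_hf("/abs/path/...")
"""

class BridgeError(Exception):
    """Base class for import failures; `category` drives the message."""
    category = "Unknown"

class NetworkError(BridgeError):
    category = "NetworkError"

class AuthError(BridgeError):
    category = "AuthError"

class ModelNotFound(BridgeError):
    category = "ModelNotFound"

def import_with_fallback(pull, name):
    """Try the online pull; on failure, categorize and halt with guidance.

    Deliberately does NOT retry with a partial download or a stale cache:
    it surfaces the error category and the recovery steps, then stops.
    """
    try:
        return pull(name)
    except BridgeError as err:
        raise SystemExit(f"[{err.category}] {name}\n{OFFLINE_HELP}") from err
```

The key design point is that every failure path ends in an explicit halt with actionable steps, matching the "halt and ask" rule above.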
Minimum:

- `config.json`
- `model.safetensors` (or sharded `model-NNNNN-of-NNNNN.safetensors` + `model.safetensors.index.json`)
- `tokenizer.json` (preferred) or `tokenizer.model` + `tokenizer_config.json`

Optional:

- `generation_config.json`
- `special_tokens_map.json`

The mapping table from HF names to curryTrain internal names lives in `curry_train/models/<name>/checkpoint.py`. It is per-model because HF naming is non-uniform across architectures.
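The minimum file list above can be checked mechanically before attempting an import. A minimal sketch, assuming that list; the helper name `validate_snapshot` is illustrative and not part of `HFBridge`:

```python
from pathlib import Path

def validate_snapshot(path):
    """Return a list of missing required pieces for a local HF snapshot."""
    p = Path(path)
    missing = []
    if not (p / "config.json").is_file():
        missing.append("config.json")
    # Either a single safetensors file, or a sharded set with an index.
    single = (p / "model.safetensors").is_file()
    sharded = (p / "model.safetensors.index.json").is_file() and any(
        p.glob("model-*-of-*.safetensors")
    )
    if not (single or sharded):
        missing.append("model.safetensors (or shards + index)")
    # tokenizer.json preferred; tokenizer.model + tokenizer_config.json also ok.
    fast = (p / "tokenizer.json").is_file()
    slow = (p / "tokenizer.model").is_file() and (p / "tokenizer_config.json").is_file()
    if not (fast or slow):
        missing.append("tokenizer files")
    return missing
```

Running this before the mapping step turns a confusing mid-load failure into a one-line report of what to download.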
Typical Llama-style mapping:

```
# HF name -> curryTrain internal name
"model.embed_tokens.weight" -> "tok_emb.weight"
"model.layers.{i}.self_attn.q_proj.weight" -> "blocks.{i}.attn.q.weight"
"model.layers.{i}.self_attn.k_proj.weight" -> "blocks.{i}.attn.k.weight"
"model.layers.{i}.self_attn.v_proj.weight" -> "blocks.{i}.attn.v.weight"
"model.layers.{i}.self_attn.o_proj.weight" -> "blocks.{i}.attn.o.weight"
"model.layers.{i}.mlp.gate_proj.weight" -> "blocks.{i}.mlp.w1.weight"
"model.layers.{i}.mlp.up_proj.weight" -> "blocks.{i}.mlp.w3.weight"
"model.layers.{i}.mlp.down_proj.weight" -> "blocks.{i}.mlp.w2.weight"
"model.layers.{i}.input_layernorm.weight" -> "blocks.{i}.attn_norm.weight"
"model.layers.{i}.post_attention_layernorm.weight" -> "blocks.{i}.mlp_norm.weight"
"model.norm.weight" -> "final_norm.weight"
"lm_head.weight" -> "output_head.weight"
```
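A table like the one above can be applied to a state dict by expanding the `{i}` layer placeholders. The sketch below is an assumption about how such a mapping could be applied (it is not the actual `checkpoint.py` code, and the table shown is a subset for illustration):

```python
import re

# Subset of a Llama-style mapping table, for illustration only.
MAPPING = {
    "model.embed_tokens.weight": "tok_emb.weight",
    "model.layers.{i}.self_attn.q_proj.weight": "blocks.{i}.attn.q.weight",
    "model.norm.weight": "final_norm.weight",
    "lm_head.weight": "output_head.weight",
}

def translate_key(hf_key):
    """Translate one HF parameter name to the internal name, or None."""
    for pattern, target in MAPPING.items():
        # Turn "{i}" into a capture group for the layer index.
        regex = "^" + re.escape(pattern).replace(r"\{i\}", r"(\d+)") + "$"
        m = re.match(regex, hf_key)
        if m:
            return target.format(i=m.group(1)) if "{i}" in target else target
    return None  # unmapped key: caller should warn or fail loudly

def translate_state_dict(hf_state):
    """Rename all mapped keys; silently drops unmapped ones (caller decides)."""
    return {translate_key(k): v for k, v in hf_state.items() if translate_key(k)}
```

Matching with anchored regexes (rather than string prefix checks) avoids accidental collisions between similarly named parameters across architectures.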
V1: stub at `template/curry_train/primitives/hf_bridge.py`. Reference: HuggingFace `transformers.Auto*` + `safetensors`.
- `skills/new-experiment` — calls the bridge when `--from=<hf-path>` is provided.
- `skills/primitive-distributed-optimizer` — interaction with sharded loading.
- `huggingface-cli download`.