Skill

ksim-rl

From asi

Provides RL training for humanoid locomotion and manipulation using PPO and AMP tasks built on MuJoCo and JAX. Enables sim-to-real robotics policy development.

Python

ai-ml

npx claudepluginhub plurigrid/asi --plugin asi

Tool Access

This skill uses the workspace's default tool permissions.

Stats

Parent Repo Stars: 16

Parent Repo Forks: 5

Last Commit: Feb 16, 2026

KSIM-RL Skill

Trit: -1 (MINUS - analysis/verification)
Color: #3A2F9E (Deep Purple)
URI: skill://ksim-rl#3A2F9E

Overview

KSIM is K-Scale Labs' reinforcement learning library for humanoid robot locomotion and manipulation. It is built on MuJoCo for physics simulation and JAX for hardware-accelerated training.

Core Architecture

┌─────────────────────────────────────────────────────────────────┐
│                        KSIM ARCHITECTURE                        │
├─────────────────────────────────────────────────────────────────┤
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐  │
│  │  RLTask     │  │  PPOTask    │  │  AMPTask                │  │
│  │  (abstract) │──│  (PPO impl) │──│  (Adversarial Motion)   │  │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘  │
│         │                                                        │
│         ▼                                                        │
│  ┌─────────────────────────────────────────────────────────────┐ │
│  │                    PhysicsEngine                             │ │
│  │  ┌───────────────┐  ┌───────────────────────────────┐       │ │
│  │  │ MujocoEngine  │  │ MjxEngine (JAX-accelerated)   │       │ │
│  │  └───────────────┘  └───────────────────────────────┘       │ │
│  └─────────────────────────────────────────────────────────────┘ │
│         │                                                        │
│         ▼                                                        │
│  ┌─────────────────────────────────────────────────────────────┐ │
│  │  Environment Components                                      │ │
│  │  • Actuators: Position, Velocity, Torque control            │ │
│  │  • Observations: Joint states, IMU, local view              │ │
│  │  • Rewards: Velocity tracking, gait, energy, stability      │ │
│  │  • Terminations: Fall detection, boundary violations        │ │
│  └─────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
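
Position control, listed under Actuators in the diagram, is commonly realized as a PD law that converts a target joint angle into a torque. The following is a minimal sketch of that standard law, not KSIM's implementation: the function name `pd_torque` and the optional `torque_limit` argument are hypothetical, and the gains are illustrative values.

```python
def pd_torque(q_target, q, qd, kp, kd, torque_limit=None):
    """Standard PD law: pull position q toward q_target and damp velocity qd."""
    tau = kp * (q_target - q) - kd * qd
    if torque_limit is not None:
        # Hypothetical saturation step; real actuators have finite torque.
        tau = max(-torque_limit, min(torque_limit, tau))
    return tau


# Joint at rest, 0.1 rad short of its target: only the proportional term acts.
tau_rest = pd_torque(q_target=0.1, q=0.0, qd=0.0, kp=100.0, kd=10.0)

# Joint already at its target but still moving: only the damping term acts.
tau_moving = pd_torque(q_target=0.0, q=0.0, qd=1.0, kp=100.0, kd=10.0)
```

Raising `kp` stiffens tracking and raising `kd` adds damping; a policy that outputs position targets relies on this inner loop to produce the actual joint torques.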

Key Features

JAX-Accelerated: Uses MJX for parallel environment simulation on GPU/TPU
PPO Training: Proximal Policy Optimization with configurable hyperparameters
AMP Support: Adversarial Motion Priors for realistic humanoid locomotion
Modular Rewards: Composable reward functions for gait, velocity, energy
Domain Randomization: Built-in randomizers for sim-to-real transfer
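
For readers unfamiliar with PPO, its core is the clipped surrogate objective. The sketch below writes it in plain Python for a small batch. It is not KSIM's code: the function name `ppo_clip_loss` is hypothetical, and `clip_eps=0.2` is a commonly used default rather than a value taken from KSIM.

```python
import math


def ppo_clip_loss(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean clipped surrogate loss (to be minimized) over a batch of samples."""
    total = 0.0
    for lp_new, lp_old, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(lp_new - lp_old)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Take the more pessimistic of the unclipped and clipped objectives.
        total += min(ratio * adv, clipped * adv)
    return -total / len(advantages)


# With an unchanged policy the ratio is 1, so the loss is minus the mean advantage.
loss_same_policy = ppo_clip_loss([0.0, 0.0], [0.0, 0.0], [1.0, 3.0])
```

The clip removes the incentive to move the ratio beyond `1 ± clip_eps` in the direction the advantage favors, which keeps each policy update close to the policy that collected the data.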

API Usage

import ksim
from ksim import PPOTask

# Define custom task
class KBotWalkingTask(PPOTask):
    model_path = "kbot.mjcf"
    
    # Observations
    observations = [
        ksim.JointPosition(),
        ksim.JointVelocity(),
        ksim.IMUAngularVelocity(),
        ksim.BaseOrientation(),
    ]
    
    # Rewards
    rewards = [
        ksim.LinearVelocityReward(scale=1.0),
        ksim.GaitPhaseReward(scale=0.5),
        ksim.EnergyPenalty(scale=-0.01),
    ]
    
    # Actuators
    actuators = [
        ksim.PositionActuator(
            joint_name=".*",
            kp=100.0,
            kd=10.0,
            action_scale=0.5,
        )
    ]

# Train with 4096 parallel environments
task = KBotWalkingTask()
task.run_training(
    num_envs=4096,
    num_steps=1_000_000,
    learning_rate=3e-4,
)
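
The `scale` arguments on the reward terms above suggest that the total reward is a weighted sum of individual terms. The sketch below illustrates that idea in plain Python. It assumes a weighted sum (how KSIM combines terms internally is not stated here), and the names `total_reward`, `velocity_tracking`, and `energy`, along with the dictionary-based state, are hypothetical.

```python
import math


def velocity_tracking(state):
    # 1.0 when base velocity matches the command, decaying with squared error.
    err = state["velocity"] - state["command"]
    return math.exp(-err * err)


def energy(state):
    # Sum of squared joint torques as a simple proxy for energy use.
    return sum(t * t for t in state["torques"])


def total_reward(terms, state):
    """Weighted sum: each term's value multiplied by its scale."""
    return sum(scale * fn(state) for fn, scale in terms)


# A negative scale turns the energy term into a penalty.
terms = [(velocity_tracking, 1.0), (energy, -0.01)]
state = {"velocity": 1.0, "command": 1.0, "torques": [1.0, 2.0]}
reward = total_reward(terms, state)
```

Keeping each term as a separate function with its own scale makes it easy to add, remove, or reweight terms while tuning a gait.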

GF(3) Triads

This skill participates in balanced triads:

ksim-rl (-1) ⊗ kos-firmware (+1) ⊗ mujoco-scenes (0) = 0 ✓
ksim-rl (-1) ⊗ kos-firmware (+1) ⊗ urdf2mjcf (-1) = -1 (needs balancing)
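
The balance check can be expressed in a few lines, reading "balanced" as the trits summing to zero modulo 3. That reading is an assumption, since the ⊗ operation is not defined here. The helper names are hypothetical, and the trit values are the ones given in the Related Skills list.

```python
# Trit assignments as listed under Related Skills.
TRITS = {
    "ksim-rl": -1,
    "kos-firmware": +1,
    "mujoco-scenes": 0,
    "urdf2mjcf": -1,
}


def is_balanced(trits):
    """A triad is balanced when its trits sum to 0 modulo 3."""
    return sum(trits) % 3 == 0


def triad_balanced(*skills):
    return is_balanced([TRITS[name] for name in skills])


first = triad_balanced("ksim-rl", "kos-firmware", "mujoco-scenes")
second = triad_balanced("ksim-rl", "kos-firmware", "urdf2mjcf")
```

Under these assumptions `first` is true and `second` is false, since the second triad sums to -1.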

Key Contributors

codekansas (Ben Bolte): Core architecture, PPO, rewards
b-vm: Randomizers, disturbances, policy training
carlosdp: Adaptive KL, action scaling
WT-MM: Visualization, markers

Related Skills

kos-firmware (+1): Robot firmware and gRPC services
mujoco-scenes (0): Scene composition for MuJoCo
evla-vla (-1): Vision-language-action models
urdf2mjcf (-1): URDF to MJCF conversion
ktune-sim2real (-1): Servo tuning for sim2real

References

@misc{ksim2024,
  title={K-Sim: RL Training for Humanoid Locomotion},
  author={K-Scale Labs},
  year={2024},
  url={https://github.com/kscalelabs/ksim}
}