Reconstructs 3D scenes from multi-view images and videos, and generates worlds from text or single images, using Tencent's HY-World 2.0 Python pipelines (WorldMirror).
> Skill by [ara.so](https://ara.so) — Daily 2026 Skills collection.
HY-World 2.0 is a multi-modal world model by Tencent Hunyuan that reconstructs, generates, and simulates 3D worlds. It accepts text, single-view images, multi-view images, and videos as input and produces 3D representations (meshes, 3D Gaussian splats, point clouds). Its two core capabilities are 3D reconstruction from multi-view images and videos (WorldMirror 2.0) and world generation from text or single images.

To install from source:
# 1. Clone repository
git clone https://github.com/Tencent-Hunyuan/HY-World-2.0
cd HY-World-2.0
# 2. Create conda environment
conda create -n hyworld2 python=3.10
conda activate hyworld2
# 3. Install PyTorch with CUDA 12.4
pip install torch==2.4.0 torchvision==0.19.0 --index-url https://download.pytorch.org/whl/cu124
# 4. Install project dependencies
pip install -r requirements.txt
# 5a. Install FlashAttention-3 (recommended for performance)
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention/hopper
python setup.py install
cd ../../
rm -rf flash-attention
# 5b. OR install FlashAttention-2 (simpler)
pip install flash-attn --no-build-isolation
Model weights are automatically downloaded from Hugging Face on first run. Alternatively, download manually:
| Model | HuggingFace |
|---|---|
| WorldMirror 2.0 | tencent/HY-World-2.0 → HY-WorldMirror-2.0 |
| WorldMirror 1.0 (legacy) | tencent/HunyuanWorld-Mirror |
To pre-download:
# Set HuggingFace cache directory if needed
export HF_HOME=/path/to/cache
pip install huggingface_hub
python -c "from huggingface_hub import snapshot_download; snapshot_download('tencent/HY-World-2.0')"
Basic usage, reconstructing a scene from a folder of images:
from hyworld2.worldrecon.pipeline import WorldMirrorPipeline
# Load pipeline — weights auto-downloaded on first run
pipeline = WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0')
# Run reconstruction from a folder of images
result = pipeline('path/to/images')
Provide known camera parameters or depth priors to improve accuracy:
from hyworld2.worldrecon.pipeline import WorldMirrorPipeline
pipeline = WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0')
result = pipeline(
'path/to/images',
prior_cam_path='path/to/prior_camera.json',
prior_depth_path='path/to/prior_depth.npy', # optional
)
The prior_camera.json format expected by the pipeline:
[
{
"image": "frame_001.jpg",
"fx": 800.0,
"fy": 800.0,
"cx": 640.0,
"cy": 360.0,
"width": 1280,
"height": 720,
"c2w": [
[1.0, 0.0, 0.0, 0.0],
[0.0, 1.0, 0.0, 0.0],
[0.0, 0.0, 1.0, 0.0],
[0.0, 0.0, 0.0, 1.0]
]
}
]
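If you have calibrated intrinsics and poses (e.g., from COLMAP), you can write this file programmatically. A minimal sketch, assuming the field names above and 4x4 camera-to-world matrices as NumPy arrays; write_prior_cameras is a hypothetical helper, not part of the repo:

import json
import numpy as np

def write_prior_cameras(entries, out_path="prior_camera.json"):
    """entries: (image_name, fx, fy, cx, cy, width, height, c2w) tuples,
    where c2w is a 4x4 camera-to-world matrix."""
    records = []
    for name, fx, fy, cx, cy, w, h, c2w in entries:
        records.append({
            "image": name,
            "fx": fx, "fy": fy, "cx": cx, "cy": cy,
            "width": w, "height": h,
            "c2w": np.asarray(c2w, dtype=float).tolist(),  # JSON needs nested lists
        })
    with open(out_path, "w") as f:
        json.dump(records, f, indent=2)

# One frame with an identity pose
write_prior_cameras([("frame_001.jpg", 800.0, 800.0, 640.0, 360.0, 1280, 720, np.eye(4))])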
The pipeline returns a result object with the following attributes:
result = pipeline('path/to/images')
# Access outputs
point_cloud = result.point_cloud # 3D point cloud (numpy or torch)
depth_maps = result.depth_maps # Per-image depth maps
normals = result.normals # Surface normal maps
cameras = result.cameras # Predicted camera parameters
gaussians = result.gaussians # 3DGS attributes
# Save outputs
result.save('output_dir/') # Saves all outputs to directory
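As a quick sanity check, you can dump the first depth map as a normalized grayscale PNG. A sketch reusing the result object above; the exact array types are an assumption (a CUDA tensor would need .cpu() first):

import numpy as np
from PIL import Image

depth = np.asarray(result.depth_maps[0], dtype=np.float32)
# Normalize to [0, 255] for visualization only (this is not metric depth)
d_min, d_max = float(depth.min()), float(depth.max())
norm = (depth - d_min) / max(d_max - d_min, 1e-8)
Image.fromarray((norm * 255).astype(np.uint8)).save("depth_000.png")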
Launch an interactive web UI for 3D reconstruction:
# From project root
python -m hyworld2.worldrecon.app
# Or if a dedicated script exists
python app.py --model tencent/HY-World-2.0
Access at http://localhost:7860 by default.
Extract frames from a video, then run reconstruction:
import cv2
import os
from hyworld2.worldrecon.pipeline import WorldMirrorPipeline
def extract_frames(video_path, output_dir, fps=2):
os.makedirs(output_dir, exist_ok=True)
cap = cv2.VideoCapture(video_path)
video_fps = cap.get(cv2.CAP_PROP_FPS)
    frame_interval = max(1, int(video_fps / fps))  # avoid a zero interval when fps >= video_fps
frame_idx = 0
saved = 0
while cap.isOpened():
ret, frame = cap.read()
if not ret:
break
if frame_idx % frame_interval == 0:
cv2.imwrite(f"{output_dir}/frame_{saved:04d}.jpg", frame)
saved += 1
frame_idx += 1
cap.release()
return output_dir
# Extract frames at 2 fps
frames_dir = extract_frames("scene.mp4", "frames/", fps=2)
# Run reconstruction
pipeline = WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0')
result = pipeline(frames_dir)
result.save("output_3d/")
WorldMirror 2.0 supports input resolutions of roughly 50K–500K total pixels per image. Control this via the resolution parameter, which resizes the shorter image edge:
from hyworld2.worldrecon.pipeline import WorldMirrorPipeline
pipeline = WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0')
# Low resolution (fast, lower memory)
result_fast = pipeline(
'path/to/images',
resolution=512, # resize shorter edge to 512
)
# High resolution (slower, more detail)
result_hq = pipeline(
'path/to/images',
resolution=1024,
)
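To stay inside that pixel budget for arbitrary inputs, you can derive the shorter-edge size from a target pixel count. A small helper sketch, reusing the pipeline above; the rounding to multiples of 64 is an assumption, not a documented requirement:

def shorter_edge_for_budget(width, height, target_pixels=400_000):
    """Shorter-edge size so width*height scales to roughly target_pixels."""
    scale = (target_pixels / (width * height)) ** 0.5
    shorter = min(width, height) * scale
    return max(64, int(round(shorter / 64)) * 64)  # keep dimensions model-friendly

# e.g. 1920x1080 frames, budgeted near 400K pixels
result = pipeline('path/to/images', resolution=shorter_edge_for_budget(1920, 1080))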
Batch-process multiple scene folders with a simple loop:
from pathlib import Path
from hyworld2.worldrecon.pipeline import WorldMirrorPipeline
pipeline = WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0')
scenes_root = Path("scenes/")
output_root = Path("outputs/")
for scene_dir in sorted(scenes_root.iterdir()):
if not scene_dir.is_dir():
continue
out_dir = output_root / scene_dir.name
out_dir.mkdir(parents=True, exist_ok=True)
print(f"Processing: {scene_dir.name}")
try:
result = pipeline(str(scene_dir))
result.save(str(out_dir))
print(f" Saved to {out_dir}")
except Exception as e:
print(f" Failed: {e}")
After reconstruction, export to formats compatible with Blender / Unity / Unreal:
from hyworld2.worldrecon.pipeline import WorldMirrorPipeline
pipeline = WorldMirrorPipeline.from_pretrained('tencent/HY-World-2.0')
result = pipeline('path/to/images')
# Save 3DGS (.ply format for tools like 3D Gaussian Splatting viewer)
result.save_gaussians("scene.ply")
# Save mesh (if mesh export is supported)
result.save_mesh("scene.obj") # or scene.glb
# Save point cloud
result.save_pointcloud("scene_pointcloud.ply")
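To verify an export, load the .ply back with a point-cloud library. A sketch with Open3D (pip install open3d); trimesh works similarly for meshes:

import open3d as o3d

pcd = o3d.io.read_point_cloud("scene_pointcloud.ply")
print(f"Loaded {len(pcd.points)} points")
o3d.visualization.draw_geometries([pcd])  # opens an interactive viewer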
For large scenes or limited VRAM:
import torch
from hyworld2.worldrecon.pipeline import WorldMirrorPipeline
# Load in fp16 to reduce memory
pipeline = WorldMirrorPipeline.from_pretrained(
'tencent/HY-World-2.0',
torch_dtype=torch.float16,
)
pipeline = pipeline.to('cuda')
# Run with lower resolution to fit in memory
result = pipeline('path/to/images', resolution=768)
# Free memory after use
del result
torch.cuda.empty_cache()
Repository layout:
HY-World-2.0/
├── hyworld2/
│ ├── worldrecon/ # WorldMirror 2.0 reconstruction
│ │ ├── pipeline.py # Main WorldMirrorPipeline class
│ │ ├── app.py # Gradio web app
│ │ └── ...
│ ├── worldgen/ # World generation (coming soon)
│ │ ├── panorama/ # HY-Pano 2.0
│ │ ├── nav/ # WorldNav trajectory planning
│ │ └── stereo/ # WorldStereo 2.0
│ └── utils/
├── assets/ # Demo assets
├── requirements.txt
└── README.md
Useful environment variables:
# HuggingFace model cache location
export HF_HOME=/path/to/hf/cache
# HuggingFace token (if accessing private/gated models)
export HUGGING_FACE_HUB_TOKEN=your_token_here
# CUDA device selection
export CUDA_VISIBLE_DEVICES=0
# For multi-GPU setups
export CUDA_VISIBLE_DEVICES=0,1
If FlashAttention-3 fails to build:
# Use FlashAttention-2 as fallback
pip install flash-attn --no-build-isolation
# If that fails, disable flash attention (slower but works)
# Set environment variable before running
export USE_FLASH_ATTENTION=0
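To confirm which FlashAttention build is actually importable in the current environment, a quick probe (whether the pipeline honors USE_FLASH_ATTENTION is repo-specific):

try:
    import flash_attn
    print("flash-attn available:", flash_attn.__version__)
except ImportError:
    print("flash-attn not installed; expect a fallback to standard attention")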
If you hit CUDA out-of-memory errors:
# 1. Reduce resolution
result = pipeline('path/to/images', resolution=512)
# 2. Use fp16
pipeline = WorldMirrorPipeline.from_pretrained(
'tencent/HY-World-2.0',
torch_dtype=torch.float16
)
# 3. Process fewer images at once (use a subset; see the sketch below)
import os
images = sorted(os.listdir('path/to/images'))[:10] # limit to 10 frames
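Since the pipeline takes a directory, one way to run on that subset is to stage the selected frames in a temporary folder. A sketch (the 10-frame cap is arbitrary):

import os
import shutil
import tempfile

src = 'path/to/images'
subset = sorted(os.listdir(src))[:10]  # limit to 10 frames
tmp = tempfile.mkdtemp(prefix="hyworld_subset_")
for name in subset:
    shutil.copy(os.path.join(src, name), tmp)  # stage selected frames
result = pipeline(tmp, resolution=512)
shutil.rmtree(tmp)  # clean up the staged copies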
If model downloads fail or are slow:
# Use an HF mirror if huggingface.co is blocked
export HF_ENDPOINT=https://hf-mirror.com
# Or manually download and point to local path
pipeline = WorldMirrorPipeline.from_pretrained('/local/path/to/model')
If you see a CUDA/PyTorch version mismatch:
# Verify versions match
python -c "import torch; print(torch.__version__, torch.version.cuda)"
# Should output: 2.4.0 12.4
# Reinstall if mismatch
pip install torch==2.4.0 torchvision==0.19.0 --index-url https://download.pytorch.org/whl/cu124
If the pipeline errors on certain inputs, validate the images first:
# Ensure images are valid and in supported formats (.jpg, .png)
from PIL import Image
import os
img_dir = 'path/to/images'
for f in os.listdir(img_dir):
try:
img = Image.open(os.path.join(img_dir, f))
img.verify()
except Exception as e:
print(f"Bad image {f}: {e}")
Related projects:
| Project | Use Case | Link |
|---|---|---|
| WorldStereo | Panorama → 3DGS (open-source preview of WorldStereo-2) | GitHub |
| HunyuanWorld 1.0 | Panorama generation (interim for HY-Pano 2.0) | GitHub |
| WorldMirror 1.0 | Legacy reconstruction model | HuggingFace |