From ai4ss-skills
Diagnoses and optimizes R code performance using profvis and bench. Covers vectorization, preallocation, parallelization with future, and generates performance comparison reports.
Slash command: /ai4ss-skills:r-performance
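```
1. Locate bottleneck → 2. Diagnose cause        → 3. Choose strategy → 4. Apply optimization
   (profvis)              (understand mechanism)     (decision tree)      (code changes)
```
Run with Rscript --vanilla <script.R>; do not rely on objects left over in an interactive session. Always profile first, never guess!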
# Method 1: profvis visualization (recommended)
library(profvis)
profvis({
  # your code
  result <- slow_function(data)
})
# Method 2: precise measurement with bench
library(bench)
bench::mark(
  method_a = function_a(x),
  method_b = function_b(x),
  check = FALSE # only if the results are not exactly identical
)
Interpreting profvis output: see references/profiling.md for detailed usage.
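When neither profvis nor bench is at hand, a rough before/after comparison is still possible with base R. The sketch below is only an illustration: `slow_double()` and `fast_double()` are hypothetical stand-ins for the two implementations being compared, and a single `system.time()` run is much coarser than `bench::mark()`.

```r
# Hypothetical stand-ins for a "before" and an "after" implementation.
slow_double <- function(x) {
  out <- c()
  for (i in seq_along(x)) out <- c(out, x[i] * 2)  # grows the vector each time
  out
}
fast_double <- function(x) x * 2                    # vectorized

x <- runif(1e4)

# Confirm that both versions agree before comparing their speed.
stopifnot(isTRUE(all.equal(slow_double(x), fast_double(x))))

# Elapsed wall-clock seconds for each version (coarse, single run).
timings <- c(
  slow = system.time(slow_double(x))[["elapsed"]],
  fast = system.time(fast_double(x))[["elapsed"]]
)
print(timings)
```

Checking equality first matters: a faster version that returns different results is not an optimization.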
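Once the bottleneck is found, decide which category it falls into:
| Symptom | Likely cause | Remedy |
|---|---|---|
| Heavy memory allocation inside a loop | Growing objects | Preallocate + vectorize |
| A simple operation is slow | Copy-on-modify | Avoid unnecessary copies |
| One CPU core at 100% | Single-threaded bottleneck | Parallelize |
| Many repeated computations | No caching | Memoization |
| Waiting on external I/O | I/O bottleneck | Asynchronous/batched processing |
For details on R's internal mechanisms, see references/r-internals.md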
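The table lists memoization as the remedy for repeated computation. A minimal base-R sketch of the idea follows, assuming the function takes a single scalar argument; `memoize_simple()` and `expensive_square()` are hypothetical names, not functions from any package. The memoise package offers a more complete implementation.

```r
# memoize_simple() is a hypothetical helper: it caches results of a
# one-argument function, keyed by the argument converted to a string.
memoize_simple <- function(f) {
  cache <- new.env(parent = emptyenv())
  function(x) {
    key <- as.character(x)
    if (exists(key, envir = cache, inherits = FALSE)) {
      return(get(key, envir = cache))  # cache hit: skip the real work
    }
    value <- f(x)
    assign(key, value, envir = cache)
    value
  }
}

calls <- 0
expensive_square <- function(x) {
  calls <<- calls + 1  # count how often the real computation runs
  x^2
}

cached_square <- memoize_simple(expensive_square)
results <- vapply(c(3, 3, 4, 3), cached_square, numeric(1))
print(results)  # 9 9 16 9
print(calls)    # 2: only the distinct inputs 3 and 4 were computed
```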
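Code slow?
│
├─ Can it be vectorized?
│   ├─ Yes → replace the loop with vectorized operations
│   └─ No → keep going
│
├─ Are the iterations independent?
│   ├─ Yes → parallelize (future ecosystem)
│   │   ├─ Local multicore → plan(multisession)
│   │   └─ HPC cluster → plan(batchtools_slurm)
│   └─ No → keep going
│
├─ Is there heavy memory allocation?
│   ├─ Yes → preallocate + avoid copy-on-modify
│   └─ No → keep going
│
└─ Is it compute-bound?
    ├─ Yes → consider a better algorithm or an already-optimized package
    └─ No → check I/O and external calls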
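# Strategy: vectorize instead of looping
# Slow ❌
result <- c()
for (i in seq_len(n)) {
  result <- c(result, x[i] * 2) # copies the whole vector on every iteration!
}
# Fast ✓
result <- x * 2 # vectorized operation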
# Strategy: preallocate instead of growing the result
# Slow ❌
result <- c()
for (i in seq_len(n)) {
  result <- c(result, compute(i))
}
# Fast ✓
result <- vector("list", n) # preallocate
for (i in seq_len(n)) {
  result[[i]] <- compute(i)
}
# Or use lapply
result <- lapply(seq_len(n), compute)
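When each call returns a single number, an atomic vector is usually a lighter container than a list. The sketch below shows the same preallocation idea with `numeric(n)` and with `vapply()`; `compute()` here is a hypothetical stand-in, defined only so the example runs.

```r
# compute() is a hypothetical stand-in that returns one number per call.
compute <- function(i) sqrt(i)
n <- 1000

# Preallocate a numeric vector of the final length, then fill it in place.
result_loop <- numeric(n)
for (i in seq_len(n)) {
  result_loop[i] <- compute(i)
}

# vapply() allocates the result once and checks that every value
# is a length-one numeric.
result_vapply <- vapply(seq_len(n), compute, numeric(1))

stopifnot(identical(result_loop, result_vapply))
```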
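# Strategy: avoid copy-on-modify in data frames
# Slow ❌ - every modification makes a copy
for (i in seq_len(n)) {
  df$new_col[i] <- compute(df$x[i])
}
# Fast ✓ - assign the whole column in one step
df$new_col <- sapply(df$x, compute)
# Or
df$new_col <- purrr::map_dbl(df$x, compute)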
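If the per-row work genuinely needs an explicit loop, one option is to run the loop over a plain vector and touch the data frame only once at the end. This is a sketch under that assumption; `df` and `compute()` below are toy stand-ins, not objects from the original text.

```r
# Toy data; df and compute() are hypothetical stand-ins.
df <- data.frame(x = c(1, 4, 9, 16))
compute <- function(v) sqrt(v) + 1

# Pull the column out once, loop over the plain vector,
# and write back to the data frame a single time.
x <- df$x
new_col <- numeric(length(x))
for (i in seq_along(x)) {
  new_col[i] <- compute(x[i])
}
df$new_col <- new_col

print(df$new_col)  # 2 3 4 5
```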
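# Strategy: parallelize independent iterations with future
library(future)
library(future.apply)
# Set up the parallel backend, leaving one core free
# (omit = 1 still returns at least one worker on a single-core machine)
plan(multisession, workers = parallelly::availableCores(omit = 1))
# Parallel lapply
result <- future_lapply(seq_len(n), function(i) {
  slow_computation(i)
})
# Restore sequential execution when done
plan(sequential)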
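Inside a function, one way to make sure the previous plan comes back even when a worker fails is to restore it with `on.exit()`. The sketch below assumes that pattern; `run_in_parallel()` is a hypothetical helper, not part of future or future.apply, and it falls back to plain `lapply()` when those packages are not installed.

```r
# run_in_parallel() is a hypothetical helper, not part of future or future.apply.
run_in_parallel <- function(xs, f, workers = 2) {
  if (!requireNamespace("future.apply", quietly = TRUE)) {
    return(lapply(xs, f))  # sequential fallback when the packages are missing
  }
  # plan() returns the previous plan, so it can be restored on exit,
  # even if f() throws an error part-way through.
  old_plan <- future::plan(future::multisession, workers = workers)
  on.exit(future::plan(old_plan), add = TRUE)
  # future.seed = TRUE gives each element its own reproducible RNG stream.
  future.apply::future_lapply(xs, f, future.seed = TRUE)
}

squares <- run_in_parallel(1:4, function(i) i^2)
print(unlist(squares))  # 1 4 9 16
```

For work this small the cost of starting workers outweighs any gain; the pattern pays off only when each call to `f()` is slow.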
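# Strategy: parallel multiple imputation with mice
library(future)
library(mice)
# Set up the parallel backend
plan(multisession, workers = 4)
# futuremice runs the imputations in parallel through the future framework
imp <- futuremice(data, m = 20, parallelseed = 123)
plan(sequential)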
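After finishing an optimization, always generate an HTML performance comparison report so that researchers can review the changes and show the effect to collaborators.
python3 <skill-dir>/scripts/perf_html.py before.R after.R [perf-summary.json] [output.html]
Inputs:
- before.R: the R code before optimization (a single function, a section of a pipeline, or a complete script)
- after.R: the R code after optimization
- perf-summary.json (optional): a summary of the performance metrics:
{
  "runtime_before": "4.2s",
  "runtime_after": "0.34s",
  "speedup": "12.4",
  "memory_before": "812 MB",
  "memory_after": "94 MB",
  "mem_reduction": "88",
  "note": "Vectorized inner loop; eliminated redundant join"
}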
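The derived fields in the example are consistent with `speedup` being the ratio of the two runtimes and `mem_reduction` being the percentage of memory saved, although the source does not define them. Under that assumption, the summary could be assembled in R roughly like this:

```r
# Assumption: speedup is a runtime ratio and mem_reduction a percentage;
# the skill's documentation does not spell this out.
runtime_before <- 4.2   # seconds
runtime_after  <- 0.34  # seconds
memory_before  <- 812   # MB
memory_after   <- 94    # MB

perf_summary <- list(
  runtime_before = paste0(runtime_before, "s"),
  runtime_after  = paste0(runtime_after, "s"),
  speedup        = sprintf("%.1f", runtime_before / runtime_after),
  memory_before  = paste(memory_before, "MB"),
  memory_after   = paste(memory_after, "MB"),
  mem_reduction  = sprintf("%.0f", 100 * (1 - memory_after / memory_before))
)
str(perf_summary)
```

If jsonlite is installed, `jsonlite::write_json(perf_summary, "perf-summary.json", auto_unbox = TRUE)` is one way to write the list to disk as a JSON object.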
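Output: perf-report.html (default), a single-file HTML with no external dependencies.
When to generate the report: by default, after every optimization. Researchers usually want to send this HTML straight to collaborators, so do not make the user hunt for the script path: run it by default and tell the user where the output was written.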
Consult the reference documents as needed.
npx claudepluginhub siyaozheng/ai4ss-skills --plugin ai4ss-skills