Skill

tooluniverse-immune-repertoire-analysis

Analyzes TCR/BCR repertoire sequencing data for clonality, diversity, V(D)J gene usage, CDR3 characteristics, convergence, epitope specificity prediction, and single-cell integration. For immunology and immunotherapy research.

ai-ml

data-engineering

Install

npx claudepluginhub joshuarweaver/cascade-data-analytics --plugin mims-harvard-tooluniverse

Tool Access

This skill uses the workspace's default tool permissions.

Preview

Comprehensive skill for analyzing T-cell receptor (TCR) and B-cell receptor (BCR) repertoire sequencing data to characterize adaptive immune responses, clonal expansion, and antigen specificity.

Supporting Assets

ANALYSIS_DETAILS.mdUSE_CASES.md

SKILL.md

Similar Skills

github-deep-research

2 files

Conducts multi-round deep research on GitHub repos via API and web searches, generating markdown reports with executive summaries, timelines, metrics, and Mermaid diagrams.

bytedance-deer-flow-1

63.9k

surprise-me

Dynamically discovers and combines enabled skills into cohesive, unexpected delightful experiences like interactive HTML or themed artifacts. Activates on 'surprise me', inspiration, or boredom cues.

bytedance-deer-flow-1

63.9k

image-generation

2 files

Generates images from structured JSON prompts via Python script execution. Supports reference images and aspect ratios for characters, scenes, products, visuals.

bytedance-deer-flow-1

63.9k

Stats

Stars1291

Forks199

Last CommitMar 29, 2026

Actions

View Source View Plugin View on GitHub View README

ToolUniverse Immune Repertoire Analysis

Comprehensive skill for analyzing T-cell receptor (TCR) and B-cell receptor (BCR) repertoire sequencing data to characterize adaptive immune responses, clonal expansion, and antigen specificity.

Domain Reasoning

Repertoire diversity reflects immune history. High clonality — a few clones dominating — indicates antigen-driven expansion, as seen in active infection, tumor-infiltrating lymphocytes, or chronic stimulation. Low diversity points to immunodeficiency or treatment-induced lymphopenia. Always compare observed metrics against healthy donor reference distributions before drawing conclusions; a Shannon entropy of 7 is unremarkable in a healthy adult but alarming post-chemotherapy.

LOOK UP DON'T GUESS

Clonotype frequency thresholds, CDR3 length ranges, and convergence ratios: query IEDB or VDJdb; do not assume values from memory.
Epitope specificities for expanded clones: search iedb_search_tcell_assays and BVBRC_search_epitopes; never infer antigen identity from CDR3 alone.
V gene family usage biases in healthy donors: retrieve published reference data or query ImmPort; do not assume baseline distributions are uniform.
Sequencing depth adequacy: compute rarefaction curves from the actual data; do not guess whether depth is sufficient.

Overview

Adaptive immune receptor repertoire sequencing (AIRR-seq) enables comprehensive profiling of T-cell and B-cell populations through high-throughput sequencing of TCR and BCR variable regions. This skill provides an 8-phase workflow for:

Clonotype identification and tracking
Diversity and clonality assessment
V(D)J gene usage analysis
CDR3 sequence characterization
Clonal expansion and convergence detection
Epitope specificity prediction
Integration with single-cell phenotyping
Longitudinal repertoire tracking

Core Workflow

Phase 1: Data Import & Clonotype Definition

Load AIRR-seq data from common formats (MiXCR, ImmunoSEQ, AIRR standard, 10x Genomics VDJ). Standardize columns to: cloneId, count, frequency, cdr3aa, cdr3nt, v_gene, j_gene, chain. Define clonotypes using one of three methods:

cdr3aa: Amino acid CDR3 sequence only
cdr3nt: Nucleotide CDR3 sequence
vj_cdr3: V gene + J gene + CDR3aa (most common, recommended)

Aggregate by clonotype, sort by count, assign ranks.

Phase 2: Diversity & Clonality Analysis

Calculate diversity metrics for the repertoire:

Shannon entropy: Overall diversity (higher = more diverse)
Simpson index: Probability two random clones are same
Inverse Simpson: Effective number of clonotypes
Gini coefficient: Inequality in clonotype distribution
Clonality: 1 - Pielou's evenness (higher = more clonal)
Richness: Number of unique clonotypes

Generate rarefaction curves to assess whether sequencing depth is sufficient.

Phase 3: V(D)J Gene Usage Analysis

Analyze V and J gene usage patterns weighted by clonotype count:

V gene family usage frequencies
J gene family usage frequencies
V-J pairing frequencies
Statistical testing for biased usage (chi-square test vs. uniform expectation)

Phase 4: CDR3 Sequence Analysis

Characterize CDR3 sequences:

Length distribution: Typical TCR CDR3 = 12-18 aa; BCR CDR3 = 10-20 aa
Amino acid composition: Weighted by clonotype frequency
Flag unusual length distributions (may indicate PCR bias)

Phase 5: Clonal Expansion Detection

Identify expanded clonotypes above a frequency threshold (default: 95th percentile). Track clonotypes longitudinally across multiple timepoints to measure persistence, mean/max frequency, and fold changes.

Phase 6: Convergence & Public Clonotypes

Convergent recombination: Same CDR3 amino acid from different nucleotide sequences (evidence of antigen-driven selection)
Public clonotypes: Shared across multiple samples/individuals (may indicate common antigen responses)

Phase 7: Epitope Prediction & Specificity

Query epitope databases for known TCR-epitope associations:

IEDB (iedb_search_tcell_assays): Search T-cell assay records by sequence or MHC class; use iedb_search_epitopes with sequence_contains for motif search
BVBRC (BVBRC_search_epitopes): Best for organism-based epitope discovery (e.g., taxon_id="2697049" for SARS-CoV-2); returns epitope sequences with T-cell/B-cell assay counts
VDJdb (manual): https://vdjdb.cdr3.net/search
PubMed literature (PubMed_search_articles): Search for CDR3 + epitope/antigen/specificity
IEDB detail tools: iedb_get_epitope_antigens (link epitope→antigen), iedb_get_epitope_mhc (MHC restriction)

Phase 8: Integration with Single-Cell Data

Link TCR/BCR clonotypes to cell phenotypes from paired single-cell RNA-seq:

Map clonotypes to cell barcodes
Identify expanded clonotype phenotypes on UMAP
Analyze clonotype-cluster associations (cross-tabulation)
Find cluster-specific clonotypes (>80% cells in one cluster)
Differential gene expression: expanded vs. non-expanded cells

ToolUniverse Tool Integration

Key Tools Used:

iedb_search_tcell_assays - T-cell assay records (sequence, MHC class filters)
iedb_search_bcell - B-cell assay records
iedb_search_epitopes - Epitope motif search via sequence_contains
BVBRC_search_epitopes - Organism-based epitope discovery (best for pathogen-specific queries)
NCBI_SRA_search_runs - Find public TCR/BCR-seq datasets (use strategy="AMPLICON")
ImmPort_search_studies - NIAID immunology studies (vaccine trials, flow cytometry)
PubMed_search_articles - Literature on TCR/BCR specificity
UniProt_get_entry_by_accession - Antigen protein information

Integration with Other Skills:

tooluniverse-single-cell - Single-cell transcriptomics
tooluniverse-rnaseq-deseq2 - Bulk RNA-seq analysis
tooluniverse-variant-analysis - Somatic hypermutation analysis (BCR)

Quick Start

from tooluniverse import ToolUniverse

# 1. Load data
tcr_data = load_airr_data("clonotypes.txt", format='mixcr')

# 2. Define clonotypes
clonotypes = define_clonotypes(tcr_data, method='vj_cdr3')

# 3. Calculate diversity
diversity = calculate_diversity(clonotypes['count'])
print(f"Shannon entropy: {diversity['shannon_entropy']:.2f}")

# 4. Detect expanded clones
expansion = detect_expanded_clones(clonotypes)
print(f"Expanded clonotypes: {expansion['n_expanded']}")

# 5. Analyze V(D)J usage
vdj_usage = analyze_vdj_usage(tcr_data)

# 6. Query epitope databases
top_clones = expansion['expanded_clonotypes']['clonotype'].head(10)
epitopes = query_epitope_database(top_clones)

Reasoning Framework for Result Interpretation

Evidence Grading

Grade	Criteria	Example
Strong	Clonal expansion > 1% frequency, convergent recombination confirmed, epitope match in IEDB/VDJdb	CDR3 at 5% frequency with 3 nucleotide variants encoding same amino acid, IEDB hit
Moderate	Expanded clone (0.1-1%), V(D)J bias significant (chi-sq p < 0.01), partial epitope match	Clone at 0.5% with TRBV20-1 bias, similar CDR3 motif in VDJdb
Weak	Low-frequency expansion (0.01-0.1%), single timepoint only, no epitope database match	Moderately expanded clone without convergence or known specificity
Insufficient	Below detection threshold, sequencing depth < 10,000 clonotypes, no replication	Singleton clonotypes that may be PCR/sequencing artifacts

Interpretation Guidance

Clonality metrics: Shannon diversity measures overall repertoire complexity (higher = more diverse, typical range 5-12 for healthy blood). Gini coefficient ranges from 0 (perfectly even) to 1 (single dominant clone); values > 0.3 suggest clonal expansion. Clonality (1 - Pielou's evenness) > 0.2 indicates moderate clonal dominance; > 0.5 suggests strong oligoclonal expansion (common in active infection or tumor-infiltrating lymphocytes).
V(D)J usage significance: Biased V or J gene usage (chi-square p < 0.01 vs expected uniform distribution) may indicate antigen-driven selection. However, baseline V gene usage is not uniform even in healthy repertoires due to genomic proximity and recombination efficiency. Compare against healthy donor reference distributions rather than uniform expectation when possible.
CDR3 convergence meaning: Convergent recombination (same CDR3 amino acid from different nucleotide sequences) is strong evidence of antigen-driven selection because independent recombination events converged on the same receptor. Public clonotypes (shared across individuals) further strengthen this inference. A convergence ratio > 2 (nucleotide variants per amino acid sequence) for expanded clones is noteworthy.
Sequencing depth: Rarefaction curves that plateau indicate sufficient depth. If the curve is still rising, richness and diversity estimates are underestimates. Minimum recommended depth: 50,000-100,000 total reads for bulk TCR-seq.
Longitudinal tracking: Persistent clones across timepoints with stable or increasing frequency indicate antigen-driven maintenance. Transient expansions that disappear may reflect acute responses.

Synthesis Questions

Does the observed clonal expansion pattern (Gini coefficient, top-clone frequency) match the expected immune context (e.g., post-vaccination expansion, tumor-infiltrating lymphocyte oligoclonality)?
Are convergent CDR3 sequences found across multiple individuals in the cohort, suggesting a public response to a shared antigen?
Do expanded clonotypes show biased V gene usage consistent with known antigen-specific repertoire features (e.g., TRBV20-1 enrichment in CMV-specific responses)?
Is the sequencing depth sufficient (rarefaction plateau reached) to reliably estimate diversity metrics and detect low-frequency expanded clones?
For longitudinal data, do clonal dynamics (expansion, contraction, persistence) correlate with clinical outcomes or treatment response?

References

Dash P, et al. (2017) Quantifiable predictive features define epitope-specific T cell receptor repertoires. Nature
Glanville J, et al. (2017) Identifying specificity groups in the T cell receptor repertoire. Nature
Stubbington MJT, et al. (2016) T cell fate and clonality inference from single-cell transcriptomes. Nature Methods
Vander Heiden JA, et al. (2014) pRESTO: a toolkit for processing high-throughput sequencing raw reads of lymphocyte receptor repertoires. Bioinformatics

tooluniverse-immune-repertoire-analysis

Install

Tool Access

Preview

Supporting Assets

SKILL.md

Similar Skills

tooluniverse-immune-repertoire-analysis

Install

Tool Access

Preview

Supporting Assets

SKILL.md

ToolUniverse Immune Repertoire Analysis

Domain Reasoning

LOOK UP DON'T GUESS

Overview

Core Workflow

Phase 1: Data Import & Clonotype Definition

Phase 2: Diversity & Clonality Analysis

Phase 3: V(D)J Gene Usage Analysis

Phase 4: CDR3 Sequence Analysis

Phase 5: Clonal Expansion Detection

Phase 6: Convergence & Public Clonotypes

Phase 7: Epitope Prediction & Specificity

Phase 8: Integration with Single-Cell Data

ToolUniverse Tool Integration

Quick Start

Reasoning Framework for Result Interpretation

Evidence Grading

Interpretation Guidance

Synthesis Questions

References

See Also

Similar Skills

ToolUniverse Immune Repertoire Analysis

Domain Reasoning

LOOK UP DON'T GUESS

Overview

Core Workflow

Phase 1: Data Import & Clonotype Definition

Phase 2: Diversity & Clonality Analysis

Phase 3: V(D)J Gene Usage Analysis

Phase 4: CDR3 Sequence Analysis

Phase 5: Clonal Expansion Detection

Phase 6: Convergence & Public Clonotypes

Phase 7: Epitope Prediction & Specificity

Phase 8: Integration with Single-Cell Data

ToolUniverse Tool Integration

Quick Start

Reasoning Framework for Result Interpretation

Evidence Grading

Interpretation Guidance

Synthesis Questions

References

See Also