From superpowers
Downloads and queries COSMIC cancer mutation data including somatic mutations, Cancer Gene Census, mutational signatures, and gene fusions for cancer research and precision oncology.
npx claudepluginhub lunartech-x/superpowers --plugin superpowers
How this skill is triggered: by the user, by Claude, or both
Slash command
/superpowers:cosmic
The summary Claude sees in its skill listing, used to decide when to auto-load this skill
COSMIC (Catalogue of Somatic Mutations in Cancer) is the world's largest and most comprehensive database for exploring somatic mutations in human cancer. Access COSMIC's extensive collection of cancer genomics data, including millions of mutations across thousands of cancer types, curated gene lists, mutational signatures, and clinical annotations programmatically.
This skill should be used when:
COSMIC requires authentication for data downloads:
uv pip install requests pandas
Use the scripts/download_cosmic.py script to download COSMIC data files:
from scripts.download_cosmic import download_cosmic_file
# Download mutation data
download_cosmic_file(
    email="your_email@institution.edu",
    password="your_password",
    filepath="GRCh38/cosmic/latest/CosmicMutantExport.tsv.gz",
    output_filename="cosmic_mutations.tsv.gz"
)
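The example above passes the password as a literal string. One way to keep credentials out of source files is to read them from environment variables; the sketch below assumes two hypothetical variable names, COSMIC_EMAIL and COSMIC_PASSWORD, which are not something the download script is known to read on its own.

```python
import os


def load_cosmic_credentials(env=None):
    """Return COSMIC login details as keyword arguments.

    COSMIC_EMAIL and COSMIC_PASSWORD are hypothetical names chosen
    for this sketch; pick whatever fits your environment.
    """
    env = os.environ if env is None else env
    required = ("COSMIC_EMAIL", "COSMIC_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {"email": env["COSMIC_EMAIL"], "password": env["COSMIC_PASSWORD"]}


# Possible usage with the function shown earlier:
# download_cosmic_file(
#     **load_cosmic_credentials(),
#     filepath="GRCh38/cosmic/latest/CosmicMutantExport.tsv.gz",
# )
```

The returned keys match the email and password keyword arguments used in the download examples, so the dictionary can be unpacked directly into the call.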
# Download using shorthand data type
python scripts/download_cosmic.py user@email.com --data-type mutations
# Download specific file
python scripts/download_cosmic.py user@email.com \
--filepath GRCh38/cosmic/latest/cancer_gene_census.csv
# Download for specific genome assembly
python scripts/download_cosmic.py user@email.com \
--data-type gene_census --assembly GRCh37 -o cancer_genes.csv
import pandas as pd
# Read mutation data
mutations = pd.read_csv('cosmic_mutations.tsv.gz', sep='\t', compression='gzip')
# Read Cancer Gene Census
gene_census = pd.read_csv('cancer_gene_census.csv')
# Read VCF format
import pysam
vcf = pysam.VariantFile('CosmicCodingMuts.vcf.gz')
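The full mutation export can be large, so loading it in one call may exhaust memory on smaller machines. A minimal sketch of streaming the file in chunks with pandas and keeping only matching rows follows; the helper name filter_tsv_in_chunks is an illustration, not part of the skill's scripts.

```python
import pandas as pd


def filter_tsv_in_chunks(path, column, value, chunksize=100_000):
    """Stream a TSV (gzip is inferred from a .gz extension) and
    keep only the rows where `column` equals `value`."""
    kept = []
    for chunk in pd.read_csv(path, sep="\t", chunksize=chunksize):
        hits = chunk[chunk[column] == value]
        if not hits.empty:
            kept.append(hits)
    if not kept:
        # No matching rows anywhere in the file.
        return pd.DataFrame()
    return pd.concat(kept, ignore_index=True)


# Possible usage:
# tp53 = filter_tsv_in_chunks("cosmic_mutations.tsv.gz", "Gene name", "TP53")
```

Only the filtered rows are held in memory at once, at the cost of a single sequential pass over the file.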
Download comprehensive mutation data including point mutations, indels, and genomic annotations.
Common data types:
mutations - Complete coding mutations (TSV format)
mutations_vcf - Coding mutations in VCF format
sample_info - Sample metadata and tumor information
# Download all coding mutations
download_cosmic_file(
    email="user@email.com",
    password="password",
    filepath="GRCh38/cosmic/latest/CosmicMutantExport.tsv.gz"
)
Access the expert-curated list of more than 700 cancer genes with substantial evidence of cancer involvement.
# Download Cancer Gene Census
download_cosmic_file(
    email="user@email.com",
    password="password",
    filepath="GRCh38/cosmic/latest/cancer_gene_census.csv"
)
Use cases:
Download signature profiles for mutational signature analysis.
# Download signature definitions
download_cosmic_file(
    email="user@email.com",
    password="password",
    filepath="signatures/signatures.tsv"
)
Signature types:
Access gene fusion data and structural rearrangements.
Available data types:
structural_variants - Structural breakpoints
fusion_genes - Gene fusion events
# Download gene fusions
download_cosmic_file(
    email="user@email.com",
    password="password",
    filepath="GRCh38/cosmic/latest/CosmicFusionExport.tsv.gz"
)
Retrieve copy number alterations and gene expression data.
Available data types:
copy_number - Copy number gains/losses
gene_expression - Over/under-expression data
# Download copy number data
download_cosmic_file(
    email="user@email.com",
    password="password",
    filepath="GRCh38/cosmic/latest/CosmicCompleteCNA.tsv.gz"
)
Access drug resistance mutation data with clinical annotations.
# Download resistance mutations
download_cosmic_file(
    email="user@email.com",
    password="password",
    filepath="GRCh38/cosmic/latest/CosmicResistanceMutations.tsv.gz"
)
COSMIC provides data for two reference genomes:
Specify the assembly in file paths:
# GRCh38 (recommended)
filepath="GRCh38/cosmic/latest/CosmicMutantExport.tsv.gz"
# GRCh37 (legacy)
filepath="GRCh37/cosmic/latest/CosmicMutantExport.tsv.gz"
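The two paths above differ only in the assembly prefix, so the path can be assembled from its parts. The helper below is a hypothetical sketch (build_cosmic_path is not part of the download script) that assumes the assembly/cosmic/release/filename layout seen in these examples.

```python
VALID_ASSEMBLIES = ("GRCh38", "GRCh37")


def build_cosmic_path(filename, assembly="GRCh38", release="latest"):
    """Build a download path of the form <assembly>/cosmic/<release>/<filename>.

    The layout is inferred from the example paths in this document.
    """
    if assembly not in VALID_ASSEMBLIES:
        raise ValueError(f"Unknown assembly {assembly!r}; expected one of {VALID_ASSEMBLIES}")
    return f"{assembly}/cosmic/{release}/{filename}"


# Possible usage:
# build_cosmic_path("CosmicMutantExport.tsv.gz", assembly="GRCh37")
```

Validating the assembly name up front turns a typo such as "GRCH38" into an immediate error instead of a failed download.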
Use latest in file paths to always get the most recent release
Specific releases are named v102, v101, etc.
Filter mutations by gene:
import pandas as pd
mutations = pd.read_csv('cosmic_mutations.tsv.gz', sep='\t', compression='gzip')
tp53_mutations = mutations[mutations['Gene name'] == 'TP53']
Identify cancer genes by role:
gene_census = pd.read_csv('cancer_gene_census.csv')
oncogenes = gene_census[gene_census['Role in Cancer'].str.contains('oncogene', na=False)]
tumor_suppressors = gene_census[gene_census['Role in Cancer'].str.contains('TSG', na=False)]
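Because the two filters above are substring matches, a gene annotated with both roles appears in both results. A small sketch of separating those dual-role genes is shown below on a made-up table; the gene names and the "Gene Symbol" column are placeholders for illustration, and only the "Role in Cancer" column is taken from the example above.

```python
import pandas as pd

# Toy stand-in for the census table; values are invented for illustration.
census = pd.DataFrame({
    "Gene Symbol": ["GENE_A", "GENE_B", "GENE_C", "GENE_D"],
    "Role in Cancer": ["oncogene", "TSG", "oncogene, TSG, fusion", None],
})

role = census["Role in Cancer"]
is_oncogene = role.str.contains("oncogene", na=False)
is_tsg = role.str.contains("TSG", na=False)

# Assign one label per gene; the last assignment wins for dual-role genes.
census["role_class"] = "other/unknown"
census.loc[is_oncogene, "role_class"] = "oncogene"
census.loc[is_tsg, "role_class"] = "TSG"
census.loc[is_oncogene & is_tsg, "role_class"] = "both"

print(census[["Gene Symbol", "role_class"]])
```

Keeping a single role_class column avoids double-counting genes when summarizing by role.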
Extract mutations by cancer type:
mutations = pd.read_csv('cosmic_mutations.tsv.gz', sep='\t', compression='gzip')
lung_mutations = mutations[mutations['Primary site'] == 'lung']
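Once a tissue subset is selected, a common next step is ranking genes by how often they appear in it. The sketch below uses a small invented table with the same two column names as the examples above; the rows are not real COSMIC data.

```python
import pandas as pd

# Toy stand-in for the mutation export; rows are invented for illustration.
mutations = pd.DataFrame({
    "Gene name": ["TP53", "TP53", "KRAS", "EGFR", "TP53"],
    "Primary site": ["lung", "lung", "lung", "lung", "breast"],
})

lung = mutations[mutations["Primary site"] == "lung"]

# Count rows per gene and keep the ten most frequent.
top_genes = lung["Gene name"].value_counts().head(10)
print(top_genes)
```

These counts are rows in the table, not unique samples or unique variants, so they should be deduplicated on the appropriate identifier columns before being read as frequencies.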
Work with VCF files:
import pysam
vcf = pysam.VariantFile('CosmicCodingMuts.vcf.gz')
# fetch() on a region needs a tabix index (.tbi) next to the VCF file
for record in vcf.fetch('17', 7577000, 7579000):  # part of TP53 (GRCh37 coordinates)
    print(record.id, record.ref, record.alts, record.info)
For comprehensive information about COSMIC data structure, available files, and field descriptions, see references/cosmic_data_reference.md. This reference includes:
Use this reference when:
The download script includes helper functions for common operations:
from scripts.download_cosmic import get_common_file_path
# Get path for mutations file
path = get_common_file_path('mutations', genome_assembly='GRCh38')
# Returns: 'GRCh38/cosmic/latest/CosmicMutantExport.tsv.gz'
# Get path for gene census
path = get_common_file_path('gene_census')
# Returns: 'GRCh38/cosmic/latest/cancer_gene_census.csv'
Available shortcuts:
mutations - Core coding mutations
mutations_vcf - VCF format mutations
gene_census - Cancer Gene Census
resistance_mutations - Drug resistance data
structural_variants - Structural variants
gene_expression - Expression data
copy_number - Copy number alterations
fusion_genes - Gene fusions
signatures - Mutational signatures
sample_info - Sample metadata
Use latest for the most recent version
COSMIC data integrates well with:
When using COSMIC data, cite: Tate JG, Bamford S, Jubb HC, et al. COSMIC: the Catalogue Of Somatic Mutations In Cancer. Nucleic Acids Research. 2019;47(D1):D941-D947.