Extract and decompress must-gather archives from Prow CI job artifacts, generating an interactive HTML file browser with filters
Install via:

- `/plugin marketplace add openshift-eng/ai-helpers`
- `/plugin install prow-job@ai-helpers`

This skill inherits all available tools. When active, it can use any tool Claude has access to.
Skill files: `CHANGELOG.md`, `README.md`, `extract_archives.py`, `generate_html_report.py`

This skill extracts and decompresses must-gather archives from Prow CI job artifacts, automatically handling nested tar and gzip archives, and generating an interactive HTML file browser.
Use this skill when the user wants to:

- Extract and decompress a must-gather archive from a Prow CI job
- Browse or search the extracted files through an interactive HTML report
Before starting, verify these prerequisites:
1. **gcloud CLI installation**: verify with `which gcloud`
2. **gcloud authentication (optional)**: the `test-platform-results` bucket is publicly accessible

The user will provide:
A gcsweb URL containing `test-platform-results/`, for example:

`https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.20-e2e-aws-ovn-techpreview/1965715986610917376/`

1. **Extract bucket path**: everything after `test-platform-results/` in the URL
2. **Extract build_id**: match `/(\d{10,})/` in the bucket path
3. **Extract prowjob name**: the path segment before the build ID, e.g. `.../periodic-ci-openshift-release-master-ci-4.20-e2e-aws-ovn-techpreview/1965715986610917376/` yields `periodic-ci-openshift-release-master-ci-4.20-e2e-aws-ovn-techpreview`
4. **Construct GCS paths**: prefix the bucket path with the `test-platform-results` bucket: `gs://test-platform-results/{bucket-path}/`
5. **Check for existing extraction first**: if the `.work/prow-job-extract-must-gather/{build_id}/logs/` directory exists and has content, ask whether to reuse it or clean up with `rm -rf .work/prow-job-extract-must-gather/{build_id}/logs/` and `rm -rf .work/prow-job-extract-must-gather/{build_id}/tmp/`
6. **Create directory structure**:
mkdir -p .work/prow-job-extract-must-gather/{build_id}/logs
mkdir -p .work/prow-job-extract-must-gather/{build_id}/tmp
All work happens under:

- `.work/prow-job-extract-must-gather/` as the base directory (already in `.gitignore`)
- `logs/` subdirectory for extraction
- `tmp/` subdirectory for temporary files
- everything for one job under `.work/prow-job-extract-must-gather/{build_id}/`

**Download prowjob.json**
gcloud storage cp gs://test-platform-results/{bucket-path}/prowjob.json .work/prow-job-extract-must-gather/{build_id}/tmp/prowjob.json --no-user-output-enabled
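The URL-parsing steps above (bucket path, build ID, prowjob name) can be sketched in Python. This is a minimal illustration of the stated rules, not the skill's actual implementation:

```python
import re

def parse_gcsweb_url(url: str):
    """Derive bucket path, build ID, and prowjob name from a gcsweb URL."""
    # Bucket path: everything after "test-platform-results/"
    marker = "test-platform-results/"
    bucket_path = url.split(marker, 1)[1].strip("/")
    # Build ID: a run of 10+ digits in its own path segment
    build_id = re.search(r"/(\d{10,})(?:/|$)", "/" + bucket_path + "/").group(1)
    # Prowjob name: the path segment immediately before the build ID
    segments = bucket_path.split("/")
    prowjob_name = segments[segments.index(build_id) - 1]
    return bucket_path, build_id, prowjob_name

url = ("https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/"
       "logs/periodic-ci-openshift-release-master-ci-4.20-e2e-aws-ovn-techpreview/"
       "1965715986610917376/")
bucket_path, build_id, prowjob_name = parse_gcsweb_url(url)
```

For the example URL, this yields the build ID `1965715986610917376` and the prowjob name `periodic-ci-openshift-release-master-ci-4.20-e2e-aws-ovn-techpreview`.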
**Parse and validate**

Read `.work/prow-job-extract-must-gather/{build_id}/tmp/prowjob.json` and search for `--target=([a-zA-Z0-9-]+)`.

**Extract target name**

Capture the target from the match (e.g. `e2e-aws-ovn-techpreview`).

**Construct must-gather path**

- Source: `gs://test-platform-results/{bucket-path}/artifacts/{target}/gather-must-gather/artifacts/must-gather.tar`
- Destination: `.work/prow-job-extract-must-gather/{build_id}/tmp/must-gather.tar`

**Download must-gather.tar**
gcloud storage cp gs://test-platform-results/{bucket-path}/artifacts/{target}/gather-must-gather/artifacts/must-gather.tar .work/prow-job-extract-must-gather/{build_id}/tmp/must-gather.tar --no-user-output-enabled
Use `--no-user-output-enabled` to suppress progress output.

IMPORTANT: Use the provided Python script `extract_archives.py` from the skill directory.
Usage:
python3 plugins/prow-job/skills/prow-job-extract-must-gather/extract_archives.py \
.work/prow-job-extract-must-gather/{build_id}/tmp/must-gather.tar \
.work/prow-job-extract-must-gather/{build_id}/logs
What the script does:
1. **Extract must-gather.tar** into the `{build_id}/logs/` directory
2. **Rename long subdirectory to `content/`**, e.g. `registry-build09-ci-openshift-org-ci-op-m8t77165-stable-sha256-d1ae126eed86a47fdbc8db0ad176bf078a5edebdbb0df180d73f02e5f03779e0/` becomes `content/`
3. **Recursively process nested archives**
For .tar.gz and .tgz files:
# Extract in place
with tarfile.open(archive_path, 'r:gz') as tar:
tar.extractall(path=parent_dir)
# Remove original archive
os.remove(archive_path)
For .gz files (no tar):
# Gunzip in place
with gzip.open(gz_path, 'rb') as f_in:
with open(output_path, 'wb') as f_out:
shutil.copyfileobj(f_in, f_out)
# Remove original archive
os.remove(gz_path)
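One way to combine these handlers into the recursive pass is sketched below. This mirrors the fragments above but is an illustration only; the actual internals of `extract_archives.py` may differ:

```python
import gzip
import os
import shutil
import tarfile

def process_nested_archives(root: str) -> int:
    """Repeatedly walk `root`, expanding .tar.gz/.tgz/.gz files until none remain."""
    extracted = 0
    changed = True
    while changed:  # re-walk: each extraction may reveal new nested archives
        changed = False
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                if name.endswith((".tar.gz", ".tgz")):
                    # Extract tarball in place, then remove the original archive
                    with tarfile.open(path, "r:gz") as tar:
                        tar.extractall(path=dirpath)
                    os.remove(path)
                elif name.endswith(".gz"):
                    # Gunzip in place (strip the ".gz" suffix), then remove the original
                    out = path[:-3]
                    with gzip.open(path, "rb") as f_in, open(out, "wb") as f_out:
                        shutil.copyfileobj(f_in, f_out)
                    os.remove(path)
                else:
                    continue
                extracted += 1
                changed = True
    return extracted
```

The outer `while` loop re-walks the tree until a full pass finds no archives, which handles arbitrarily nested cases such as a `.gz` inside a `.tar.gz`.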
Progress reporting
Error handling
IMPORTANT: Use the provided Python script generate_html_report.py from the skill directory.
Usage:
python3 plugins/prow-job/skills/prow-job-extract-must-gather/generate_html_report.py \
.work/prow-job-extract-must-gather/{build_id}/logs \
"{prowjob_name}" \
"{build_id}" \
"{target}" \
"{gcsweb_url}"
Output: The script generates .work/prow-job-extract-must-gather/{build_id}/must-gather-browser.html
What the script does:
1. **Scan directory tree**: walk the `{build_id}/logs/` directory
2. **Classify files** by extension:
   - Logs: `.log`, `.txt`
   - YAML: `.yaml`, `.yml`
   - JSON: `.json`
   - XML: `.xml`
   - Certificates: `.crt`, `.pem`, `.key`
   - Archives: `.tar`, `.gz`, `.tgz`, `.tar.gz`
3. **Generate HTML structure**
Header Section:
<div class="header">
<h1>Must-Gather File Browser</h1>
<div class="metadata">
<p><strong>Prow Job:</strong> {prowjob-name}</p>
<p><strong>Build ID:</strong> {build_id}</p>
<p><strong>gcsweb URL:</strong> <a href="{original-url}">{original-url}</a></p>
<p><strong>Target:</strong> {target}</p>
<p><strong>Total Files:</strong> {count}</p>
<p><strong>Total Size:</strong> {human-readable-size}</p>
</div>
</div>
Filter Controls:
<div class="filters">
<div class="filter-group">
<label class="filter-label">File Type (multi-select)</label>
<div class="filter-buttons">
<button class="filter-btn" data-filter="type" data-value="log">Logs ({count})</button>
<button class="filter-btn" data-filter="type" data-value="yaml">YAML ({count})</button>
<button class="filter-btn" data-filter="type" data-value="json">JSON ({count})</button>
<!-- etc -->
</div>
</div>
<div class="filter-group">
<label class="filter-label">Filter by Regex Pattern</label>
<input type="text" class="search-box" id="pattern" placeholder="Enter regex pattern (e.g., .*etcd.*, .*\.log$)">
</div>
<div class="filter-group">
<label class="filter-label">Search by Name</label>
<input type="text" class="search-box" id="search" placeholder="Search file names...">
</div>
</div>
File List:
<div class="file-list">
<div class="file-item" data-type="{type}" data-path="{path}">
<div class="file-icon">{icon}</div>
<div class="file-info">
<div class="file-name">
<a href="{relative-path}" target="_blank">{file-name}</a>
</div>
<div class="file-meta">
<span class="file-path">{directory-path}</span>
<span class="file-size">{size}</span>
<span class="file-type badge badge-{type}">{type}</span>
</div>
</div>
</div>
</div>
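The `data-type` values in the markup above come from extension-based classification. A plausible mapping, following the categories listed earlier (the exact set used by `generate_html_report.py` is an assumption):

```python
import os

# Extension-to-type map mirroring the categories listed earlier;
# the exact categories used by the real script may differ.
TYPE_BY_EXT = {
    ".log": "log", ".txt": "log",
    ".yaml": "yaml", ".yml": "yaml",
    ".json": "json",
    ".xml": "xml",
    ".crt": "cert", ".pem": "cert", ".key": "cert",
    ".tar": "archive", ".gz": "archive", ".tgz": "archive",
}

def classify(path: str) -> str:
    """Return the file-type badge for a path, defaulting to 'other'."""
    if path.endswith(".tar.gz"):  # special-case the compound extension
        return "archive"
    ext = os.path.splitext(path)[1].lower()
    return TYPE_BY_EXT.get(ext, "other")
```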
CSS Styling:
JavaScript Interactivity:
// Multi-select file type filters
document.querySelectorAll('.filter-btn').forEach(btn => {
    btn.addEventListener('click', function() {
        this.classList.toggle('active');  // toggle active state
        applyFilters();
    });
});

// Regex pattern filter
document.getElementById('pattern').addEventListener('input', applyFilters);

// Name search filter
document.getElementById('search').addEventListener('input', applyFilters);

// Combine all active filters
function applyFilters() {
    const activeTypes = Array.from(document.querySelectorAll('.filter-btn.active'))
        .map(btn => btn.dataset.value);
    const patternText = document.getElementById('pattern').value;
    let regex = null;
    try {
        if (patternText) regex = new RegExp(patternText);
    } catch (e) {
        // Ignore partially typed / invalid regex input
    }
    const query = document.getElementById('search').value.toLowerCase();

    document.querySelectorAll('.file-item').forEach(item => {
        const path = item.dataset.path;
        const typeOk = activeTypes.length === 0 || activeTypes.includes(item.dataset.type);
        const regexOk = regex === null || regex.test(path);
        const searchOk = query === '' || path.toLowerCase().includes(query);
        item.style.display = (typeOk && regexOk && searchOk) ? '' : 'none';
    });
}
Statistics Section:
<div class="stats">
<div class="stat">
<div class="stat-value">{total-files}</div>
<div class="stat-label">Total Files</div>
</div>
<div class="stat">
<div class="stat-value">{total-size}</div>
<div class="stat-label">Total Size</div>
</div>
<div class="stat">
<div class="stat-value">{log-count}</div>
<div class="stat-label">Log Files</div>
</div>
<div class="stat">
<div class="stat-value">{yaml-count}</div>
<div class="stat-label">YAML Files</div>
</div>
<!-- etc -->
</div>
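The per-type counts and totals in the statistics section can be gathered in a single walk. A sketch under the assumption that statistics are keyed by file extension (the real script may key by classified type instead):

```python
import os
from collections import Counter

def gather_stats(root: str):
    """Walk `root`, returning total file count, total bytes, and per-extension counts."""
    counts = Counter()
    total_files = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            total_files += 1
            total_bytes += os.path.getsize(os.path.join(dirpath, name))
            counts[os.path.splitext(name)[1].lower() or "(none)"] += 1
    return total_files, total_bytes, counts
```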
**Write HTML to file**

Write the report to `.work/prow-job-extract-must-gather/{build_id}/must-gather-browser.html`.

**Display summary**
Must-Gather Extraction Complete
Prow Job: {prowjob-name}
Build ID: {build_id}
Target: {target}
Extraction Statistics:
- Total files: {file-count}
- Total size: {human-readable-size}
- Archives extracted: {archive-count}
- Log files: {log-count}
- YAML files: {yaml-count}
- JSON files: {json-count}
Extracted to: .work/prow-job-extract-must-gather/{build_id}/logs/
File browser generated: .work/prow-job-extract-must-gather/{build_id}/must-gather-browser.html
Open in browser to browse and search extracted files.
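The `{human-readable-size}` placeholders in the summary can be produced by a small helper like the following (a sketch; the script's own formatting may differ):

```python
def human_size(num_bytes: int) -> str:
    """Format a byte count as a human-readable string, e.g. 234.0 MB."""
    size = float(num_bytes)
    for unit in ("B", "KB", "MB", "GB", "TB"):
        if size < 1024:
            break
        size /= 1024
    # Whole numbers for bytes, one decimal place for larger units
    return f"{int(size)} {unit}" if unit == "B" else f"{size:.1f} {unit}"
```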
Open report in browser
- Linux: `xdg-open .work/prow-job-extract-must-gather/{build_id}/must-gather-browser.html`
- macOS: `open .work/prow-job-extract-must-gather/{build_id}/must-gather-browser.html`
- Windows: `start .work/prow-job-extract-must-gather/{build_id}/must-gather-browser.html`

**Offer next steps**
Suggest exploring specific files under `.work/prow-job-extract-must-gather/{build_id}/logs/`.

Handle these error scenarios gracefully:
- **Invalid URL format**
- **Build ID not found**
- **gcloud not installed**: check with `which gcloud`
- **prowjob.json not found**
- **Not a ci-operator job**
- **must-gather.tar not found**
- **Corrupted archive**
- **No "-ci-" subdirectory found**
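Several of these scenarios can be caught before any download with a small preflight check. A sketch with illustrative error messages (not the skill's actual code):

```python
import re
import shutil

def preflight(url: str) -> list[str]:
    """Return error messages for problems detectable before any download."""
    errors = []
    # gcloud not installed
    if shutil.which("gcloud") is None:
        errors.append("gcloud not installed: install the Google Cloud CLI first")
    # Invalid URL format
    if "test-platform-results/" not in url:
        errors.append("Invalid URL format: expected a gcsweb test-platform-results URL")
    # Build ID not found
    elif not re.search(r"/\d{10,}(?:/|$)", url):
        errors.append("Build ID not found: no 10+ digit path segment in URL")
    return errors
```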
- **Avoid re-extracting**: skip extraction when `.work/prow-job-extract-must-gather/{build_id}/logs/` already has content
- **Efficient downloads**: use `gcloud storage cp` with `--no-user-output-enabled` to suppress verbose output
- **Memory efficiency**
- **Progress indicators**
User: "Extract must-gather from this Prow job: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/periodic-ci-openshift-release-master-ci-4.20-e2e-aws-ovn-techpreview/1965715986610917376"
Output:
- Downloads must-gather.tar to: .work/prow-job-extract-must-gather/1965715986610917376/tmp/
- Extracts to: .work/prow-job-extract-must-gather/1965715986610917376/logs/
- Renames long subdirectory to: content/
- Processes 247 nested archives (.tar.gz, .tgz, .gz)
- Creates: .work/prow-job-extract-must-gather/1965715986610917376/must-gather-browser.html
- Opens browser with interactive file list (3,421 files, 234 MB)
- Uses the `.work/prow-job-extract-must-gather/{build_id}/` directory structure for organization
- Keeps everything under `.work/`, which is already in `.gitignore`
- Checks `.work/prow-job-extract-must-gather/{build_id}/` to avoid re-extraction

**Archive Processing:**
Directory Renaming:
File Type Detection:
Regex Pattern Filtering:
Working with Scripts:
- `plugins/prow-job/skills/prow-job-extract-must-gather/extract_archives.py`: extracts and processes archives
- `plugins/prow-job/skills/prow-job-extract-must-gather/generate_html_report.py`: generates the interactive HTML file browser